import pandas as pd
import os
import os.path
import json
def getmysql_creds(dirname=".",filename="creds.json"):
""" Using directory and filename parameters, open a credentials file
and obtain the five parts needed for a connection string to
a remote provider using the "mysql" dictionary within
an outer dictionary.
Return a scheme, server, user, password, and database
"""
assert os.path.isfile(os.path.join(dirname, filename))
with open(os.path.join(dirname, filename)) as f:
D = json.load(f)
mysql = D["mysql"]
database = None
if "database" in mysql:
database = mysql["database"]
return mysql["scheme"], mysql["server"], mysql["user"], mysql["pass"], database
def getsqlite_creds(dirname=".",filename="creds.json"):
""" Using directory and filename parameters, open a credentials file
and obtain the two parts needed for a connection string to
a local provider using the "sqlite" dictionary within
an outer dictionary.
Return a scheme and a dbfile
"""
assert os.path.isfile(os.path.join(dirname, filename))
with open(os.path.join(dirname, filename)) as f:
D = json.load(f)
sqlite = D["sqlite"]
return sqlite["scheme"], sqlite["dbdir"], sqlite["database"]
Working with databases gives us the opportunity to use an independent client as a tool to look at the tables of a database and even to create and try out SQL queries. This indpendent route allows us to separate the concerns of the Python programming that we do in Jupyter notebooks from problem solving data queries.
In this class, I tend to use one of two independent clients:
Go ahead and download and install at least the TablePlus program on your own machine.
We will not spend time configuring these now, but you should take note of the information needed to connect to remote MySQL databases and to local SQLite databases from the cells below, as this same information will be used to configure connection setups in both of these independent clients.
With a shared resource at a provider like a MySQL RDBMS, we need to use credentials to authenticate ourselves to the server, and need the logical location of the server itself.
For these notebooks, these are kept in a text file named 'creds.json', stored either in the same directory or in a data directory. For this notebook, this is stored in the same directory as the notebook.
creds.json
file and select Open With->Editor"nostudent"
) with the base part of your email address (i.e. without the .denison.edu
). "nostudent"
) with the base part of your email address. Your password on the mysql server, at present, is the same as your user. "hadoop.mathsci.denison.edu"
. "mysql+mysqlconnector"
."book"
.Make sure to use double quotes for strings ... this is JSON
, not Python, and we have to follow JSON syntax.
Once this is complete, execute the following cell to get these values into global variables.
scheme, server, user, password, database = getmysql_creds()
print("Server:", server)
print("MySQL user name:", user)
print("MySQL user password", password)
print("Database:", repr(database))
If you followed all the instructions at the initial install of Anaconda on your computer, these packages should already be installed on your machine.
Check for what SQL-related packages are installed: Click the +
in the upper left of Jupyter Lab and open a Terminal. If you have a mac, at the command prompt, type the following command (but without the dollar sign):
$ conda list | grep sql
I get the following:
ipython-sql 0.3.9 pypi_0 pypi
mysql-connector-c 6.1.11 hccea1a4_0
mysql-connector-python 8.0.18 py37h3febbb0_1
sqlalchemy 1.3.9 py37h1de35cc_0
sqlite 3.30.1 ha441bb4_0
sqlparse 0.3.0 pypi_0 pypi
If you have a PC, the above command will not work for you but you can still run the two commands below (again, without typing the dollar sign).
If mysql-connector-python
is not present:
$ conda install mysql-connector-python
Accept the defaults and install.
If ipython-sql
is not present:
$ pip install ipython-sql
Accept the defaults and install
This extension is provided by the ipython-sql
package installed previously, and allows the three forms of sql within Notebooks (code cells, not markdown cells)
%%sql
: Magic to make an entire multi-line cell be interpreted as an SQL to be sent as a request to a connected server.%sql
: Magic to allow a single line SQL, placed in its entirety to the right of the %sql
and the only thing in the cell.%sql
: Magic to allow the SQL command following the %sql
to be executed as part of a larger Python code block. This allows the integration of Python code and variables incorporated with sql requests.Execute the following cell to load the sql-magics extension
Note that if this cell is executed a second time within the same kernel, it will not reload. Normally this is fine, unless the sql magics module has changed.
%load_ext sql
template = '{}://{}:{}@{}/{}'
cstring = template.format(scheme, user, password, server, database if database is not None else "")
print("Connection string:", cstring)
The cell below uses the Python variable constructed earlier to actually establish a connection with the server. If when you execute this cell, you get a 'Connected'
message, you are ready to proceed. If not, then the most likely problem is one with the JSON specifications of your credentials. Much less likely is a problem with the user that was created in the MySQL Server.
If you cannot get a connection established, switch over to MySQL Workbench and try from that different client.
%sql $cstring
Try and execute the following line, which uses an "inline" sql magic to exectute a query to the database and retrieve the result.
%sql SELECT * from indicators0
scheme, dbdir, database = getsqlite_creds()
template = '{}:///{}/{}.db'
cstring = template.format(scheme, dbdir, database)
print("Connection string:", cstring)
The cell below uses the Python variable constructed earlier to actually establish a connection with the server. If you cannot get a connection established, switch over to TablePlus
and try from that different client.
%sql $cstring
Try and execute the following line, which uses an "inline" sql magic to exectute a query to the database and retrieve the result.
%sql SELECT * from indicators0