{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Denison CS181/DA210 Homework\n", "\n", "Before you turn this problem in, make sure everything runs as expected. This is a combination of **restarting the kernel** and then **running all cells** (in the menubar, select Kernel$\\rightarrow$Restart And Run All).\n", "\n", "Make sure you fill in any place that says `YOUR CODE HERE` or \"YOUR ANSWER HERE\"." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# SQL Column Operations Exercises" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "2c15bf49fe13507b473a89ebc6ffdea5", "grade": false, "grade_id": "cell-b9abcf27cf7faf8f", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "import pandas as pd\n", "import os\n", "import os.path\n", "import json\n", "import sys\n", "import importlib\n", "\n", "module_dir = \"../../modules\"\n", "module_path = os.path.abspath(module_dir)\n", "if not module_path in sys.path:\n", " sys.path.append(module_path)\n", "\n", "import dbutil\n", "importlib.reload(dbutil)\n", "\n", "%load_ext sql" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Instructions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Set User Credentials\n", "\n", "> **Edit `creds.json` to reflect your mysql user and password**\n", "\n", "> This must be done **prior** to executing the following cell \n", "\n", "In general, you will be able to choose whether you are using the remote MySQL database or the SQLite database(s) by setting the `dbsource` variable to `\"mysql\"` or `\"sqlite\"` respectively. The function `dbutil.db_cstring` function computes a *connection string* for the chosen `dbsource` using the information in the `creds.json` file. If the last argument to this function is present, the generated connection string uses that datbase as superceding the name of the database in `creds.json`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dbsource = \"sqlite\"\n", "db = \"book\"\n", "cstring = dbutil.db_cstring(dbsource, \"creds.json\", \".\", db)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(\"Connection string:\", cstring)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Establish Connection from Client to Server\n", "\n", "The cell below uses the Python variable constructed earlier to actually establish a connection with the server. \n", "\n", "If you cannot get a connection established, switch over to MySQL Workbench and try from that different client." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%sql $cstring" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the following cells, your only action is to, as usual, cut out the two lines, and to put a valid SQL statement as the **value** of string variable `query`. In each case, when you execute the cell, the query will be sent to the database management system, a result obtained, and the result converted into a `pandas` data frame, whose prefix is shown." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Problems" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "markdown", "checksum": "4e944b7523dbf73e4cd802065888bc8f", "grade": false, "grade_id": "cell-b5a29a8fb4da3eaa", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "source": [ "**Q1** In reference to the table `indicators`, write a query to project code, year, gdp, pop, and find the GDP per capita of each country, call the new column `percapita`. Sort the data in descending value of `percapita`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "99fbac4717ede8d55b5fed59d98b5ea6", "grade": false, "grade_id": "cell-a8820c19cb5b39e4", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "\n", "query = \"\"\"\n", "\"\"\"\n", "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "resultset = %sql $query\n", "resultdf = resultset.DataFrame()\n", "resultdf.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "bd3418a656cfc21a4fa32609675e68f8", "grade": true, "grade_id": "cell-db2e382596ee995c", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing cell\n", "\n", "assert len(resultdf) == 12862" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "markdown", "checksum": "0d626d401fe9f0bb3a6f02aeb17b19a6", "grade": false, "grade_id": "cell-ed1f774da763abd5", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "source": [ "**Q2** In reference to the table `indicators`, select all columns, plus a new column named `new` that tells whether `exports` are larger than `imports`. Sort in descending order of gdp." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "0dac995b796a1b97be659ec94644af28", "grade": false, "grade_id": "cell-13fe542eba4f08bf", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "\n", "query = \"\"\"\n", "\"\"\"\n", "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "resultset = %sql $query\n", "resultdf = resultset.DataFrame()\n", "resultdf.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "6d1e769f36811102be421a12d110a860", "grade": true, "grade_id": "cell-39bfaf689aa180e2", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing cell\n", "\n", "assert resultdf.shape == (12862,9)\n", "assert resultdf.loc[2,'new'] == 0.0" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Switching to the `school` Database\n", "\n", "The following cell, **assuming `dbsource` is defined from above**, creates a new connection string and establishes a new connection to the `school` database." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "db = \"school\"\n", "cstring = dbutil.db_cstring(dbsource, \"creds.json\", \".\", db)\n", "%sql $cstring" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "markdown", "checksum": "076fa58c876cb6da71818e0c711261e1", "grade": false, "grade_id": "cell-54e40869ca6221ad", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "source": [ "**Q3** In reference to the `instructors` table in the `school` database, compute and project a single new column `full_name` obtained as the last name, followed by a comma and a space, followed by the first name. Put the names in alphabetical order." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "d0492dabc6630bf8264e9e8070bb9f56", "grade": false, "grade_id": "cell-e908a961094b059d", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "query = \"\"\"\n", "\"\"\"\n", "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "resultset = %sql $query\n", "resultdf = resultset.DataFrame()\n", "resultdf.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "len(resultdf)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "2439ab266624ba1c6101239a5314a3e1", "grade": true, "grade_id": "cell-d2b715d1607c18bb", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing cell\n", "assert len(resultdf) == 292\n", "assert 'Aguilar, Stephen' in list(resultdf['full_name'])" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "markdown", "checksum": "a335933709930c2f8939f6a084bdb6b7", "grade": false, "grade_id": "cell-5e000af2e3dced2e", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "source": [ "**Q4** In reference to the `courses` table in the `school` database, select a new column `catalog` obtained as the course subject, followed by a space, followed by the course number, followed by a colon and a space, followed by the course title. Only include distinct catalogue entries and don't include any NULL entries." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "3a8deac9a3266ef52e5e4cf48faa9a6d", "grade": false, "grade_id": "cell-92361a7de37ada44", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "%sql USE school\n", "query = \"\"\"\n", "\"\"\"\n", "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "resultset = %sql $query\n", "resultdf = resultset.DataFrame()\n", "resultdf.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "6f7b210dce6169baf271bb776dac595c", "grade": true, "grade_id": "cell-47d860c2013a0c6f", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing cell\n", "assert len(resultdf) == 904\n", "assert 'WMST 101: Issues in Feminism' in list(resultdf['catalog'])" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" } }, "nbformat": 4, "nbformat_minor": 4 }