{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Denison CS181/DA210 Homework\n", "\n", "Before you turn this problem in, make sure everything runs as expected. This is a combination of **restarting the kernel** and then **running all cells** (in the menubar, select Kernel$\\rightarrow$Restart And Run All).\n", "\n", "Make sure you fill in any place that says YOUR CODE HERE or \"YOUR ANSWER HERE\"." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import os.path\n", "import pandas as pd\n", "\n", "datadir = \"publicdata\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q1** Make the following into a pandas data frame, assigning it to variable df.\n", "\n", " {'foo': ['one','one','one','two','two','two'],\n", " 'bar': ['A', 'B', 'C', 'A', 'B', 'C'],\n", " 'baz': [1, 2, 3, 4, 5, 6]}\n", " \n", "If the values one and two from column foo should head columns (so it takes more than one row to interpret a single observation), and the values themselves come from the baz column, what transformation/reshaping operation should be used to obtain a tidy version of this data? Include your answer as a comment in your code cell." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "4e2d4dea1183f10839c078be952a71b9", "grade": false, "grade_id": "cell-c93bf4fe3e267518", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "df" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "19bce4344d5140387d06852a38f0c886", "grade": true, "grade_id": "cell-886fdd47df433623", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing cell\n", "\n", "assert True\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q2** What parameter arguments would be needed for this operation to do its job?\n" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "nbgrader": { "cell_type": "markdown", "checksum": "1bf48e089009d3a369fc518f29cfd309", "grade": true, "grade_id": "cell-5c28bca5fc3b4cae", "locked": false, "points": 1, "schema_version": 3, "solution": true, "task": false } }, "source": [ "YOUR ANSWER HERE" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q3** Perform the operation and assign the result to df2. 
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "664fcac01da42e4b295e83bbb5e78547", "grade": false, "grade_id": "cell-ad420ee9bba5fbf4", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "df2" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "c115cff7668885293610495c85f001cf", "grade": true, "grade_id": "cell-44ff0cf82825bf8b", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing cell\n", "\n", "assert True\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q4** Make the following into a pandas data frame. Assign it to df.\n", "\n", " {'A': {0: 'a', 1: 'b', 2: 'c'},\n", " 'B': {0: 2, 1: 4, 2: 6},\n", " 'C': {0: 1, 1: 3, 2: 5},\n", " 'D': {0: 1, 1: 2, 2: 4}}\n", "\n", "Suppose further that we have determined that columns B and D are really *values* of a *variable* called X. What transformation/reshaping operation should be used to obtain a tidy version of this data? Enter your answer as a comment in the code cell where you create the data frame." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "c69bb17262abb803a439ad147f8c2402", "grade": false, "grade_id": "cell-2a7ada090c525e4c", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "df" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "d97735707b70056c2a6efa6b39d21dec", "grade": true, "grade_id": "cell-69a35617816d2082", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing cell\n", "\n", "assert True\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q5** At a minimum, what parameter arguments would be needed for this operation to do its job?\n" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "nbgrader": { "cell_type": "markdown", "checksum": "cfad4e482913a16ef11ca401df5afddd", "grade": true, "grade_id": "cell-dc38059135162aae", "locked": false, "points": 1, "schema_version": 3, "solution": true, "task": false } }, "source": [ "YOUR ANSWER HERE" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q6** Perform the operation and assign the result to df2. 
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "72230ee488a135d1cfe6b39d5581cfe3", "grade": false, "grade_id": "cell-409fcf1eacadcf13", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "df2" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "5c65c2510e5fe1c20ae6b0b132c83e29", "grade": true, "grade_id": "cell-bc651d3241e7affb", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing cell\n", "\n", "assert True\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q7** Consider the file ratings.csv. It has columns for first name, last name, RatingA, used for rating a particular restaurant (A), and RatingB, used for rating a different restaurant (B). The name of a \"rater\" should be a single variable. The particular restaurants are *values* of the data set. Transform the given dataset into a tidy data set, naming it ratings_tidy. Do not give the new data set a row label index." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "9bcc55e98ecc146b25a208c9b50203b5", "grade": false, "grade_id": "cell-453406ce5397fdb5", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "\n", "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "ratings_tidy\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "639eddf11f74ddb364b3af20c3fce52a", "grade": true, "grade_id": "cell-73c2bce4601a51de", "locked": true, "points": 3, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing cell\n", "assert True\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q8** Consider the file restaurants_gender.csv, which aggregates other data and whose rows map from an id, restaurant, and gender to an average rating. Relative to this aggregation, the data is tidy as it stands. Pivot the restaurants_gender data into a matrix presentation with restaurant down one axis (as the row-label index) and gender across the other axis (as the column-label index), a form that might make for good presentation. Store the result as rest_mat."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "6bc71ca164621dd37f7009908fdd23da", "grade": false, "grade_id": "cell-997d394463f13ce0", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "\n", "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "rest_mat" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "e384b820ef107a2a96efdec50ad90ed6", "grade": true, "grade_id": "cell-7a1d795ffbb5dc03", "locked": true, "points": 3, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing cell\n", "assert True\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.3" } }, "nbformat": 4, "nbformat_minor": 4 }