{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Denison CS181/DA210 Homework\n", "\n", "Before you turn this problem in, make sure everything runs as expected. This is a combination of **restarting the kernel** and then **running all cells** (in the menubar, select Kernel$\\rightarrow$Restart And Run All).\n", "\n", "Make sure you fill in any place that says `YOUR CODE HERE` or \"YOUR ANSWER HERE\"." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import os.path\n", "import pandas as pd\n", "\n", "datadir = \"publicdata\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q1** Make the following into a `pandas` data frame, assigning it to variable `df`.\n", "\n", " {'foo': ['one','one','one','two','two','two'],\n", " 'bar': ['A', 'B', 'C', 'A', 'B', 'C'],\n", " 'baz': [1, 2, 3, 4, 5, 6]}\n", " \n", "If the values `one` and `two` from column `foo` should head columns (so it takes more than one row to interpret a single observation), and the values themselves come from the `baz` column, what transformation/reshaping operation should be used to obtain a tidy version of this data? Include your answer as a comment in your code cell." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "4e2d4dea1183f10839c078be952a71b9", "grade": false, "grade_id": "cell-c93bf4fe3e267518", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "df" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "19bce4344d5140387d06852a38f0c886", "grade": true, "grade_id": "cell-886fdd47df433623", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing cell\n", "\n", "assert True\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q2** What parameter arguments would be needed for this operation to do its job?\n" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "nbgrader": { "cell_type": "markdown", "checksum": "1bf48e089009d3a369fc518f29cfd309", "grade": true, "grade_id": "cell-5c28bca5fc3b4cae", "locked": false, "points": 1, "schema_version": 3, "solution": true, "task": false } }, "source": [ "YOUR ANSWER HERE" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q3** Perform the operation and assign the result to `df2`. 
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "664fcac01da42e4b295e83bbb5e78547", "grade": false, "grade_id": "cell-ad420ee9bba5fbf4", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "df2" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "c115cff7668885293610495c85f001cf", "grade": true, "grade_id": "cell-44ff0cf82825bf8b", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing cell\n", "\n", "assert True\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q4** Make the following into a `pandas` data frame. Assign it to `df`.\n", "\n", " {'A': {0: 'a', 1: 'b', 2: 'c'},\n", " 'B': {0: 2, 1: 4, 2: 6},\n", " 'C': {0: 1, 1: 3, 2: 5},\n", " 'D': {0: 1, 1: 2, 2: 4}}\n", "\n", "Suppose further that we have determined that columns `B` and `D` are really *values* of a *variable* called `X`. What transformation/reshaping operation should be used to obtain a tidy version of this data? Enter your answer as a comment in the code cell where you create the data frame." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "c69bb17262abb803a439ad147f8c2402", "grade": false, "grade_id": "cell-2a7ada090c525e4c", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "df" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "d97735707b70056c2a6efa6b39d21dec", "grade": true, "grade_id": "cell-69a35617816d2082", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing cell\n", "\n", "assert True\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q5** At a minimum, what parameter arguments would be needed for this operation to do its job?\n" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "nbgrader": { "cell_type": "markdown", "checksum": "cfad4e482913a16ef11ca401df5afddd", "grade": true, "grade_id": "cell-dc38059135162aae", "locked": false, "points": 1, "schema_version": 3, "solution": true, "task": false } }, "source": [ "YOUR ANSWER HERE" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q6** Perform the operation and assign the result to `df2`. 
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "72230ee488a135d1cfe6b39d5581cfe3", "grade": false, "grade_id": "cell-409fcf1eacadcf13", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "df2" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "5c65c2510e5fe1c20ae6b0b132c83e29", "grade": true, "grade_id": "cell-bc651d3241e7affb", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing cell\n", "\n", "assert True\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q7** Consider the file `ratings.csv`. It has columns for first name, last name, RatingA, used for rating a particular restaurant (A), and RatingB, used for rating a different restaurant (B). The name of a \"rater\" should be a single variable. The particular restaurants are *values* of the data set. Transform the given dataset into a tidy data set, naming it `ratings_tidy`. Do not give the new data set a row label index." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "9bcc55e98ecc146b25a208c9b50203b5", "grade": false, "grade_id": "cell-453406ce5397fdb5", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "\n", "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "ratings_tidy\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "639eddf11f74ddb364b3af20c3fce52a", "grade": true, "grade_id": "cell-73c2bce4601a51de", "locked": true, "points": 3, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing cell\n", "assert True\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q8** Consider the file `restaurants_gender.csv`, which aggregates other data and whose rows map an id, restaurant, and gender to an average rating. So, relative to this aggregation, the data is tidy as it stands. Pivot the `restaurants_gender` data into a matrix presentation with restaurant down one axis (as the row-label index) and gender across the other axis (as the column-label index), a form that might make for good presentation. Store the result as `rest_mat`."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "6bc71ca164621dd37f7009908fdd23da", "grade": false, "grade_id": "cell-997d394463f13ce0", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "\n", "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "rest_mat" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "e384b820ef107a2a96efdec50ad90ed6", "grade": true, "grade_id": "cell-7a1d795ffbb5dc03", "locked": true, "points": 3, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing cell\n", "assert True\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.3" } }, "nbformat": 4, "nbformat_minor": 4 }