{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Denison CS181/DA210 Homework\n", "\n", "Before you turn this problem in, make sure everything runs as expected. This is a combination of **restarting the kernel** and then **running all cells** (in the menubar, select Kernel$\\rightarrow$Restart And Run All).\n", "\n", "Make sure you fill in any place that says `YOUR CODE HERE` or \"YOUR ANSWER HERE\"." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "import os.path\n", "import pandas as pd\n", "\n", "datadir = \"publicdata\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q1** In the data directory you will find `members.csv`, with (fake) information on a number of individuals in Ohio. We will use this for the next several exercises. Read this dataset into a `pandas DataFrame` using `read_csv`. Name it `members0`, and do not specify an index column, so that the default integer index is used." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "0a2f8713c50f72645198651cc6fc3085", "grade": false, "grade_id": "cell-dcba7fd85d6129f6", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "\n", "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "members0.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "2dfd04928d95b68558a0beafb38dd37a", "grade": true, "grade_id": "cell-f303315087a54035", "locked": true, "points": 1, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing Cell\n", "\n", "assert True" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q2** Repeat the above, but this time include an index by passing the `index_col` argument to `read_csv`. Name your `DataFrame` `members`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "2264d125067033a6c20ac534d4e441d7", "grade": false, "grade_id": "cell-6920bdc40cf31840", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "\n", "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "members.head()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "0ad7a5cca82dc1c1da1bc9e20b8449ca", "grade": true, "grade_id": "cell-e841d3a08206426d", "locked": true, "points": 1, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing Cell\n", "\n", "assert True" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q3** Write a projection from `members` that will isolate just the `'Phone'` column into a variable `phone_series`. What type are the values in this column? Include the answer as a comment in your solution cell." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "a546bb12c25d9ee208db35419b4c42b0", "grade": false, "grade_id": "cell-44b6c8151853d9da", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "phone_series" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "9f3e48feac189a537246aec430ced978", "grade": true, "grade_id": "cell-73008dca051ae781", "locked": true, "points": 1, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing Cell\n", "\n", "assert True" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q4** Write a selection on `members` that isolates just those rows where the `Phone` value starts with the `614` area code. (A selection picks which rows are kept.) Name the result `cbus`.\n", "\n", "To complete this:\n", "\n", "1. Write a lambda function that, given a string, isolates the first three characters and compares them to `'614'`.\n", "2. Apply the lambda function to the column you isolated in the previous question.\n", "3. Use the resulting Boolean `Series` to index the `DataFrame`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "7d5fe644b5fae41196f8ddc86ae272a6", "grade": false, "grade_id": "cell-b886afb32e3e5ffc", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "cbus" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "8f7e6eed350f8b8f7e691238a137699e", "grade": true, "grade_id": "cell-b93c6719fddc412b", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing Cell\n", "\n", "assert True" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q5** Split the `Name` column of `members` into two new columns, `FName` and `LName`. The process is similar to the one used for selecting rows.\n", "\n", "1. Write a lambda function that, given a string, splits it on spaces and selects only the first element of the resulting list.\n", "2. Apply the lambda function to the `Name` column, and save the result.\n", "3. Assign the result from step 2 to a new column of the data frame named `FName`.\n", "4. In the same way, extract the last name from the `Name` column and assign it to a new column named `LName`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "d9125eb3f6f15085027f28b536ec07e2", "grade": false, "grade_id": "cell-ff271df66aa02ad9", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "\n", "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "fdbac9aafb11868d5d060941b3929e52", "grade": true, "grade_id": "cell-c0fdceafe85eadaf", "locked": true, "points": 3, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing Cell\n", "\n", "assert True" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.3" } }, "nbformat": 4, "nbformat_minor": 4 }