{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Denison CS181/DA210 Homework\n",
    "\n",
    "Before you turn this problem in, make sure everything runs as expected. To check, **restart the kernel** and then **run all cells** (in the menubar, select Kernel$\\rightarrow$Restart And Run All).\n",
    "\n",
    "Make sure you fill in any place that says `YOUR CODE HERE` or \"YOUR ANSWER HERE\"."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import os.path\n",
    "import pandas as pd\n",
    "\n",
    "datadir = \"publicdata\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Q1** This question deals with the tuberculosis dataset. At no point should you rearrange the order of the rows.\n",
    "\n",
    "1. Read `table6.csv` into a dataframe `df1`.\n",
    "2. Working on a copy of `df1` (use `copy()` so that the original data frame is not modified), combine 'century' and 'yearDigits' into one column, 'year' (whose values are strings), then drop the two old columns. Store the result as `df1a`.\n",
    "3. Starting from `df1a`, split the column 'rate' into two new columns, 'cases' (the number before the slash) and 'population' (the number after the slash), keeping both as strings. Then drop 'rate'. Store the result as `df1b`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "eff717fe378b975de146463896a3c037",
     "grade": false,
     "grade_id": "cell-eda84885c5aa2fe1",
     "locked": false,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "# Solution cell\n",
    "\n",
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "57656dee5eb6c7c519bd24dc13f40310",
     "grade": true,
     "grade_id": "cell-604a9ca2dedc8fe8",
     "locked": true,
     "points": 4,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "# Testing cell\n",
    "\n",
    "assert df1.shape == (6,4)\n",
    "assert df1a.shape == (6,3)\n",
    "assert df1b.shape == (6,4)\n",
    "assert df1.iloc[2,3] == '37737/172006362'\n",
    "assert df1a.iloc[3,2] == '2000'\n",
    "assert df1b.iloc[4,3] == '1272915272'\n",
    "assert df1b.iloc[0,2] == \"745\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Q2** Read `us_rent_income.csv` into a dataframe (with \"GEOID\" as the index), then transform as needed to make it tidy. Store the result as `df_rent`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "98f7f4a7a3892610c5622e1a8552c611",
     "grade": false,
     "grade_id": "cell-5bba2b9d91e4bf79",
     "locked": false,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "# Solution cell\n",
    "\n",
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "99a5c2e77454339e73a2dddf81015491",
     "grade": true,
     "grade_id": "cell-d927d8a1051f9d9d",
     "locked": true,
     "points": 3,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "# Testing cell\n",
    "\n",
    "assert(df_rent.shape == (52,4))\n",
    "assert(df_rent.iloc[0,0] == 24476.0)\n",
    "assert(df_rent.iloc[0,1] == 747.0)\n",
    "assert(df_rent.iloc[0,2] == 136.0)\n",
    "assert(df_rent.iloc[0,3] == 3.0)\n",
    "assert(df_rent.iloc[20,0] == 37147.0)\n",
    "assert(df_rent.iloc[31,1] == 809.0) "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Q3** Consider the data on religions and income, gathered by Pew Research Center and hosted at this link:\n",
    "\n",
    "https://github.com/chendaniely/pandas_for_everyone/blob/master/data/pew.csv\n",
    "\n",
    "The data is also available as `\"pew.csv\"` in the data folder. In the code cell that follows, read the data into a DataFrame assigned to `df`. In the subsequent markdown cell, answer the question: Is this data in tidy form? Explain your answer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "dc4c0884fecbd619066e9038f415ee6b",
     "grade": true,
     "grade_id": "cell-cefd70c125c26355",
     "locked": false,
     "points": 1,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "# Solution cell\n",
    "\n",
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "2fb2cbeec43eaf387c568aa7a5d8f366",
     "grade": true,
     "grade_id": "cell-75bc095bb1409b07",
     "locked": false,
     "points": 1,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "source": [
    "YOUR ANSWER HERE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Q4** Explore the data from the previous exercise, then **from the data** list the independent variable(s) and the dependent variable(s). Note: this data came from a survey that counted individuals by their religion and their income category."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "markdown",
     "checksum": "480ef83aa1ff55c3a190fbc5624c29fa",
     "grade": true,
     "grade_id": "cell-3b610ab224c05354",
     "locked": false,
     "points": 1,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "source": [
    "YOUR ANSWER HERE"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Q5** Transform the data from Q3 as needed to make it tidy. Store the result as `df_rel`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "f8ec55ae1b35557dd83263d5015f0785",
     "grade": false,
     "grade_id": "cell-63953f65aa520b07",
     "locked": false,
     "schema_version": 3,
     "solution": true,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "# Solution cell\n",
    "\n",
    "# YOUR CODE HERE\n",
    "raise NotImplementedError()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "deletable": false,
    "editable": false,
    "nbgrader": {
     "cell_type": "code",
     "checksum": "a5acfce4ac3b370cbd5149e6d76b3b45",
     "grade": true,
     "grade_id": "cell-42c6a8c97256fdaa",
     "locked": true,
     "points": 2,
     "schema_version": 3,
     "solution": false,
     "task": false
    }
   },
   "outputs": [],
   "source": [
    "# Testing cell\n",
    "\n",
    "assert(df_rel.shape == (180,3))\n",
    "assert(df_rel.iloc[0,0] == \"Agnostic\")\n",
    "assert(df_rel.iloc[0,1] == \"<$10k\")\n",
    "assert(df_rel.iloc[0,2] == 27)\n",
    "assert(df_rel.iloc[41,0] == \"Evangelical Prot\")\n",
    "assert(df_rel.iloc[89,1] == \"$40-50k\")\n",
    "assert(df_rel.iloc[104,2] == 14)\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}