Denison CS181/DA210 Homework

Before you turn this problem in, make sure everything runs as expected. To check, restart the kernel and then run all cells (in the menubar, select Kernel$\rightarrow$Restart And Run All).

Make sure you fill in any place that says "YOUR CODE HERE" or "YOUR ANSWER HERE".


In [ ]:
import os
import os.path
import pandas as pd

datadir = "publicdata"

Q1 In the data directory you will find members.csv, with (fake) information on a number of individuals in Ohio. We will use this for the next several exercises. Read this dataset into a pandas DataFrame using read_csv. Name it members0, and do not include an index.

In [ ]:
# Solution cell

# YOUR CODE HERE
raise NotImplementedError()
members0.head()
In [ ]:
# Testing Cell

assert True

Q2 Repeat the above, but now do include an index, by specifying index_col in the call to read_csv. Name your DataFrame members.
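
As a rough illustration of how index_col changes the result of read_csv, here is a minimal sketch on made-up data. The CSV text and the column names id and city are hypothetical and are not taken from members.csv; io.StringIO stands in for a file on disk.

```python
import io
import pandas as pd

# Hypothetical CSV text; the columns 'id' and 'city' are invented for illustration.
csv_text = "id,city\n10,Granville\n20,Newark\n"

# Without index_col, pandas supplies a default integer index (0, 1, ...).
no_index = pd.read_csv(io.StringIO(csv_text))

# With index_col, the named column becomes the row index and is no longer a data column.
with_index = pd.read_csv(io.StringIO(csv_text), index_col="id")

print(no_index.index.tolist())
print(with_index.index.tolist())
print(with_index.columns.tolist())
```

Which column of members.csv is the appropriate index is left for you to decide after inspecting members0.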

In [ ]:
# Solution cell

# YOUR CODE HERE
raise NotImplementedError()
members.head()
In [ ]:
# Testing Cell

assert True

Q3 Write a projection from members that will isolate just the 'Phone' column into a variable phone_series. What type are the values in this column? Include the answer as a comment in your solution cell.
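
A projection onto a single column yields a pandas Series. The sketch below uses a hypothetical toy DataFrame (the columns code and qty are invented) and shows one way to inspect the type of the values it holds.

```python
import pandas as pd

# Hypothetical toy data; these column names do not come from members.csv.
toy = pd.DataFrame({"code": ["a1", "b2"], "qty": [3, 5]})

# Projecting one column with square brackets returns a Series.
code_series = toy["code"]

# Inspect the container type and the Python type of one stored value.
print(type(code_series))
print(type(code_series.iloc[0]))
```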

In [ ]:
# YOUR CODE HERE
raise NotImplementedError()
phone_series
In [ ]:
# Testing Cell

assert True

Q4 Write a selection on members that will isolate just those rows where the Phone value starts with the 614 area code. (A selection picks which rows are kept.) Name the result cbus.

In order to complete this:

  1. Write a lambda function that will, given a string, isolate the first three characters and compare these to '614'.
  2. Apply the lambda function to the column you isolated in the last question.
  3. Use the resulting Boolean Series as an index into the pandas DataFrame.
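
The three steps above can be sketched on made-up data as follows. The toy DataFrame, its column names, and the prefix "ab" are assumptions chosen for illustration; the same pattern should carry over to members.

```python
import pandas as pd

# Hypothetical toy data; not the contents of members.csv.
toy = pd.DataFrame({"code": ["ab-1", "cd-2", "ab-3"], "qty": [1, 2, 3]})

# Step 1: a lambda that compares the first two characters of a string to "ab".
starts_ab = lambda s: s[:2] == "ab"

# Step 2: apply it to a column, producing a Boolean Series (one value per row).
mask = toy["code"].apply(starts_ab)

# Step 3: index the DataFrame with the Boolean Series to keep only the True rows.
subset = toy[mask]
print(subset)
```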
In [ ]:
# YOUR CODE HERE
raise NotImplementedError()
cbus
In [ ]:
# Testing Cell

assert True

Q5 Split up the column Name into two different columns, FName and LName. The process is similar to the one used for selecting rows.

  1. Write a lambda function that will, given a string, split on space and select only the first element in the resultant list.
  2. Apply the lambda function to the Name column, and save the result.
  3. Access the members DataFrame using a new column name (FName), and assign the result from step 2.
  4. Extract the last name from the Name column in the same way, and create a new column in the DataFrame called LName.
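
The same apply-and-assign pattern can be sketched on made-up data. The toy DataFrame and the column names pair, first, and second are hypothetical, and the sketch assumes every value contains exactly one space.

```python
import pandas as pd

# Hypothetical toy data; each value is two words separated by a single space.
toy = pd.DataFrame({"pair": ["red apple", "green pear"]})

# Split on a space and keep the first piece of each string.
first_piece = toy["pair"].apply(lambda s: s.split(" ")[0])

# Assigning to a column name that does not exist yet creates a new column.
toy["first"] = first_piece

# The last piece is obtained the same way, taking the final element of the split list.
toy["second"] = toy["pair"].apply(lambda s: s.split(" ")[-1])
print(toy)
```

Real names may contain more than one space (middle names, suffixes), so check how the split behaves on the actual data.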
In [ ]:
# Solution cell

# YOUR CODE HERE
raise NotImplementedError()
In [ ]:
# Testing Cell

assert True