Before you turn this problem in, make sure everything runs as expected. To do so, restart the kernel and then run all cells (in the menubar, select Kernel$\rightarrow$Restart And Run All).
Make sure you fill in any place that says YOUR CODE HERE or "YOUR ANSWER HERE".
import os
import os.path
import pandas as pd
datadir = "publicdata"
Q1 In the data directory you will find members.csv, with (fake) information on a number of individuals in Ohio. We will use this for the next several exercises. Read this dataset into a pandas DataFrame using read_csv. Name it members0, and do not include an index.
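As a rough illustration of reading a CSV without an index column, the sketch below writes a tiny stand-in file and loads it with read_csv. The file name toy_members.csv, its columns, and its rows are all made up for the example; the real members.csv may look different.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for members.csv; the real columns may differ.
toy_csv = "Name,Phone\nAda Lovelace,614-555-0100\nAlan Turing,216-555-0101\n"

with tempfile.TemporaryDirectory() as toy_dir:
    # Build the path from a directory and a file name, as you would with datadir.
    toy_path = os.path.join(toy_dir, "toy_members.csv")
    with open(toy_path, "w") as f:
        f.write(toy_csv)
    # With no index_col argument, pandas assigns a default integer index.
    toy0 = pd.read_csv(toy_path)

print(toy0.index.tolist())    # [0, 1]
print(toy0.columns.tolist())  # ['Name', 'Phone']
```

Every column in the file stays a regular column here, and the row labels are simply 0, 1, and so on.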
# Solution cell
# YOUR CODE HERE
raise NotImplementedError()
members0.head()
# Testing Cell
assert True
Q2 Repeat the above, but now do include an index, by specifying index_col in the call to read_csv. Name your DataFrame members.
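A minimal sketch of what index_col does, using in-memory toy data rather than the real file. The ID column and its values are assumptions made for this example only; choose the index column that makes sense for members.csv.

```python
import io

import pandas as pd

# Hypothetical data; the 'ID' column is invented for illustration.
toy_csv = (
    "ID,Name,Phone\n"
    "101,Ada Lovelace,614-555-0100\n"
    "102,Alan Turing,216-555-0101\n"
)

# index_col names the column whose values become the row labels.
toy = pd.read_csv(io.StringIO(toy_csv), index_col="ID")

print(toy.index.name)         # ID
print(toy.columns.tolist())   # ['Name', 'Phone']
print(toy.loc[101, "Phone"])  # 614-555-0100
```

The index column no longer appears among the regular columns, and rows can be looked up by its values with loc.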
# Solution cell
# YOUR CODE HERE
raise NotImplementedError()
members.head()
# Testing Cell
assert True
Q3 Write a projection from members that will isolate just the 'Phone' column into a variable phone_series. What type are the values in this column? Include the answer as a comment in your solution cell.
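The sketch below shows one way to project a single column and inspect the type of its values. The toy DataFrame is hypothetical; in particular, it stores phone numbers as strings, which may or may not match the real data.

```python
import pandas as pd

# Hypothetical data, not taken from members.csv.
toy = pd.DataFrame(
    {
        "Name": ["Ada Lovelace", "Alan Turing"],
        "Phone": ["614-555-0100", "216-555-0101"],
    }
)

# Single brackets with one column label return a Series, not a DataFrame.
toy_phones = toy["Phone"]

print(type(toy_phones).__name__)          # Series
print(type(toy_phones.iloc[0]).__name__)  # str
print(toy_phones.dtype)                   # dtype label depends on the pandas version
```

Checking the type of one element with iloc[0] answers the question about the values directly, whereas dtype describes how the column as a whole is stored.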
# YOUR CODE HERE
raise NotImplementedError()
phone_series
# Testing Cell
assert True
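Q4 Write a selection on members that will isolate just those rows where the Phone number starts with a 614 area code. Selection involves picking which rows will be shown. Name the result cbus.
In order to complete this:
1. Build a boolean Series that is True for each row whose Phone value starts with '614'.
2. Use that boolean Series to select the matching rows from the pandas DataFrame.
# YOUR CODE HERE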
raise NotImplementedError()
cbus
# Testing Cell
assert True
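The selection pattern from Q4 can be sketched on toy data as follows. The rows are invented, and the example assumes each phone number begins directly with its area code; astype(str) is included only so the check would also work if the values were stored as integers.

```python
import pandas as pd

# Hypothetical data, not taken from members.csv.
toy = pd.DataFrame(
    {
        "Name": ["Ada Lovelace", "Alan Turing", "Grace Hopper"],
        "Phone": ["614-555-0100", "216-555-0101", "614-555-0102"],
    }
)

# Boolean Series: True where the phone number starts with '614'.
is_614 = toy["Phone"].astype(str).str.startswith("614")

# Indexing with a boolean Series keeps only the rows marked True.
toy_614 = toy[is_614]

print(is_614.tolist())           # [True, False, True]
print(toy_614["Name"].tolist())  # ['Ada Lovelace', 'Grace Hopper']
```

The same mask could instead be produced with apply and a lambda; the indexing step is identical either way.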
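Q5 Split up the column Name into two different columns, FName and LName. We follow a similar process for this example as we did for selecting rows.
1. Decide how to extract the first name from a single name string.
2. Apply that operation to every value in the Name column, and save the result.
3. Create a new column (FName), and assign the result from step 2.
Repeat these steps for LName.
# Solution cell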
# YOUR CODE HERE
raise NotImplementedError()
# Testing Cell
assert True
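A minimal sketch of the Q5 steps on toy data. It assumes each name has the form "First Last" separated by a single space; the real Name column may use another layout (for example "Last, First"), in which case the lambdas would need to change.

```python
import pandas as pd

# Hypothetical data, assuming a "First Last" name format.
toy = pd.DataFrame({"Name": ["Ada Lovelace", "Alan Turing"]})

# Apply a function to every value in the Name column and save each result.
first_names = toy["Name"].apply(lambda full: full.split(" ")[0])
last_names = toy["Name"].apply(lambda full: full.split(" ")[-1])

# Assigning to a new column label creates that column.
toy["FName"] = first_names
toy["LName"] = last_names

print(toy["FName"].tolist())  # ['Ada', 'Alan']
print(toy["LName"].tolist())  # ['Lovelace', 'Turing']
```

Names with middle names or suffixes would need extra handling, since this sketch takes only the first and last space-separated pieces.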