Before you turn this problem in, make sure everything runs as expected. First, restart the kernel (in the menubar, select Kernel$\rightarrow$Restart) and then run all cells (in the menubar, select Cell$\rightarrow$Run All).
Make sure you fill in any place that says "YOUR CODE HERE" or "YOUR ANSWER HERE", as well as your name and collaborators below:
NAME = ""
COLLABORATORS = ""
import os
import pandas as pd
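datadir = "publicdata"
path = os.path.join(datadir, "topnames.csv")
# topnames0 keeps the default integer index; topnames is indexed by (year, sex)
topnames0 = pd.read_csv(path)
topnames = topnames0.set_index(['year', 'sex'])
# First ten rows of each version, for small examples
names0 = topnames0.head(10)
names = topnames.head(10)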
topnames0.info()
topnames.head()
topnames.shape
names
names.shape
topnames0.head()
topnames0.shape
names0
path = os.path.join(datadir, "indicators2016.csv")
# ind0 keeps the default integer index; ind is indexed by 'code'
ind0 = pd.read_csv(path)
ind = ind0.set_index('code')
ind0
ind
ind.info()
Single Column Subset (Projection)
With and without an Index on the DataFrame
Observations
- Series data type (not a DataFrame)
- data type of the column itself
- referencing elements within the column, depending on Index
- perform column-based math computations
- similar for logical operations, getting a vector of booleans
- function application (unary operation) on a column vector
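The observations above can be sketched on a small frame. The rows below are a hypothetical stand-in, not the contents of topnames.csv: the column names `name` and `count` are assumptions, and `demo0` and `demo` are illustrative names chosen to mirror the with-index and without-index pairing used earlier.

```python
import pandas as pd

# Hypothetical stand-in data; column names are assumed, not taken from topnames.csv.
demo0 = pd.DataFrame({
    "year": [1880, 1880, 1881, 1881],
    "sex": ["Female", "Male", "Female", "Male"],
    "name": ["Mary", "John", "Mary", "John"],
    "count": [7065, 9655, 6919, 8769],
})
demo = demo0.set_index(["year", "sex"])

# Projecting a single column yields a Series, not a DataFrame.
counts0 = demo0["count"]   # carries the default integer index
counts = demo["count"]     # carries the (year, sex) index

kind = type(counts).__name__   # name of the container type
col_dtype = counts.dtype       # data type of the column itself

# Referencing an element depends on the Index of the source frame.
first0 = counts0.loc[0]                   # integer row label
first = counts.loc[(1880, "Female")]      # (year, sex) tuple label

# Column-based math and logical operations are element-wise.
thousands = counts / 1000    # Series of floats
big = counts > 7000          # Series of booleans

# Function application (unary operation) on a column vector.
lengths = demo["name"].apply(len)
```

Each result (`thousands`, `big`, `lengths`) keeps the index of the column it was computed from, so values stay aligned with their rows.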
Multi Column Projection
Variations:
- explicit list of desired columns
- in the limit, a list of length 1
- revisited later under general subsets of rows and columns
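A minimal sketch of the first two variations, again on hypothetical stand-in rows (the column names `name` and `count` and the frame names `demo0` and `demo` are assumptions for illustration, not taken from the data files):

```python
import pandas as pd

# Hypothetical stand-in data; column names are assumed, not taken from topnames.csv.
demo0 = pd.DataFrame({
    "year": [1880, 1880, 1881, 1881],
    "sex": ["Female", "Male", "Female", "Male"],
    "name": ["Mary", "John", "Mary", "John"],
    "count": [7065, 9655, 6919, 8769],
})
demo = demo0.set_index(["year", "sex"])

# An explicit list of desired columns yields a DataFrame.
two_cols = demo0[["name", "count"]]

# In the limit, a list of length 1 still yields a DataFrame (one column wide),
# whereas a bare column label yields a Series.
one_col_frame = demo0[["count"]]
one_col_series = demo0["count"]

# With an index set, the projection keeps that index on the result.
indexed = demo[["name"]]
```

The difference between `demo0[["count"]]` and `demo0["count"]` is only the brackets, but the first is two-dimensional and the second is one-dimensional, which matters for any later operation that expects one or the other.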