Before you turn this problem in, make sure everything runs as expected. First, restart the kernel (in the menubar, select Kernel$\rightarrow$Restart) and then run all cells (in the menubar, select Cell$\rightarrow$Run All).

Make sure you fill in any place that says "YOUR CODE HERE" or "YOUR ANSWER HERE", as well as your name and collaborators below:

In [ ]:
NAME = ""
COLLABORATORS = ""

In [ ]:
import os
import os.path
import pandas as pd

datadir = "publicdata"
In [ ]:
path = os.path.join(datadir, "topnames.csv")
topnames0 = pd.read_csv(path)
topnames = topnames0.set_index(['year', 'sex'])
names0 = topnames0.head(10)
names = topnames.head(10)
In [ ]:
topnames0.info()
In [ ]:
topnames.head()
In [ ]:
topnames.shape
In [ ]:
names
In [ ]:
names.shape
In [ ]:
topnames0.head()
In [ ]:
topnames0.shape
In [ ]:
names0
In [ ]:
path = os.path.join(datadir, "indicators2016.csv")
ind0 = pd.read_csv(path)
ind = ind0.set_index('code')
In [ ]:
ind0
In [ ]:
ind
In [ ]:
ind.info()

Single Column Subset (Projection)

With and without an Index on the DataFrame

Observations

  • projecting a single column yields a Series (not a DataFrame)
  • the Series has the data type of the column itself
  • how elements within the column are referenced depends on the Index
  • column-based math computations operate on the whole column
  • logical operations work similarly, yielding a vector of booleans
  • function application (unary operation) on a column vector
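
The observations above can be tried on a small stand-in table. This is only a sketch: the frame `toy0`, its column names, and its values are invented for illustration and are not taken from `topnames.csv` or `indicators2016.csv`.

```python
import math
import pandas as pd

# Hypothetical toy data (made-up names and values), built with and
# without an index, mirroring the ind0 / ind pattern used earlier.
toy0 = pd.DataFrame({
    "code": ["AAA", "BBB", "CCC"],
    "pop": [10.0, 20.0, 30.0],
    "gdp": [100.0, 400.0, 900.0],
})
toy = toy0.set_index("code")

# Projecting one column gives a Series, not a DataFrame.
pop0 = toy0["pop"]          # rows labeled by the default integer index
pop = toy["pop"]            # rows labeled by 'code'
is_series = isinstance(pop, pd.Series)

# The Series carries the data type of the column itself.
pop_dtype = pop.dtype

# Referencing an element depends on the index in place.
first_default = pop0.loc[0]      # integer label from the default index
first_labeled = pop.loc["AAA"]   # label from the 'code' index

# Column-based math operates element by element on whole columns.
gdp_per_pop = toy["gdp"] / toy["pop"]

# Logical operations yield a vector (Series) of booleans.
is_large = toy["pop"] > 15

# Function application: apply a unary function to each element.
gdp_root = toy["gdp"].apply(math.sqrt)
```

In each case the result keeps the row labels of the frame it came from, so `gdp_per_pop`, `is_large`, and `gdp_root` are all indexed by `code`.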
In [ ]:
 

Multi Column Projection

Variations:

  • explicit list of desired columns
    • in the limit, a list of length 1
  • general subsets of rows and columns, which we come back to later
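
A minimal sketch of the list-based variations follows. The frame `toy` and its columns are hypothetical stand-ins, not the notebook's actual data.

```python
import pandas as pd

# Hypothetical toy data (made-up names and values), indexed by 'code'.
toy = pd.DataFrame(
    {
        "pop": [10.0, 20.0, 30.0],
        "gdp": [100.0, 400.0, 900.0],
        "life": [70.0, 75.0, 80.0],
    },
    index=pd.Index(["AAA", "BBB", "CCC"], name="code"),
)

# An explicit list of column labels gives a DataFrame whose columns
# appear in the order listed.
two_cols = toy[["life", "pop"]]

# In the limit, a list of length 1 still gives a DataFrame ...
one_col_frame = toy[["pop"]]

# ... whereas a bare label (no list) gives a Series.
one_col_series = toy["pop"]
```

The distinction between `toy[["pop"]]` and `toy["pop"]` matters whenever later code expects a two-dimensional result.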
In [ ]:
 
In [ ]: