Denison CS181/DA210 Homework

Before you turn this problem in, make sure everything runs as expected. This is a combination of restarting the kernel and then running all cells (in the menubar, select Kernel$\rightarrow$Restart And Run All).

Make sure you fill in any place that says YOUR CODE HERE or "YOUR ANSWER HERE".


Q1: Convert the following into a data frame and name it df:

[[1, 2, 3],[4, 5, 6],[7, 8, 9],
[10, 11, 12],[13, 14, 15],[16, 17, 18]]

Then display the data frame.

In [ ]:
# YOUR CODE HERE
raise NotImplementedError()
In [ ]:
assert list(df[0])==[1, 4, 7, 10, 13, 16]
assert list(df[2])==[3, 6, 9, 12, 15, 18]

Q2: With reference to the above, assign the columns the following labels: 'A', 'B', and 'C'.

Assign the rows the following labels: 'U','V','W','X','Y', and 'Z'

The result should still be df. Then, display the data frame.

In [ ]:
# YOUR CODE HERE
raise NotImplementedError()
In [ ]:
assert list(df['A'])==[1, 4, 7, 10, 13, 16]
assert list(df['C'])==[3, 6, 9, 12, 15, 18]
assert list(df.loc['X'])==[10, 11, 12]

Q3: In two steps:

  • Project the first two columns and assign the result to df1
  • From df1 select all the rows from index 3 onward and assign the result to df2

Then display the result.

Think about how you might accomplish the same in one line.

In [ ]:
# YOUR CODE HERE
raise NotImplementedError()
In [ ]:
assert list(df2['A'])==[10, 13, 16]
assert list(df2['B'])==[11, 14, 17]

Q5: Convert the following dictionary (a DoL representation) to a data frame called df and then assign to variable test1 the column vector associated with the scores of Test1.

{'Name' : ['Owen', 'Rachel', 'Kyle'], 
         'Test1' : [82, 85, 95], 
         'Test2' : [89, 85, 90], 
         'Test3' : [93, 85, 87]}
In [ ]:
# YOUR CODE HERE
raise NotImplementedError()
In [ ]:
assert df.shape == (3,4)
assert isinstance(test1, pd.core.series.Series)
assert list(test1) == [82, 85, 95]

Q6: Print the following:

  1. The test scores of Test1 (as a Series)
  2. Rachel's test scores (as a one-row DataFrame)
  3. Kyle's third test score (as a scalar integer)
  4. Using indexing, the scores of Owen and Rachel (as a two-row DataFrame)
In [ ]:
# YOUR CODE HERE
raise NotImplementedError()

Q7: Compute a column Series, assigned to final, that equals the average of the three test scores.

In [ ]:
# YOUR CODE HERE
raise NotImplementedError()
final
In [ ]:
assert isinstance(final, pd.core.series.Series)
assert list(final)[:2] == [88.0, 85.0]

Q8 Use the following data to create a DataFrame named df:

data = {'animal': ['cat','cat','snake','dog',
    'dog','cat','snake','cat','dog','dog'],
'age': [2.5, 3, 0.5, 7, 5, 2, 4.5, 4, 7, 3],
'visits': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
'priority': ['yes','yes','no','yes','no', 
             'no','no','yes','no', 'no']}

Using as index:

labels = ['a', 'b', 'c', 'd', 'e', 
          'f', 'g', 'h', 'i', 'j']

Then, carry out a selection of rows d through g, inclusive using row labels (Index elements), not positions. Assign the result to df2

In [ ]:
# YOUR CODE HERE
raise NotImplementedError()
In [ ]:
assert df.shape == (10,4)
assert df2.shape == (4,4)
assert 'd' in df2.index
assert 'g' in df2.index

Q9: In a single assignment statement, select the data in rows 2, 3, and 4 and project the columns 'animal' and 'age', assigning to df3.

In [ ]:
# YOUR CODE HERE
raise NotImplementedError()
df3
In [ ]:
assert df3.shape == (3, 2)
assert 'c' in df3.index
assert 'e' in df3.index
assert 'animal' in df3.columns
assert 'age' in df3.columns