Denison CS181/DA210 Homework¶

Before you turn this problem in, make sure everything runs as expected. This is a combination of restarting the kernel and then running all cells (in the menubar, select Kernel$\rightarrow$Restart And Run All).

Make sure you fill in any place that says YOUR CODE HERE or "YOUR ANSWER HERE".

import os
import os.path
import io
import sys

datadir = "publicdata"

Q1 Write a function

makeDict(L)

that creates a dictionary whose keys are the items of L and whose values are the corresponding indices of L. You may assume L is a list without any duplicate items.

# Solution cell

# YOUR CODE HERE
raise NotImplementedError()

# Testing cell

L1 = [5, 4, 3, 2, 1]
expect1 = {5: 0, 4: 1, 3: 2, 2: 3, 1:4}
assert makeDict(L1) == expect1
L2 = "Piotr Piper Picked a Pack of Pickled Peppers".split()
expect2 = {'Piotr': 0, 'Piper': 1, 'Picked': 2, "a": 3, "Pack":4, "of": 5, "Pickled": 6, "Peppers": 7}
assert makeDict(L2) == expect2

Q2 Write a function

filmography(L)

whose parameter L is a list of tuples of the form (actor, movie), and creates and returns a dictionary where the keys are actor names and the values are lists of movies that actor has been in. For example, given

L = [(‘DiCaprio’,‘Revenant’),
    (‘Cumberbatch’,‘Hobbit’),
    (‘DiCaprio’,‘Gatsby’),
    (‘DiCaprio’,‘Django’)]

then you should return

{‘DiCaprio’:[‘Revenant’,‘Gatsby’,‘Django’],
‘Cumberbatch’:[‘Hobbit’]}

# YOUR CODE HERE
raise NotImplementedError()

L = [('DiCaprio','Revenant'),
    ('Cumberbatch','Hobbit'),
    ('DiCaprio','Gatsby'),
    ('DiCaprio','Django')]
D = filmography(L)
print(D)

L = [('DiCaprio','Revenant'),
    ('Cumberbatch','Hobbit'),
    ('DiCaprio','Gatsby'),
    ('DiCaprio','Django')]
D = filmography(L)
assert len(D) == 2
assert 'DiCaprio' in D.keys()
assert 'Django' in D['DiCaprio']
assert 'Cumberbatch' in D.keys()
assert D['Cumberbatch'] == ['Hobbit']

Q1 Write a function

readNamesCounts(path)

that reads from the file at location path and returns a dictionary with the information from the file. The format of the file is a CSV file. The first line is a comma separated set of string names for the columns contained in the file. In this case year,name,count. Subsequent lines have, on each line, comma separated values for a data mapping, where the first value gives the independent variable value of the year, and the second and third values on the line give the string name and the integer count.

The returned dictionary should map from years to two-tuples consisting of a name and an integer count. Make sure your year keys are integers. Assign to variable female the dictionary obtained from topfemale.csv and assign to variable male the dictionary obtained from topmale.csv. Both files may be found in the datadir directory.

# Solution cell

# YOUR CODE HERE
raise NotImplementedError()

# Testing cell

femalepath = os.path.join(datadir, "topfemale.csv")
females = readNamesCounts(femalepath)
assert len(females) == 139
assert 1880 in females
assert 2018 in females
assert females[1880] == ('Mary', 7065)
assert females[2018] == ('Emma', 18688)

malepath = os.path.join(datadir, "topmale.csv")
males = readNamesCounts(malepath)
assert len(females) == 139
assert 1880 in females
assert 2018 in females
assert males[1880] == ('John', 9655)
assert males[2018] == ('Liam', 19837)