Before you turn this problem in, make sure everything runs as expected. To check, restart the kernel and then run all cells (in the menubar, select Kernel$\rightarrow$Restart And Run All).
Make sure you fill in any place that says "YOUR CODE HERE" or "YOUR ANSWER HERE".
import json
import os
import os.path
import io
import sys
from contextlib import redirect_stdout
datadir = "publicdata"
Q1: In the data directory is a file named `fib.json` containing a JSON-encoded list of numbers from the Fibonacci sequence. Write a function `readJSONlist(path)` that reads and returns a list from `path`, assuming a JSON-encoded text file. If `path` does not exist, or if the result is a data structure other than a list, return the empty list. (Hint: the Python function `isinstance()` can be helpful to verify the type of a data structure.)
Then invoke the function on `fib.json` in `datadir` and assign the result to `result`.
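As a hedged illustration of the hint, the sketch below applies `isinstance()` to values decoded from JSON strings rather than from files. The helper name `decode_list_or_empty` and the sample strings are assumptions made for this example only; it is not the required `readJSONlist` and does no file handling.

```python
import json

def decode_list_or_empty(text):
    """Decode JSON text; return the value only if it is a list (hypothetical helper)."""
    value = json.loads(text)
    # isinstance() checks the runtime type of the decoded value.
    if isinstance(value, list):
        return value
    return []

from_list = decode_list_or_empty("[1, 1, 2, 3]")   # JSON array decodes to a Python list
from_dict = decode_list_or_empty('{"a": 1}')       # JSON object decodes to a dict
```

Here `from_list` keeps the decoded list, while `from_dict` falls back to the empty list because a JSON object decodes to a `dict`.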
# Solution cell
# YOUR CODE HERE
raise NotImplementedError()
# Testing cell
assert result[-1] == 89
assert len(result) == 12
result2 = readJSONlist("./foobar.json")
assert result2 == []
result3 = readJSONlist(os.path.join(datadir, "config.json"))
assert result3 == []
Q2: Write a function `writeSquares(N, filepath)` that creates a list of the squares of integers from 1 to `N` and then writes them out as a JSON-encoded text file to the file location given by `filepath`. If `N` is less than or equal to 0, no file should be created. This function returns no value.
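The general pattern of writing a list to a JSON text file can be sketched as follows. This is a minimal example under stated assumptions: the helper `write_json_list`, the file name `cubes.json`, and the use of a temporary directory are all hypothetical choices for the demonstration, and the sketch omits the `N <= 0` rule that `writeSquares` must implement.

```python
import json
import os
import tempfile

def write_json_list(values, filepath):
    """Write a list to filepath as JSON text (hypothetical helper, returns nothing)."""
    # The with-statement closes the file even if json.dump raises.
    with open(filepath, "w") as f:
        json.dump(values, f)

# Work in a temporary directory so the demonstration leaves no files behind.
with tempfile.TemporaryDirectory() as tmpdir:
    demo_path = os.path.join(tmpdir, "cubes.json")
    cubes = [n ** 3 for n in range(1, 4)]
    write_json_list(cubes, demo_path)
    # Read the file back to confirm the round trip.
    with open(demo_path, "r") as f:
        round_trip = json.load(f)
```

Using `range(1, 4)` covers the integers 1 through 3, since the upper bound of `range` is exclusive.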
# Solution cell
# YOUR CODE HERE
raise NotImplementedError()
# Testing cell
filepath = "./squares.json"
writeSquares(5, filepath)
with open(filepath, 'r') as f:
    result = json.load(f)
assert result == [1, 4, 9, 16, 25]
writeSquares(1, filepath)
with open(filepath, 'r') as f:
    result = json.load(f)
assert result == [1]
os.remove(filepath)
writeSquares(0, filepath)
assert not os.path.isfile(filepath)
Q3: In the data directory is a file named `baby_2015_female_namecount.txt` with a line for each of the top female baby names of 2015. Each line contains tab-separated values for the name and the count of US Social Security applications for that name. Start by writing a function `readNameCount(path)` that reads a file formatted this way, given by `path`, and returns a tuple of two lists, the first being the names and the second being the counts. The function should behave properly regardless of how many name-count lines are in the file. If the specified path does not exist, your function should return `None`. Make sure your count values are integers.
Then write code that uses your function to obtain the lists and writes the result to a JSON-encoded text file in the current directory named `namecount.json`.
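Splitting one tab-separated line into a name and an integer count might look like the sketch below. The helper `parse_name_count` and the sample lines are invented for illustration and are not taken from the data file; reading the file and handling a missing path are left to `readNameCount`.

```python
def parse_name_count(line):
    """Split 'name<TAB>count' into (name, int(count)) (hypothetical helper)."""
    # Strip the trailing newline before splitting on the tab character.
    name, count_text = line.rstrip("\n").split("\t")
    return name, int(count_text)

# Made-up sample lines in the same tab-separated layout.
sample_lines = ["Alpha\t30\n", "Beta\t12\n"]
pairs = [parse_name_count(line) for line in sample_lines]

# Separate the pairs into a list of names and a list of counts.
names = [name for name, _ in pairs]
counts = [count for _, count in pairs]
```

Converting with `int()` at parse time keeps the counts numeric, so they are encoded as JSON numbers rather than strings.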
# Solution cell
# YOUR CODE HERE
raise NotImplementedError()
# Testing cell
assert readNameCount("./foo") is None
result = readNameCount(os.path.join(datadir, "baby_2015_female_namecount.txt"))
assert len(result) == 2
assert len(result[0]) == 10
assert len(result[1]) == 10
jsonpath = os.path.join(".", "namecount.json")
assert os.path.isfile(jsonpath)
with open(jsonpath, 'r') as f:
    result2 = json.load(f)
assert len(result2) == 2
assert len(result2[0]) == 10
assert len(result2[1]) == 10
Q4: Consider the table of data:
subject | name | department |
---|---|---|
CS | Computer Science | MATH |
MATH | Mathematics | MATH |
ENGL | English Literature | ENGL |
This could be represented in a Python program as a dictionary mapping from a key, `"subjects"`, to a list of dictionaries, one per subject. These inner dictionaries map from `"subject"`, `"name"`, and `"department"` to the respective value for the row. Be careful and follow the above specification exactly.
Write Python code to construct the above representation as a Python in-memory data structure called `ds`, and then have your code encode the data structure into a JSON-formatted string called `s`. Print the string. Then take `ds` and encode it as a JSON-formatted string, but include `indent=2` as an argument in the conversion, assign the result to `s2`, and print it.
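To show the effect of `indent=2`, here is a small sketch that uses a different, made-up structure (the names `example`, `compact`, and `pretty` are assumptions for this illustration, not part of the assignment). It encodes the same dictionary twice with `json.dumps`, once with default settings and once with indentation.

```python
import json

# A small nested structure: a dict holding a list of dicts (made-up data).
example = {"items": [{"id": 1, "label": "a"}]}

# Default encoding produces a single line.
compact = json.dumps(example)

# indent=2 puts each element on its own line, two spaces per nesting level.
pretty = json.dumps(example, indent=2)

print(compact)
print(pretty)
```

Both strings decode back to the same structure; only the whitespace differs. Dictionary keys are emitted in insertion order, which is why the order in which a dictionary is built matters when comparing against an exact expected string.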
# YOUR CODE HERE
raise NotImplementedError()
print(s)
print(s2)
# Testing Cell
f = io.StringIO()
with redirect_stdout(f):
    print(s)
assert f.getvalue() == '{"subjects": [{"subject": "CS", "name": "Computer Science", "department": "MATH"}, {"subject": "MATH", "name": "Mathematics", "department": "MATH"}, {"subject": "ENGL", "name": "English Literature", "department": "ENGL"}]}\n'
f2 = io.StringIO()
with redirect_stdout(f2):
    print(s2)
assert f2.getvalue() == '{\n  "subjects": [\n    {\n      "subject": "CS",\n      "name": "Computer Science",\n      "department": "MATH"\n    },\n    {\n      "subject": "MATH",\n      "name": "Mathematics",\n      "department": "MATH"\n    },\n    {\n      "subject": "ENGL",\n      "name": "English Literature",\n      "department": "ENGL"\n    }\n  ]\n}\n'