Denison CS181/DA210 Homework

Before you turn this problem in, make sure everything runs as expected. To check, restart the kernel and then run all cells (in the menubar, select Kernel$\rightarrow$Restart And Run All).

Make sure you fill in every place that says "YOUR CODE HERE" or "YOUR ANSWER HERE".


In [ ]:
import json
import os
import os.path
import io
import sys
from contextlib import redirect_stdout

datadir = "publicdata"

Q1: In the data directory is a file named fib.json containing a JSON-encoded list of numbers from the Fibonacci sequence. Write a function

readJSONlist(path)

that reads and returns a list from path, assuming a JSON-encoded text file. If path does not exist, or if the result is a data structure other than a list, return the empty list. (Hint: the Python function isinstance() can be helpful to verify the type of a data structure.)

Then invoke the function on fib.json in datadir and assign the result to result.
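
As a small illustration of the isinstance() hint above, the sketch below decodes JSON text from in-memory strings and checks whether the result is a list. The helper name is_json_list and the sample strings are hypothetical; they are not part of the assignment and this is not the requested readJSONlist function.

```python
import json

def is_json_list(text):
    """Hypothetical helper: True if the JSON text decodes to a Python list."""
    value = json.loads(text)
    return isinstance(value, list)

# A JSON array decodes to a list; a JSON object decodes to a dict.
print(is_json_list('[1, 1, 2, 3]'))
print(is_json_list('{"a": 1}'))
```

The same kind of type check can be applied to whatever json.load returns when reading from a file.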

In [ ]:
# Solution cell

# YOUR CODE HERE
raise NotImplementedError()
In [ ]:
# Testing cell

assert result[-1] == 89
assert len(result) == 12

result2 = readJSONlist("./foobar.json")
assert result2 == []

result3 = readJSONlist(os.path.join(datadir, "config.json"))
assert result3 == []

Q2: Write a function

writeSquares(N, filepath)

that creates a list of the squares of integers from 1 to N and then writes them out as a JSON-encoded text file to the file location given by filepath. If N is less than or equal to 0, no file should be created. This function returns no value.
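
For reference, one way to write a Python list to a JSON-encoded text file is json.dump with an open file object. The sketch below uses a hypothetical list and a temporary directory (both assumptions made for illustration), and reads the file back to confirm the round trip.

```python
import json
import os
import tempfile

values = [2, 4, 6]  # hypothetical data, not the squares asked for above

with tempfile.TemporaryDirectory() as tmpdir:
    demo_path = os.path.join(tmpdir, "demo.json")  # hypothetical file name
    # Write the list as JSON text; the with block closes the file.
    with open(demo_path, "w") as f:
        json.dump(values, f)
    # Read it back to check that the same list is recovered.
    with open(demo_path, "r") as f:
        round_trip = json.load(f)

print(round_trip)
```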

In [ ]:
# Solution cell

# YOUR CODE HERE
raise NotImplementedError()
In [ ]:
# Testing cell

filepath = "./squares.json"
writeSquares(5, filepath)
with open(filepath, 'r') as f:
    result = json.load(f)
assert result == [1, 4, 9, 16, 25]
writeSquares(1, filepath)
with open(filepath, 'r') as f:
    result = json.load(f)
assert result == [1]
os.remove(filepath)
writeSquares(0, filepath)
assert not os.path.isfile(filepath)

Q3: In the data directory is a file named baby_2015_female_namecount.txt with a line for each of the top female baby names of 2015. Each line contains tab-separated values for the name and the count of US Social Security applications for that name. Start by writing a function

readNameCount(path)

that reads a file formatted this way, given by path, and returns a tuple of two lists, the first being the names and the second being the counts. The function should behave properly regardless of how many name-count lines are in the file. If the specified path does not exist, your function should return None. Make sure your count values are integers.

Then write code that uses your function to obtain the lists and writes the result to a JSON-encoded text file in the current directory named namecount.json.
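
The line format described above can be sketched on in-memory text. The example below splits each tab-separated line into a name and an integer count; the sample data and variable names are hypothetical, and io.StringIO stands in for an open file.

```python
import io

# Hypothetical two-line sample in the name<TAB>count format.
sample = io.StringIO("Alpha\t30\nBeta\t20\n")

names = []
counts = []
for line in sample:
    # Drop the trailing newline, then split on the tab character.
    name, count = line.rstrip("\n").split("\t")
    names.append(name)
    counts.append(int(count))  # convert the count text to an integer

pair = (names, counts)
print(pair)
```

When a tuple of two lists like this is passed to json.dump or json.dumps, it is encoded as a JSON array containing two arrays.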

In [ ]:
# Solution cell

# YOUR CODE HERE
raise NotImplementedError()
In [ ]:
# Testing cell

assert readNameCount("./foo") is None
result = readNameCount(os.path.join(datadir, "baby_2015_female_namecount.txt"))
assert len(result) == 2
assert len(result[0]) == 10
assert len(result[1]) == 10
jsonpath = os.path.join(".", "namecount.json")
assert os.path.isfile(jsonpath)
with open(jsonpath, 'r') as f:
    result2 = json.load(f)
assert len(result2) == 2
assert len(result2[0]) == 10
assert len(result2[1]) == 10

Q4: Consider the table of data:

| subject | name | department |
|---|---|---|
| CS | Computer Science | MATH |
| MATH | Mathematics | MATH |
| ENGL | English Literature | ENGL |

This could be represented in a Python program as a dictionary mapping from a key, "subjects", to a list of dictionaries, one per subject. These inner dictionaries map from "subject", "name", and "department" to the respective value for the row. Be careful and follow the above specification exactly.

Write Python code to construct the above representation as a Python in-memory data structure called ds, and then have your code encode the data structure into a JSON-formatted string called s. Print the string. Then encode ds as a JSON-formatted string again, this time passing indent=2 as an argument in the conversion; assign the result to s2 and print it.
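
To show what the indent argument changes, here is a minimal sketch using json.dumps on a small, hypothetical structure of the same shape (a dictionary mapping one key to a list of dictionaries). The data and the names demo, compact, and pretty are assumptions for illustration, not the ds, s, and s2 requested above.

```python
import json

# Hypothetical structure: one key mapping to a list of dictionaries.
demo = {"pets": [{"kind": "cat", "legs": 4}]}

compact = json.dumps(demo)           # single-line encoding
pretty = json.dumps(demo, indent=2)  # nested levels indented by 2 spaces

print(compact)
print(pretty)
```

Both strings decode back to the same data structure; only the whitespace differs.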

In [ ]:
# YOUR CODE HERE
raise NotImplementedError()
print(s)
print(s2)
In [ ]:
# Testing Cell

f = io.StringIO()
with redirect_stdout(f):
    print(s)
    
assert f.getvalue() == '{"subjects": [{"subject": "CS", "name": "Computer Science", "department": "MATH"}, {"subject": "MATH", "name": "Mathematics", "department": "MATH"}, {"subject": "ENGL", "name": "English Literature", "department": "ENGL"}]}\n'

f2 = io.StringIO()
with redirect_stdout(f2):
    print(s2)

assert f2.getvalue() == '{\n  "subjects": [\n    {\n      "subject": "CS",\n      "name": "Computer Science",\n      "department": "MATH"\n    },\n    {\n      "subject": "MATH",\n      "name": "Mathematics",\n      "department": "MATH"\n    },\n    {\n      "subject": "ENGL",\n      "name": "English Literature",\n      "department": "ENGL"\n    }\n  ]\n}\n'