Denison CS181/DA210 Homework

Before you turn this problem in, make sure everything runs as expected. To check, restart the kernel and then run all cells (in the menubar, select Kernel$\rightarrow$Restart And Run All).

Make sure you fill in any place that says "YOUR CODE HERE" or "YOUR ANSWER HERE".


HTTP Introductory Homework

This set of exercises focuses on "raw" HTTP, where requests are built from Python strings and replies consist of the full set of bytes as received over the network from a connected server. For simple interaction with the sockets interface, the exercises presume use of the mysocket module described in the book and available at the author's web site for the book. This module should be added to the user's environment so that an import mysocket is possible before solving these exercises. See Appendix A.2 in the book for documentation on the mysocket module.

In [ ]:
import os
import os.path
import sys
import importlib

if os.path.isdir(os.path.join("../../..", "modules")):
    module_dir = os.path.join("../../..", "modules")
else:
    module_dir = os.path.join("../..", "modules")

module_path = os.path.abspath(module_dir)
if module_path not in sys.path:
    sys.path.append(module_path)

import mysocket as sock
importlib.reload(sock)

Socket Programming Requests

The first set of exercises is about making requests.

Q1 Suppose we wish to retrieve (GET) a file via HTTP (so port 80) from datasystems.denison.edu. The resource path of the file is /data/ind0.json. We wish to use version 1.1 of HTTP and to request that the connection be closed after a single request/reply exchange. We will need a header line to satisfy the HTTP 1.1 requirement of a valid Host header. Write a sequence of code to compose a valid HTTP request as a Python string, and assign the result to message.
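As a reminder of the general shape of an HTTP/1.1 request, here is a minimal sketch for a different, hypothetical host and resource path (www.example.com and /index.html are assumptions chosen only for illustration, as is the variable name example_request). It shows one way to assemble the request line and two header lines, each terminated by "\r\n", followed by the empty line that ends the headers.

```python
# Hypothetical host and path, chosen only for illustration.
host = "www.example.com"
path = "/index.html"

lines = [
    f"GET {path} HTTP/1.1",  # request line: method, resource path, version
    f"Host: {host}",         # Host header, required by HTTP/1.1
    "Connection: close",     # ask the server to close after one exchange
]

# Joining with CRLF terminates the first two lines; the final "\r\n\r\n"
# terminates the last header and adds the empty line that ends the request.
example_request = "\r\n".join(lines) + "\r\n\r\n"
print(repr(example_request))
```

Printing with repr makes the otherwise invisible "\r\n" sequences visible, which helps when checking how many line terminators a request contains.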

In [ ]:
# YOUR CODE HERE
raise NotImplementedError()
print(message)
print("--------------------")
In [ ]:
assert type(message) == str
assert message[:3] == "GET"
assert message[4:4+len("/data/ind0.json")] == "/data/ind0.json"
assert "Host: datasystems.denison.edu" in message
assert message.count('\r\n') == 4
assert message[-4:] == '\r\n\r\n'

Q2 Write a sequence of code to establish a connection to the host datasystems.denison.edu at port 80, to send the string message from the previous problem to the host, receive the reply from the host until the server closes the connection, assigning the reply to reply, and close the connection. Note: if the request is not completely correct, a network connection can wait forever for a reply that will never come. So if you have difficulty here, double check your answer to the previous problem.
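For comparison only, the same connect, send, receive-until-close, and close steps can be sketched with Python's standard library socket module. This is not the mysocket interface (see Appendix A.2 for that); the function names recv_until_closed and fetch, the buffer size, and the 10-second timeout are all assumptions made for this illustration.

```python
import socket

def recv_until_closed(conn, bufsize=4096):
    """Read from a connected socket until the peer closes the connection."""
    chunks = []
    while True:
        chunk = conn.recv(bufsize)
        if not chunk:          # an empty bytes object signals end of stream
            break
        chunks.append(chunk)
    return b"".join(chunks)

def fetch(host, request, port=80):
    """Send a request string to host:port and return the full reply as a str."""
    # The timeout (an assumed value) keeps a malformed request from
    # blocking forever while waiting for a reply that never comes.
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(request.encode("ascii"))
        raw = recv_until_closed(conn)
    # Leaving the with-block closes the client side of the connection.
    return raw.decode("utf-8", errors="replace")
```

The receive loop relies on the server honoring a request to close the connection: recv returns an empty bytes object only once the server has closed its side.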

In [ ]:
# YOUR CODE HERE
raise NotImplementedError()
print(reply)
In [ ]:
assert type(reply) == str
assert "200 OK" in reply
assert "application/json" in reply
assert reply.endswith("19485.4}}}")

Q3 Suppose we want to generalize the scenario from the first exercise, where the two things that can change are the host location and the resource path. For example, we might want to change the host to httpbin.org and the resource path to /, or many other combinations. Write a function

buildRequest(location, resource)

that constructs and returns a Python string containing a valid HTTP GET request that incorporates the parameters location and resource into the request at the appropriate places, and includes the appropriate header lines (for the required Host and to request the server close the connection after the exchange).

In [ ]:
# YOUR CODE HERE
raise NotImplementedError()
print(buildRequest("httpbin.org", "/get"))
print("---------------------")
In [ ]:
r1 = buildRequest("datasystems.denison.edu", "/data/ind0.json")
assert r1[:3] == "GET"
assert r1[4:4+len("/data/ind0.json")] == "/data/ind0.json"
assert "Host: datasystems.denison.edu" in r1
assert r1.count('\r\n') == 4
assert r1[-4:] == '\r\n\r\n'
r2 = buildRequest("httpbin.org", "/get")
assert r2[:3] == "GET"
assert r2[4:4+len("/get")] == "/get"
assert "Host: httpbin.org" in r2
assert r2.count('\r\n') == 4
assert r2[-4:] == '\r\n\r\n'

Q4 Write a function

makeRequest(location, resource)

that first constructs a valid HTTP GET request for resource at host location, as a Python string (using your function from the previous question), and then performs the request-reply steps of making the connection, sending the string request, receiving a reply until the connection closes, and finally closing the client side of the connection. The function should return the reply.

In [ ]:
# YOUR CODE HERE
raise NotImplementedError()
print(makeRequest("datasystems.denison.edu", "/basic.html"))
In [ ]:
resp1 = makeRequest("datasystems.denison.edu", "/basic.html")
#print(resp1)
assert "200 OK" in resp1
assert "text/html" in resp1
assert resp1.endswith("</html>\n")

resp2 = makeRequest("datasystems.denison.edu", "/data/ind0.json")
#print(resp2)
assert "200 OK" in resp2
assert "application/json" in resp2
assert resp2.endswith("19485.4}}}")

resp3 = makeRequest("httpbin.org", "/get")
#print(resp3)
assert "200 OK" in resp3
assert "application/json" in resp3
assert resp3.endswith(""""url": "http://httpbin.org/get"\n}\n""")

Programming Response Replies

The next set of exercises is about parsing the reply that results from a request. An HTTP reply can be partitioned into a status line, the set of headers, and the body. The exercises ask for functions that, given a reply, parse it and return each of these pieces.
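To make the three-way partition concrete, here is a minimal sketch on a small hand-written reply, not one received from a server. The reply text and the function name split_reply are assumptions for illustration, and this is only one possible approach: it finds the empty line with str.partition and then separates the first line from the rest.

```python
# A hand-written reply used only for illustration.
canned_reply = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/plain\r\n"
    "Connection: close\r\n"
    "\r\n"
    "hello\n"
)

def split_reply(reply):
    """Return (status_line, headers, body) for a reply string."""
    # The first "\r\n\r\n" ends the last header and marks the empty line.
    head, _, body = reply.partition("\r\n\r\n")
    # The first "\r\n" in the head ends the status line.
    status, _, headers = head.partition("\r\n")
    # Restore the terminators consumed by partition, but not the empty line.
    status += "\r\n"
    if headers:
        headers += "\r\n"
    return status, headers, body

status, headers, body = split_reply(canned_reply)
print(repr(status))
print(repr(headers))
print(repr(body))
```

The sketch assumes the reply contains an empty line; a reply without one would be treated as having no body.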

Q5: Write a function

parseStatus(reply)

that finds and returns a Python string consisting of only the status line of a reply. The returned value should include the line-terminating "\r\n".

In [ ]:
# YOUR CODE HERE
raise NotImplementedError()
reply = makeRequest("datasystems.denison.edu", "/basic.html")
print(repr(parseStatus(reply)))
reply = makeRequest("datasystems.denison.edu", "/foobar.txt")
print(repr(parseStatus(reply)))
In [ ]:
r1 = makeRequest("datasystems.denison.edu", "/basic.html")
s1 = parseStatus(r1)
assert s1 == "HTTP/1.1 200 OK\r\n"

r2 = makeRequest("datasystems.denison.edu", "/foobar.txt")
s2 = parseStatus(r2)
assert s2 == "HTTP/1.1 404 Not Found\r\n"

Q6: Write a function

parseHeaders(reply)

that finds and returns a single Python string that starts with the first header in the reply and continues through the last header in the reply, including the "\r\n" that terminates the last header, but not the empty line separating the headers from the body.

In [ ]:
# YOUR CODE HERE
raise NotImplementedError()
reply = makeRequest("datasystems.denison.edu", "/basic.html")
print(repr(parseHeaders(reply)))
reply = makeRequest("datasystems.denison.edu", "/foobar.txt")
print(repr(parseHeaders(reply)))
In [ ]:
r1 = makeRequest("datasystems.denison.edu", "/basic.html")
h1 = parseHeaders(r1)
assert "Server: Apache" in h1
assert "Connection: close\r\n" in h1
assert "Content-Type: text/html" in h1
r2 = makeRequest("datasystems.denison.edu", "/foobar.txt")
h2 = parseHeaders(r2)
assert "Server: Apache" in h2
assert "Connection: close\r\n" in h2
assert "Content-Type: text/html" in h2

Q7: Write a function

parseBody(reply)

that finds and returns a single Python string that starts with the beginning of the body (i.e. after the empty line of the reply) and continues to the end of the reply.

In [ ]:
# YOUR CODE HERE
raise NotImplementedError()
reply = makeRequest("datasystems.denison.edu", "/basic.html")
print(parseBody(reply))
reply = makeRequest("datasystems.denison.edu", "/foobar.txt")
print(parseBody(reply))
In [ ]:
r1 = makeRequest("datasystems.denison.edu", "/basic.html")
b1 = parseBody(r1)
r2 = makeRequest("datasystems.denison.edu", "/foobar.txt")
b2 = parseBody(r2)
assert b1.startswith("<!DOCTYPE html>")
assert b1.endswith("</html>\n")
assert b2.startswith("<!DOCTYPE HTML")
assert b2.endswith("</body></html>\n")