Denison CS181/DA210 Homework

Before you turn this problem in, make sure everything runs as expected. This is a combination of restarting the kernel and then running all cells (in the menubar, select Kernel$\rightarrow$Restart And Run All).

Make sure you fill in any place that says YOUR CODE HERE or "YOUR ANSWER HERE".


Homework: HTTP Requests

In [ ]:
import os
import os.path
import sys
import importlib

if os.path.isdir(os.path.join("../../..", "modules")):
    module_dir = os.path.join("../../..", "modules")
else:
    module_dir = os.path.join("../..", "modules")

module_path = os.path.abspath(module_dir)
if not module_path in sys.path:
    sys.path.append(module_path)

import mysocket as sock
importlib.reload(sock)

Q1 The requests module uses URLs as the first argument to its HTTP method functions, but we often start with the "piece parts" of the information contained in a URL. Write a function

buildURL(location, resource, protocol='http')

that returns a string URL based on the three component parts of protocol, location, and resource. Your function should be flexible, so that if a user omits a leading \ on the resource path, one is prepended. Note that we are specifying a default value for protocol so that it will use http if buildURL is called with just two or three arguments. Python format strings are the right tool for the job here.

In [ ]:
# Solution cell

# YOUR CODE HERE
raise NotImplementedError()

print(buildURL('httpbin.org', 'get'))
print(buildURL("datasystems.denison.edu",
               "/data/ind0.json", protocol="https"))
print(buildURL('httpbin.org', 'post'))
In [ ]:
assert True

Q2 Write a sequence of code that starts with:

resource = "/data/ind0.json"
location = "datasystems.denison.edu"

and build an appropriate URL, uses requests to issue a GET request, and assigns the variables based on the result:

  • status: has the integer status code,
  • headers: has a dictionary of headers from the response, and
  • data has the parsed data from the JSON-formatted body
In [ ]:
# YOUR CODE HERE
raise NotImplementedError()
print("Status:", status)
print(headers)
print(data)
In [ ]:
assert True

Q3 Suppose you often coded a similar set of steps to make a GET request, where often the body of the result was JSON, in which case you wanted the data parsed, but sometimes the data was not JSON, in which case you wanted the data as a string. Write a function

 makeRequest(location, resource, protocol="http")

that makes a GET request to the given location, resource, and protocol. If the request is not successful (i.e. not in the 200's), the function should check for this and return None. If the request is successful, the function should use the response headers and determine whether or not the Content-Type header maps to application/json. If it is, it should parse the result and return the data structure. If it is not, it should return the string making up the body of the response.

In [ ]:
# YOUR CODE HERE
raise NotImplementedError()
In [ ]:
assert True

Q4 You have probably had the experience before of trying to open a webpage, and having a redirect page pop up, telling you that the page has moved and asking if you want to be redirected. The same thing can happen when we write code to make requests. Write a function:

getRedirectURL(location, resource)


that begins like your function makeRequest but does not allow redirects when invoking get. This function will return a url. If the get results in a success status code (one in the 200's), you return the original url (obtained from buildURL, with http protocol). If you detect that get tried to redirect (by looking for a 300, 301, or 302 status code), search within the headers to find the "Location" it tried to redirect to, and return that URL instead. If you get any other status code, return None.

In [ ]:
# Solution cell

# YOUR CODE HERE
raise NotImplementedError()

print(getRedirectURL("personal.denison.edu", '/~kretchmar'))
print(getRedirectURL("personal.denison.edu", '/~kretchmar'))
print(getRedirectURL("personal.denison.edu", '/~whiteda/DenisonWebsiteInfo.pdf'))
In [ ]:
assert True

Q5 The section discussed how to add custom headers along with a get request. This is important in a number of contexts, e.g., when you request a large file, it might make sense to ask it to be compressed for transit. This example motivates our assert statements below. Given parallel lists headerNameList and headerValueList, you can build a dictionary that maps from header names to their associated values (given by the parallel structure). Write a function

makeRequestHeader(location, resource, headerNameList, 
                                      headerValueList)


that builds a custom header dictionary and then passes it to the get method. Your function should call buildURL (with protocol https) to build the url to pass to get. Your function should return the response from the get invocation.

In [ ]:
# Solution cell

# YOUR CODE HERE
raise NotImplementedError()

r = getCustomHeader('httpbin.org', '/get/', ['Transfer-Encoding'],['compress'])
request = r.request
print(request.headers)
print()

r = getCustomHeader('httpbin.org', '/get/',['Transfer-Encoding','Accept'],['compress','text/html'])
request = r.request
print(request.headers)
print()

r = getCustomHeader('httpbin.org', '/get/',[],[])
request = r.request
print(request.headers)
In [ ]:
# Testing cell

r = getCustomHeader('httpbin.org', '/get/',['Transfer-Encoding'],['compress'])
request = r.request
assert request.headers['Transfer-Encoding'] == 'compress'
assert request.headers['Connection'] == 'keep-alive'
r = getCustomHeader('httpbin.org','/get/',['Transfer-Encoding','Accept'],['compress','text/html'])
request = r.request
assert request.headers['Accept'] == 'text/html'
assert request.headers['Transfer-Encoding'] == 'compress'

Q6 Please write a function

postData(location, resource, dataToPost)


that uses your buildURL function to build a URL (using https), then uses the requests module to post dataToPost to that URL (note: dataToPost will be the body of the message you send). Please return the response returned by the method of requests that you invoke. Note: the URL https://httpbin.org/post is set up to allow you to post there.

In [ ]:
# Solution cell

# YOUR CODE HERE
raise NotImplementedError()

response = postData('httpbin.org','/post',"Wow, what a cool string!")
print(response.status_code)
print(response.request.body)
In [ ]:
# Testing cell

response = postData('httpbin.org','/post',"CS181 is the best")
r = response.request
assert r.method == 'POST'
assert r.body == 'CS181 is the best'
assert r.url == 'https://httpbin.org/post'