{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Denison CS181/DA210 Homework\n", "\n", "Before you turn this problem in, make sure everything runs as expected. This is a combination of **restarting the kernel** and then **running all cells** (in the menubar, select Kernel$\\rightarrow$Restart And Run All).\n", "\n", "Make sure you fill in any place that says `YOUR CODE HERE` or \"YOUR ANSWER HERE\"." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Homework: HTTP Requests" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "4425ca38ce8727c571bf1f16ffe1e078", "grade": false, "grade_id": "cell-b9abcf27cf7faf8f", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "import os\n", "import os.path\n", "import sys\n", "import importlib\n", "\n", "if os.path.isdir(os.path.join(\"../../..\", \"modules\")):\n", " module_dir = os.path.join(\"../../..\", \"modules\")\n", "else:\n", " module_dir = os.path.join(\"../..\", \"modules\")\n", "\n", "module_path = os.path.abspath(module_dir)\n", "if not module_path in sys.path:\n", " sys.path.append(module_path)\n", "\n", "import mysocket as sock\n", "importlib.reload(sock)" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "markdown", "checksum": "be08f9e29506dd0505f6498bc73c2d46", "grade": false, "grade_id": "cell-6984f53f77bcaf57", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "source": [ "**Q1** The `requests` module uses URLs as the first argument to its HTTP method functions, but we often start with the \"piece parts\" of the information contained in a URL. Write a function\n", "\n", " buildURL(location, resource, protocol='http')\n", "\n", "that returns a string URL based on the three component parts of `protocol`, `location`, and `resource`. Your function should be flexible, so that if a user omits a leading `\\` on the resource path, one is prepended. Note that we are specifying a default value for `protocol` so that it will use `http` if `buildURL` is called with just two or three arguments. Python format strings are the right tool for the job here." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "8dc67c470c7c9777f7556721f25e606c", "grade": false, "grade_id": "cell-5da6f75039e1ff23", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "\n", "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "\n", "print(buildURL('httpbin.org', 'get'))\n", "print(buildURL(\"datasystems.denison.edu\",\n", " \"/data/ind0.json\", protocol=\"https\"))\n", "print(buildURL('httpbin.org', 'post'))\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "51ec12f0854f129f066997994020da06", "grade": true, "grade_id": "cell-9b6cc297980f6504", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "assert True" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "markdown", "checksum": "7ef49da7c2fc003add50b7f9c19e59e0", "grade": false, "grade_id": "cell-b8fe8d6187df0f9e", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "source": [ "**Q2** Write a sequence of code that starts with:\n", "\n", " resource = \"/data/ind0.json\"\n", " location = \"datasystems.denison.edu\"\n", "\n", "and build an appropriate URL, uses `requests` to issue a GET request, and assigns the variables based on the result:\n", "\n", "- `status`: has the integer status code,\n", "- `headers`: has a dictionary of headers from the response, and\n", "- `data` has the *parsed* data from the JSON-formatted body" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "d46a6c662b3c1b781f131ccb87e68fb7", "grade": false, "grade_id": "cell-28b50f9ad2387f75", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "print(\"Status:\", status)\n", "print(headers)\n", "print(data)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "aed522f0addaa458907d61a180f90d61", "grade": true, "grade_id": "cell-8cb886623157ee0a", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "assert True" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "markdown", "checksum": "4d6ad006f9fac21247c4bebae1c8dc63", "grade": false, "grade_id": "cell-1c6b1bb5fd4133c9", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "source": [ "**Q3** Suppose you often coded a similar set of steps to make a GET request, where often the body of the result was JSON, in which case you wanted the data parsed, but sometimes the data was *not* JSON, in which case you wanted the data as a string. Write a function\n", "\n", " makeRequest(location, resource, protocol=\"http\")\n", "\n", "that makes a GET request to the given `location`, `resource`, and `protocol`. If the request is *not* successful (i.e. not in the 200's), the function should check for this and return `None`. If the request is successful, the function should *use the response headers* and determine whether or not the `Content-Type` header maps to `application/json`. If it is, it should parse the result and return the data structure. If it is not, it should return the string making up the body of the response." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "09248707445c6de2d0704e5c83cf8230", "grade": false, "grade_id": "cell-f8d26c1ed859a1d0", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "04776c3919fd77c0a68378f3dce1cfc2", "grade": true, "grade_id": "cell-f4c288ebaac4a3a5", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "assert True" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "markdown", "checksum": "de52fb257f11e9527fc30aad363d3e38", "grade": false, "grade_id": "cell-b060732f8c3b0067", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "source": [ "**Q4** You have probably had the experience before of trying to open a webpage, and having a redirect page pop up, telling you that the page has moved and asking if you want to be redirected. The same thing can happen when we write code to make requests. Write a function:\n", "\n", "\n", " getRedirectURL(location, resource)\n", "\n", "\n", "that begins like your function `makeRequest` but does *not* allow redirects when invoking `get`. This function will return a *url*. If the `get` results in a success status code (one in the 200's), you return the original url (obtained from `buildURL`, with `http` protocol). If you detect that `get` tried to redirect (by looking for a 300, 301, or 302 status code), **search within the headers** to find the `\"Location\"` it tried to redirect to, and return that URL instead. If you get any other status code, return `None`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "b70b6cb9c16f7352494400072270d91e", "grade": false, "grade_id": "cell-a62e6f16485ec03b", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "\n", "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "\n", "print(getRedirectURL(\"personal.denison.edu\", '/~kretchmar'))\n", "print(getRedirectURL(\"personal.denison.edu\", '/~kretchmar'))\n", "print(getRedirectURL(\"personal.denison.edu\", '/~whiteda/DenisonWebsiteInfo.pdf'))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "a49bed853aa10bddc1368b7b6851ac48", "grade": true, "grade_id": "cell-af47edfd14756d48", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "assert True" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "markdown", "checksum": "80fcea45742f4c8a5157ea8789f51175", "grade": false, "grade_id": "cell-05a1156011f4b5ac", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "source": [ "**Q5** The section discussed how to add custom headers along with a get request. This is important in a number of contexts, e.g., when you request a large file, it might make sense to ask it to be compressed for transit. This example motivates our assert statements below. Given parallel lists `headerNameList` and `headerValueList`, you can build a dictionary that maps from header names to their associated values (given by the parallel structure). Write a function\n", "\n", "\n", " makeRequestHeader(location, resource, headerNameList, \n", " headerValueList)\n", "\n", "\n", "that builds a custom header dictionary and then passes it to the `get` method. Your function should call `buildURL` (with protocol `https`) to build the url to pass to `get`. Your function should return the response from the `get` invocation." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "f8e6af32d934acd9eaa48d6ee7ec205b", "grade": false, "grade_id": "cell-cab605385bd20904", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "\n", "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "\n", "r = getCustomHeader('httpbin.org', '/get/', ['Transfer-Encoding'],['compress'])\n", "request = r.request\n", "print(request.headers)\n", "print()\n", "\n", "r = getCustomHeader('httpbin.org', '/get/',['Transfer-Encoding','Accept'],['compress','text/html'])\n", "request = r.request\n", "print(request.headers)\n", "print()\n", "\n", "r = getCustomHeader('httpbin.org', '/get/',[],[])\n", "request = r.request\n", "print(request.headers)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "6fbb02170cf2f7d295bb28fb9c67cf17", "grade": true, "grade_id": "cell-b5e42afd442f334c", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing cell\n", "\n", "r = getCustomHeader('httpbin.org', '/get/',['Transfer-Encoding'],['compress'])\n", "request = r.request\n", "assert request.headers['Transfer-Encoding'] == 'compress'\n", "assert request.headers['Connection'] == 'keep-alive'\n", "r = getCustomHeader('httpbin.org','/get/',['Transfer-Encoding','Accept'],['compress','text/html'])\n", "request = r.request\n", "assert request.headers['Accept'] == 'text/html'\n", "assert request.headers['Transfer-Encoding'] == 'compress'\n" ] }, { "cell_type": "markdown", "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "markdown", "checksum": "b722b5b48206ab5f54fd2ce4b4979d15", "grade": false, "grade_id": "cell-4ef3aff7a10e1ba4", "locked": true, "schema_version": 3, "solution": false, "task": false } }, "source": [ "**Q6** Please write a function\n", "\n", "\n", " postData(location, resource, dataToPost)\n", "\n", "\n", "that uses your `buildURL` function to build a URL (using `https`), then uses the `requests` module to post `dataToPost` to that URL (note: `dataToPost` will be the *body* of the message you send). Please return the `response` returned by the method of `requests` that you invoke. Note: the URL `https://httpbin.org/post` is set up to allow you to post there." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "b5030907d20fecf98e229cbfc07e8f1f", "grade": false, "grade_id": "cell-41c7aa6097a4d5bc", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "\n", "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "\n", "response = postData('httpbin.org','/post',\"Wow, what a cool string!\")\n", "print(response.status_code)\n", "print(response.request.body)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "db04c7f3ac1bb601e53918a97880b3a2", "grade": true, "grade_id": "cell-9777279716215ad7", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing cell\n", "\n", "response = postData('httpbin.org','/post',\"CS181 is the best\")\n", "r = response.request\n", "assert r.method == 'POST'\n", "assert r.body == 'CS181 is the best'\n", "assert r.url == 'https://httpbin.org/post'" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" } }, "nbformat": 4, "nbformat_minor": 4 }