{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Denison CS181/DA210 Homework\n", "\n", "Before you turn this problem in, make sure everything runs as expected. This is a combination of **restarting the kernel** and then **running all cells** (in the menubar, select Kernel$\\rightarrow$Restart And Run All).\n", "\n", "Make sure you fill in any place that says `YOUR CODE HERE` or \"YOUR ANSWER HERE\"." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "datadir = \"publicdata\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q1** Write a function\n", "\n", " lineLengths(filepath)\n", "\n", "that processes a file, line by line, and accumulates and returns a list of tuples, one per line. Each tuple consists of the two value of the line number and the length of the line, excluding any leading or trailing whitespace (spaces, tabs, or newlines). Line numbers in a file start at 1. So for the `hello.txt` file:\n", "\n", " Hello, world!\n", " Welcome to 'Introduction to Data Systems'.\n", "\n", "the result is the list of tuples: `[(1, 13), (2, 42)]`" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "c9ea476e9762fdce86b951cbf5169549", "grade": false, "grade_id": "cell-c5e9ddfe460baaf5", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "\n", "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "filepath = os.path.join(datadir, \"tennyson.txt\")\n", "lineLengths(filepath)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "0b0ed25cc1aeeccac7a20027a1c43e93", "grade": true, "grade_id": "cell-17559d5c71f8dc5b", "locked": true, "points": 3, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing cell\n", "\n", "filepath = os.path.join(datadir, \"tennyson.txt\")\n", "assert lineLengths(filepath) == [(1, 26), (2, 24), (3, 0), (4, 40), (5, 14)]\n", "\n", "filepath = os.path.join(datadir, \"hello.txt\")\n", "assert lineLengths(filepath) == [(1, 13), (2, 42)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q2:** Consider a variation of the babynames and counts file as depicted below:\n", "\n", " 22127, Jacob\n", " 18002, Ethan\n", " 17350, Michael\n", " 17179, Jayden\n", " 17051, William\n", " 16756, Alexander\n", "\n", "Each line of the file captures one data case/observation. The values are separated by commas, with the count occurring first and the name second, and spaces are used to align the columns of data to make it easier for a human reader.\n", "\n", "Write a function\n", "\n", " readNamesCounts(filepath)\n", "\n", "that processes the `filepath` file and yields a tuple whose first element is a reference to a list of names, and whose second element is a reference to a list of **integer** counts." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "8f06546a839f2c21fd05fc0ed63c3524", "grade": false, "grade_id": "cell-a10993da7de47401", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "path = os.path.join(datadir, \"babynames.txt\")\n", "assert os.path.isfile(path)\n", "namelist, countlist = readNamesCounts(path)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "d4191f3c14d8c6380aa964ccb26d30ed", "grade": true, "grade_id": "cell-dba242eb21fd0162", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing cell\n", "path = os.path.join(datadir, \"babynames.txt\")\n", "assert os.path.isfile(path)\n", "namelist, countlist = readNamesCounts(path)\n", "assert len(namelist) == 6\n", "assert len(countlist) == 6\n", "assert \"Jacob\" in namelist[0]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "a93254a9e11e6cf6c40a440445a1960c", "grade": true, "grade_id": "cell-abb7140c9dbbf8c7", "locked": true, "points": 1, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.3" } }, "nbformat": 4, "nbformat_minor": 4 }