{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Denison CS181/DA210 Homework\n", "\n", "Before you turn this problem in, make sure everything runs as expected. This is a combination of **restarting the kernel** and then **running all cells** (in the menubar, select Kernel$\\rightarrow$Restart And Run All).\n", "\n", "Make sure you fill in any place that says `YOUR CODE HERE` or \"YOUR ANSWER HERE\"." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "> **In the questions that follow, we are looking for XPath declarative solutions to the problems, not procedural solutions. You will only get 1/2 credit for procedural solutions.**\n", "\n", "Please begin by importing whatever modules you need, reading in and parsing the relevant datasets, and familiarizing yourself with them." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q1:** Using the provided `bookstore.xml` file, create a Python list called \"books\" containing the **titles** of all books. Your list `books` should be a list of strings." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "e45d082a7c13fa6084eea136d0398e6e", "grade": false, "grade_id": "cell-1d693d29dd846dd8", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "books = []\n", "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "print(books)\n", "type(books[0])" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "c580f2492fb5120ba6270c71f59ae05d", "grade": true, "grade_id": "cell-a8b4fd3cbd5c6d8d", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "assert len(books) > 0 and type(books[0]) is etree._ElementUnicodeResult\n", "assert 'Lover Birds' in books and 'Splish Splash' in books\n", "assert len(books)==12" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q2:** Create a list named `less` containing the ids of all books that cost less than `$6`. Note that `id` is an attribute."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "1b2cdec29dc3a01b9b3afec57e55ab2c", "grade": false, "grade_id": "cell-6e3c08a87e87befc", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "less = []\n", "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "less" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "33b725dd2e2682a54addcf00aa428479", "grade": true, "grade_id": "cell-caa0de2ac1828f14", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "assert len(less) > 0 and type(less[0]) is etree._ElementUnicodeResult\n", "assert 'bk104' in less\n", "assert 'bk101' not in less\n", "assert len(less)==7" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q3:** Create a list called `eva` containing the titles of all books written by Eva Corets. Your list `eva` should be a list of strings."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "212112f8a8aee363c85113cde65e33c5", "grade": false, "grade_id": "cell-7c7d0674f1ec9421", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "eva = []\n", "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "eva" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "a9345d699d683b760b053939b1754738", "grade": true, "grade_id": "cell-331030ba75384a5c", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "assert len(eva) > 0 and type(eva[0]) is etree._ElementUnicodeResult\n", "assert len(eva)==3\n", "assert 'Maeve Ascendant' in eva\n", "assert 'Paradox Lost' not in eva" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q4:** Find the average price of all books in this file that are not in the fantasy genre, and assign it to the variable `avgprice`. **Hints:** First, use a single XPath query to get a list of the price strings (text). Then use a list comprehension to convert those strings into a list of `float` values. Finally, compute the average from the sum of the values and the length of the list."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "f287cb4325ef5c210c792b0d0a4dbf90", "grade": false, "grade_id": "cell-6d115874424475c8", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "avgprice = 0\n", "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "avgprice" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "38a05f1ddc522aa0abfc25f05d492d40", "grade": true, "grade_id": "cell-5272bfd99dcd3cab", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "assert(avgprice > 23.82)\n", "assert(avgprice < 24)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q5:** Create a list called `lessFantasy` containing the titles of all books that are priced under `$40` and are not in the fantasy genre."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "e49f6dd27ec9fea18c16417288e7c1e8", "grade": false, "grade_id": "cell-f712efddd8ca1dbd", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "lessFantasy = []\n", "# YOUR CODE HERE\n", "raise NotImplementedError()\n", "lessFantasy" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "a6be7531a0f1f0115d455e154b2621e5", "grade": true, "grade_id": "cell-d20012db748daea5", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "assert len(lessFantasy)==6\n", "assert 'Paradox Lost' in lessFantasy\n", "assert 'Maeve Ascendant' not in lessFantasy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q6:** Using `countries.xml`, generate a list of all the countries in the `countries.xml` file, assigning to a variable `countries`; then assign the number of countries to the variable `countrycount`. When you read in and parse the file, please name the root element `croot`." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "b0b2d0bb01a21a24f4474b9cc1f9184a", "grade": false, "grade_id": "cell-25002dba76a305fa", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "ef3f4dbe66a79ef0f9e8c13f65d0bfb9", "grade": true, "grade_id": "cell-e6b57802a3191252", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "assert(countrycount == 231)\n", "assert('Uruguay' in countries)\n", "assert type(croot) is etree._Element" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q7:** Write a function `findPop(root,country)` that finds the population of a given `country` in the dataset `countries.xml`. Use an XPath expression and a format string. Return your answer as an integer." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "2b64775959c8c67cb943c1411cc94d35", "grade": false, "grade_id": "cell-5f336a2f4c0c938e", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# Solution cell\n", "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "e8c84a7787e151ea7b0e5f3142465e62", "grade": true, "grade_id": "cell-8e1e34e520cb833f", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "# Testing cell\n", "assert findPop(croot,'Cuba') == 10951334\n", "assert findPop(croot,'Uruguay') == 3238952" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q8:** Study the `countries` data carefully. Then use the `position()` function to create a node set consisting of, for each country in positions 5-55 inclusive, the population of the second city listed, provided at least two cities are listed. For example, the node set contains nothing for Aruba (no cities listed) or Armenia (only Yerevan listed), but it does contain the population of Cordoba, the second city listed for Argentina. Your answer should use a single XPath expression. Please store the results in a list `secondPops` of integers."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "fe82e0817ecd91fa3702b21c75bf3491", "grade": false, "grade_id": "cell-0f4c2d6f97e62299", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# YOUR CODE HERE\n", "raise NotImplementedError()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "9e4122e297e2ab21f3eec5f2d9eac45b", "grade": true, "grade_id": "cell-c8e11dca9ab80a40", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "assert len(secondPops) == 6\n", "assert secondPops[0] == 1111811" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q9:** With reference to the `topnames` dataset, please find all years in which some count (for either gender) was strictly larger than 50,000. Please navigate to the appropriate attribute, rather than returning a list of elements. Store the resulting list of years in the variable `nodeset`."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "5f9250440cf720eb223d68d28673cf21", "grade": false, "grade_id": "cell-3fc5a4015ec5e485", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# YOUR CODE HERE\n", "raise NotImplementedError()\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "9e838bb762fec60b40deea0810d3a7a4", "grade": true, "grade_id": "cell-49d4e8d051b06d3d", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "assert nodeset[0] == '1915'\n", "assert len(nodeset) == 78" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Q10:** With reference to the `topnames` dataset, please find all years in which the top female name had a count that was strictly larger than 50,000. Please navigate to the appropriate attribute, rather than returning a list of elements. Store the resulting list of years in the variable `nodeset`."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "nbgrader": { "cell_type": "code", "checksum": "50f3cda5c90e394021d1fa1280fa4d42", "grade": false, "grade_id": "cell-7b7db8ff25fd5342", "locked": false, "schema_version": 3, "solution": true, "task": false } }, "outputs": [], "source": [ "# YOUR CODE HERE\n", "raise NotImplementedError()\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "deletable": false, "editable": false, "nbgrader": { "cell_type": "code", "checksum": "18be402018526ec031ac28b270a20d41", "grade": true, "grade_id": "cell-462efe3b0c32724c", "locked": true, "points": 2, "schema_version": 3, "solution": false, "task": false } }, "outputs": [], "source": [ "assert nodeset[0] == '1915'\n", "assert len(nodeset) == 68" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" } }, "nbformat": 4, "nbformat_minor": 4 }