Tabular Data
| File | Description | Chapter(s) | 
|---|---|---|
| topnames.csv | Top baby names by year and by sex, based on applications to the US Social Security Administration. Columns are year, sex, name, and count. Years range from 1880 to 2018. | 6, 7, 8, 9 | 
| topfemale.csv | Top female baby names, mapping year to name and count for years 1880 through 2018. | 6 | 
| topmale.csv | Top male baby names, mapping year to name and count for years 1880 through 2018. | 6 | 
| namesbyyear.csv | Top baby names from 2014 through 2018 with columns of years and rows of female/male sex. | 6, 9 | 
| countsbyyear.csv | Application counts for the most popular baby names from 2014 through 2018 with columns of years and rows of female/male. | 6 | 
| namesbyyear2.csv | Cells containing top baby names along with application counts from 2014 through 2018 with columns of years and rows of female/male sex. | 6 | 
| gendercount.csv | Top baby names and counts with a row per year, and using dependent data columns of FemaleName, FemaleCount, MaleName, and MaleCount. | 6 | 
| indicators2016.csv | Economic indicator data from 2016 for five countries, one per row, where country ISO code determines country name, pop, gdp, life, cell. | 7, 8 | 
| indicators.csv | Economic indicator data for 207 countries for years 1960 through 2017. Variables of (country) code and year uniquely define a row, and determine | 7, 8, 9 | 
| countries.csv | Country information for 207 countries, uniquely determined by country code (ccode), with a row per country. Columns include ccode, country (name), region (of the world the country is part of), and income (category from low income to high income). | 8, 9 | 
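
Several descriptions above say that certain columns uniquely determine a row. A minimal sketch of how that property could be checked is shown here, using Python's standard `csv` module. The sample rows only mimic the column layout described for topnames.csv (year, sex, name, count); the names and counts are placeholders rather than real data, and `read_rows` and `is_unique_key` are hypothetical helper names.

```python
import csv
import io

# Made-up rows in the column layout described for topnames.csv
# (year, sex, name, count). Names and counts are placeholders, not real data.
SAMPLE = """year,sex,name,count
1880,Female,NameA,100
1880,Male,NameB,200
1881,Female,NameA,110
1881,Male,NameB,210
"""

def read_rows(text_stream):
    """Read CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(text_stream))

def is_unique_key(rows, key_columns):
    """Return True if no two rows share the same values in key_columns."""
    seen = set()
    for row in rows:
        key = tuple(row[col] for col in key_columns)
        if key in seen:
            return False
        seen.add(key)
    return True

rows = read_rows(io.StringIO(SAMPLE))
year_sex_is_key = is_unique_key(rows, ["year", "sex"])
year_alone_is_key = is_unique_key(rows, ["year"])
print(year_sex_is_key, year_alone_is_key)
```

To run the same check on a downloaded file, the `io.StringIO(SAMPLE)` argument could be replaced by a file object opened with `open(path, newline="")`.
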
Relational Databases
| Database | Description | Files |
|---|---|---|
| book | Set of tables supporting book examples as described initially in Chapter 11. | |
| school | Database of tables about courses, students, instructors, and departments as covered in Chapter 12 and beyond. | |
| nycflights13 | Database of flights, planes, and airlines in and out of the New York City airports in 2013. | |
| enron | Subset database of emails sent to and from Enron employees recovered during the investigation following fraud by the company. (Google Drive Link due to size.) | |
Hierarchical Data
| Data Set | Format Variants | Description | Chapter(s) | 
|---|---|---|---|
| ind0 | ind0.xml ind0Dict.json ind0List.json ind0_html.xml | Economic indicator data (pop and gdp) from three countries for the years 2007 and 2017. | 15, 16, 17 | 
| indicators | indicators.json indicators.xml | Economic indicator data (pop, gdp, life, cell, imports, exports) for 207 countries for years from 1960 to 2018. | 15, 17 | 
| topnames | topnames.xml topnames_html.xml | Most popular baby names based on applications to the US Social Security Administration for years from 1880 to 2018, recording the top female name and top male name, and the application counts for each. | 15, 17 | 
| school0 | school0.xml school0.json | Subset and hierarchical version of the school data set, based on two departments (ART and MATH) and the associated courses and classes and instructors. | 15, 17 | 
| school | school.xml | Hierarchical version of the school data set, incorporating all departments and the associated courses and classes and instructors. | 15, 16, 17 | 
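
The hierarchical data sets come in several format variants, some nested by key and some organized as lists. As a rough illustration of how one shape relates to the other, the sketch below flattens a nested dictionary into a list of records. The layout (country code, then year, then indicator values) is an assumption made for this example and is not taken from the actual files; the codes and numbers are placeholders, and `flatten` is a hypothetical helper.

```python
import json

# Assumed nested layout: country code -> year -> indicator values.
# Codes and numbers are placeholders, not values from the data set.
nested_text = """
{
  "AAA": {"2007": {"pop": 1.0, "gdp": 10.0}, "2017": {"pop": 1.5, "gdp": 15.0}},
  "BBB": {"2007": {"pop": 2.0, "gdp": 20.0}}
}
"""

def flatten(nested):
    """Turn code -> year -> indicators into a list of flat records."""
    records = []
    for code, by_year in nested.items():
        for year, values in by_year.items():
            record = {"code": code, "year": int(year)}
            record.update(values)
            records.append(record)
    return records

records = flatten(json.loads(nested_text))
for record in records:
    print(record)
```

Each resulting record carries its own code and year, which is the row-per-observation shape used by the tabular files.
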
Other Files
| File | Description | Chapter(s) | 
|---|---|---|
| hello.txt | Text file with single line of characters | 2 | 
| twolines.txt | Text file with two lines of characters | 2 | 
| tennyson.txt | Text file with multiple lines of characters | 2 | 
| twolines.utf16.txt | Text file with two lines of characters encoded in UTF-16 | 2 | 
| baby_2010_female_name.txt | Text file with popular female names from the US Social Security Administration from 2010, one per line | 2 | 
| baby_2010_male_name.txt | Text file with popular male names from the US Social Security Administration from 2010, one per line | 2 | 
| baby_2010_female_namecount.txt | Text file with popular female names and application counts from the US Social Security Administration from 2010, one per line | 2 | 
| names.json | Text file encoded in JSON format with a list of top female names from 2010 from the US Social Security Administration | 2 | 
| config.json | Text file encoded in JSON format with dictionary of configuration key names mapped to configuration values | 2 |
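
The files in this table differ mainly in encoding and in JSON shape (a list versus a dictionary). The sketch below shows how those differences surface when reading such content with Python's standard library. It works on in-memory strings rather than the actual files, and the names, keys, and values are placeholders chosen for illustration.

```python
import json

# Hypothetical contents in the shapes described above: a JSON list of names
# and a JSON dictionary of configuration keys. The values are placeholders.
names_text = '["NameA", "NameB", "NameC"]'
config_text = '{"host": "localhost", "port": 8080}'

names = json.loads(names_text)    # JSON array  -> Python list
config = json.loads(config_text)  # JSON object -> Python dict

# Two lines of text encoded as UTF-16, as twolines.utf16.txt is described.
raw_bytes = "first line\nsecond line\n".encode("utf-16")
lines = raw_bytes.decode("utf-16").splitlines()

print(type(names).__name__, type(config).__name__, lines)
```

When reading from disk instead, passing `encoding="utf-16"` to `open` performs the same decoding step.
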