Datasets
In Class
- European city temperatures
- Download links: Cities.csv, Countries.csv
- Join of two tables: CitiesExt.csv
- Description: Average annual temperature (in degrees Celsius) and latitude / longitude information for 213 European cities (Cities.csv), along with additional information on coastal status, EU membership, and population (in millions) of each country in Europe (Countries.csv).
- Note: Longitude is negative for locations west of the Prime Meridian.
- Shop
- Download link: Shop.csv
- Description: Data on some shopping items.
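The join of Cities.csv and Countries.csv that produces CitiesExt.csv can be sketched roughly as follows, using only the Python standard library. This is a minimal illustration, not the way CitiesExt.csv was actually built. The column names (city, country, latitude, longitude, temperature, population, EU, coastline) are assumptions, and the rows are made-up placeholders standing in for the real files.

```python
import csv
import io

# Placeholder stand-ins for Cities.csv and Countries.csv.
# Column names and all values are hypothetical; with the real files,
# open them with open("Cities.csv", newline="") instead of io.StringIO.
CITIES = """city,country,latitude,longitude,temperature
CityA,CountryX,50.0,-3.0,10.0
CityB,CountryX,52.0,1.0,9.0
CityC,CountryY,40.0,20.0,15.0
"""
COUNTRIES = """country,population,EU,coastline
CountryX,10.0,yes,yes
CountryY,5.0,no,no
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def join_on_country(cities, countries):
    """Attach each city's country attributes (a left join on 'country')."""
    by_country = {row["country"]: row for row in countries}
    joined = []
    for city in cities:
        # Cities whose country is missing from the lookup keep only city fields.
        country = by_country.get(city["country"], {})
        joined.append({**city, **country})
    return joined


cities_ext = join_on_country(read_rows(CITIES), read_rows(COUNTRIES))

# Negative longitude means the city lies west of the Prime Meridian.
west = [row["city"] for row in cities_ext if float(row["longitude"]) < 0]
print(west)
```

Every value read by csv.DictReader is a string, so numeric columns such as longitude need an explicit float() conversion before comparison.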
For Assignments
- Titanic
- Download link: Titanic.csv
- Description: Data on passengers of the RMS Titanic. Entries include the name, age, class, fare, gender, and whether or not the passenger survived.
- Note: Blank ages are unknown. Fares can have more than two decimal places because the currency was not decimal (base-10) at that time; see Titanic Fare Data.
- World Cup
- Download links: Players.csv, Teams.csv
- Description: 2010 World Cup data including last name, team, position, minutes played, and game statistics for each player (Players.csv), as well as world ranking, games played in the tournament, and game statistics for each team (Teams.csv). The dataset includes all games except the final; it was published as part of a contest for data-driven predictions of the ultimate champion.
- Note: Statistics, including yellowCards and RedCards, are for the entire tournament (excluding the final game). Team ranking is the world ranking going into the tournament, so rankings may fall outside 1-32 even though there are only 32 teams. In the joined dataset (PlayersExt.csv), the country data is repeated on the row of every player from that country.
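Because the joined World Cup table repeats team-level values on every player row, averaging a team column directly over the joined rows weights each team by its number of players. The sketch below shows one way to avoid that by collapsing to one value per team first. The column names (surname, team, minutes, ranking) and all rows are hypothetical placeholders, not the contents of PlayersExt.csv.

```python
import csv
import io

# Placeholder stand-in for PlayersExt.csv; columns and values are hypothetical.
PLAYERS_EXT = """surname,team,minutes,ranking
PlayerA,TeamX,90,5
PlayerB,TeamX,45,5
PlayerC,TeamY,90,12
"""

rows = list(csv.DictReader(io.StringIO(PLAYERS_EXT)))

# Naive average over all joined rows: TeamX's ranking is counted twice
# because it appears once per player.
naive_avg = sum(int(row["ranking"]) for row in rows) / len(rows)

# Collapse to one ranking per team before averaging.
ranking_by_team = {row["team"]: int(row["ranking"]) for row in rows}
team_avg = sum(ranking_by_team.values()) / len(ranking_by_team)

print(naive_avg)
print(team_avg)
```

The same concern applies to any team-level column in the joined table; player-level columns such as minutes played can be aggregated over the rows directly.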
Coronavirus Datasets
- Johns Hopkins dataset
- Kaggle datasets (includes country datasets and links to useful sites like WHO and CDC)
- EU CDC - publishes downloadable data daily
- Downloadable versions of ALL data from EU CDC (collected when it is published daily on the EU CDC website)
- State and county level data for coronavirus in the U.S. (compiled by The New York Times)
- Tweets from COVID-19 Twitter Stream (compiled by Georgia State University Panacea Lab)
- EU Open Data Portal – latest publicly available data on COVID-19 in Europe and selected other countries
Other sources of datasets
- Statistical Abstracts of the US (census)
- World Bank Data
- FiveThirtyEight
- US Government's Open Data (data.gov)
- Pew Research Center
- The Data and Story Library
- Dr. John Rasp's Statistics Website
- Awesome Public Datasets
- NY Times Data Training Program (Check out the Data Sets folder, and read this article for more context)