Investigating various data sources and advanced statistical methods
Let's look at radiology imaging data
- valid.csv - This data file comes from the CheXpert competition. It serves as metadata for a validation set of chest radiographs.
- train.csv - This data file comes from the CheXpert competition. It serves as metadata for a training set of chest radiographs.
These files were processed through extractingPatientId.py to create columns for the patient identifier (PID) and the study identifier (StudyID). This makes it easier to compute descriptive statistics on the data.
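The kind of extraction described above could look like the following minimal sketch. It is not the actual contents of extractingPatientId.py. It assumes that each metadata row carries an image path shaped like `.../patient00001/study1/view1_frontal.jpg`; the helper name `extract_ids` and the choice to return integers are hypothetical.

```python
import re

# Assumed path layout: .../patient<digits>/study<digits>/<image file>
PATH_PATTERN = re.compile(r"patient(\d+)/study(\d+)/")


def extract_ids(path):
    """Return (PID, StudyID) parsed from a CheXpert-style image path.

    Hypothetical helper: returns (None, None) when the path does not
    follow the assumed layout.
    """
    match = PATH_PATTERN.search(path)
    if match is None:
        return None, None
    # Leading zeros are dropped by int(); keep the strings instead if
    # the zero-padded form is needed.
    return int(match.group(1)), int(match.group(2))
```

Under these assumptions, `extract_ids("CheXpert-v1.0-small/train/patient00001/study1/view1_frontal.jpg")` returns `(1, 1)`, and the two values can be written to the PID and StudyID columns.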
Let's compare the top teams in the league
- BucksRosterFY20.csv - data from BasketballReference on the Milwaukee Bucks 2019-2020 roster. Downloaded 3 March 2020.
- LakersRosterFY20.csv - data from BasketballReference on the Los Angeles Lakers 2019-2020 roster. Downloaded 3 March 2020.
Let's work on acquiring Ultimate Frisbee game data
The data files in this directory were manually collected on 8 April 2020.
- _tournaments.xlsx - manually collected data, combined into one file, for a set of ultimate frisbee tournaments
- tournament_results.csv - game results for a collection of ultimate frisbee tournaments
- tournament_teams.csv - teams and seedings for a collection of ultimate frisbee tournaments
These are the output locations for data scraped from the USAU tournament websites with USAU_HistData.ipynb. For example, play from the Florida Warmup tournament is stored in the following files:
- bracketplay.csv - data scraped from brackets on USAU score reporter
- poolplay.csv - data scraped from HTML data tables on USAU score reporter
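As a rough illustration of how an HTML data table can be turned into a file like poolplay.csv, here is a minimal sketch using only the Python standard library. It is not the code in USAU_HistData.ipynb, and the real score reporter markup is not shown here. Fetching the page is left out, and the names `TableParser` and `html_table_to_csv` are hypothetical.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()
```

The returned text can then be written to a file such as poolplay.csv. Bracket pages are usually not laid out as plain tables, so bracketplay.csv would likely need a different parsing approach.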