Final Project sports data

Historic sports data and prediction of game winners

[Ulrike Anklam]
[Data Analytics Berlin, 09-10-2020]

Content

Project Description

This is my 10-day final project for the Ironhack Data Analytics bootcamp. The main goal of the project was to use a machine learning algorithm to predict the outcome of sports events.

Data

Overview dataset:

I used an open-source dataset from Kaggle. The dataset was created by Max Horowitz, who collected the data through the official NFL API and has updated it with each new season's data since 2016.

I used the v5 dataset with play-by-play data from the 2009 season to the 2018 season, which covers all games (more than 2,500) and a total of 316,538 plays.

Workflow

  • downloaded the dataset
  • cleaned and wrangled the data (notebook Data Wrangling and Cleaning)
  • analyzed different features
  • created a games dataset from the play-by-play data, with features for each team
  • trained an ML model on the games dataset
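
The step that turns play-by-play rows into one row per game can be sketched as follows. This is a minimal illustration, not the code from my notebooks: the sample rows are invented, and the column names (`game_id`, `home_team`, `away_team`, `posteam`, `yards_gained`, `total_home_score`, `total_away_score`) are assumptions about the dataset's layout. The function `build_games` is a hypothetical helper.

```python
# Hypothetical play-by-play rows; column names are assumptions, values are invented.
plays = [
    {"game_id": 1, "home_team": "PIT", "away_team": "TEN", "posteam": "PIT",
     "yards_gained": 5, "total_home_score": 0, "total_away_score": 0},
    {"game_id": 1, "home_team": "PIT", "away_team": "TEN", "posteam": "TEN",
     "yards_gained": 12, "total_home_score": 0, "total_away_score": 0},
    {"game_id": 1, "home_team": "PIT", "away_team": "TEN", "posteam": "PIT",
     "yards_gained": 30, "total_home_score": 7, "total_away_score": 0},
    {"game_id": 1, "home_team": "PIT", "away_team": "TEN", "posteam": "TEN",
     "yards_gained": 8, "total_home_score": 7, "total_away_score": 3},
    {"game_id": 2, "home_team": "MIA", "away_team": "ATL", "posteam": "ATL",
     "yards_gained": 40, "total_home_score": 0, "total_away_score": 7},
    {"game_id": 2, "home_team": "MIA", "away_team": "ATL", "posteam": "MIA",
     "yards_gained": 10, "total_home_score": 0, "total_away_score": 7},
]


def build_games(plays):
    """Aggregate play-level rows into one dict per game (hypothetical helper)."""
    games = {}
    for play in plays:
        # Create the game record the first time its game_id appears.
        game = games.setdefault(play["game_id"], {
            "home_team": play["home_team"],
            "away_team": play["away_team"],
            "home_yards": 0,
            "away_yards": 0,
            "home_score": 0,
            "away_score": 0,
        })
        # Credit the yards to whichever team had possession on this play.
        if play["posteam"] == play["home_team"]:
            game["home_yards"] += play["yards_gained"]
        elif play["posteam"] == play["away_team"]:
            game["away_yards"] += play["yards_gained"]
        # Scores are running totals, so the largest value seen is the final score.
        game["home_score"] = max(game["home_score"], play["total_home_score"])
        game["away_score"] = max(game["away_score"], play["total_away_score"])
    # Target label: 1 if the home team won, otherwise 0.
    for game in games.values():
        game["home_win"] = int(game["home_score"] > game["away_score"])
    return games


games = build_games(plays)
for game_id, game in games.items():
    print(game_id, game)
```

Here `home_win` is the label a model would be trained to predict, and the score columns are only used to derive it. The yardage totals stand in for whatever per-team features are chosen.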

Data Storage

.csv files on Google Drive

Conclusion

It is possible to predict whether the home team will win a game with 85% accuracy.
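
For reference, accuracy here means the share of games for which the predicted winner matches the actual one. The sketch below shows that calculation next to a simple baseline that always predicts the more common outcome. The labels are invented for illustration and are not results from this project; `accuracy` and `majority_baseline` are hypothetical helper names.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def majority_baseline(y_true):
    """Accuracy obtained by always predicting the more frequent class."""
    if not y_true:
        raise ValueError("input must be non-empty")
    home_win_rate = sum(y_true) / len(y_true)
    return max(home_win_rate, 1 - home_win_rate)


# Invented labels: 1 = home team won, 0 = home team did not win.
y_true = [1, 1, 0, 1, 0, 1, 0, 1, 1, 0]
y_pred = [1, 1, 0, 1, 1, 1, 0, 0, 1, 0]

model_accuracy = accuracy(y_true, y_pred)
baseline_accuracy = majority_baseline(y_true)
print(f"model: {model_accuracy:.2f}, baseline: {baseline_accuracy:.2f}")
```

Comparing the two numbers shows how much a model adds over simply guessing the more common outcome.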

Links

Repository
Data
Slides
Tableau
Trello