28 YEARS OF UFC HISTORY

EXTRACTION, PREPARATION, AND ANALYSIS OF THE ULTIMATE FIGHTING CHAMPIONSHIP HISTORICAL DATA

INTRO

The main goal of this project is to perform a simple statistical analysis on UFC fighters. I will answer the following questions:

1. Which fighter has the most knockouts?

2. What is the standard deviation of fighters' heights in each weight division?

3. What does the height distribution look like in each weight division?

A detailed explanation of the analysis can be found in this project's Jupyter notebook.
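As a rough illustration of the three questions above, here is a minimal sketch using only the Python standard library. The record layout (`name`, `division`, `height_cm`, `ko_wins`) and the sample values are assumptions made for this example; they are not the project's actual schema or data.

```python
from collections import Counter, defaultdict
from statistics import stdev

# Hypothetical records; field names and values are illustrative only.
fighters = [
    {"name": "A", "division": "Lightweight", "height_cm": 175.0, "ko_wins": 3},
    {"name": "B", "division": "Lightweight", "height_cm": 180.0, "ko_wins": 7},
    {"name": "C", "division": "Heavyweight", "height_cm": 190.0, "ko_wins": 5},
    {"name": "D", "division": "Heavyweight", "height_cm": 196.0, "ko_wins": 2},
]


def most_knockouts(rows):
    """Question 1: name of the fighter with the highest knockout count."""
    return max(rows, key=lambda r: r["ko_wins"])["name"]


def heights_by_division(rows):
    """Group heights by weight division."""
    grouped = defaultdict(list)
    for r in rows:
        grouped[r["division"]].append(r["height_cm"])
    return grouped


def height_std_by_division(rows):
    """Question 2: sample standard deviation of height per division."""
    # stdev needs at least two values, so smaller groups are skipped.
    return {d: stdev(h) for d, h in heights_by_division(rows).items() if len(h) >= 2}


def height_histogram(rows, division, bin_width=5):
    """Question 3: count fighters per height bin (bin labelled by its lower edge)."""
    heights = heights_by_division(rows)[division]
    return Counter(int(h // bin_width) * bin_width for h in heights)
```

The histogram here is only a count per bin; in a notebook the same grouping would normally feed a plotting library.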

DATA EXTRACTION

I built a Python web-scraping script that downloads public data from www.ufcstats.com. The raw dataset contains the historical roster of UFC fighters from 1993 to the present.
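The parsing half of such a scraper might look like the sketch below. It is not the project's script: the table layout in `SAMPLE_HTML` is an assumption rather than the real markup of www.ufcstats.com, `TableRowParser` and `parse_rows` are hypothetical names, and the download step is left out so the example runs offline.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows contain only <th>, so they stay empty
                self.rows.append(self._row)
            self._row = None


def parse_rows(html):
    """Return the table body as a list of rows, each a list of cell strings."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Assumed page fragment, for illustration only.
SAMPLE_HTML = """
<table>
  <tr><th>First</th><th>Last</th><th>Ht.</th></tr>
  <tr><td>Jane</td><td><a href="#">Doe</a></td><td>5' 6"</td></tr>
  <tr><td>John</td><td>Roe</td><td>6' 1"</td></tr>
</table>
"""

rows = parse_rows(SAMPLE_HTML)
```

In a real run, the HTML string would come from an HTTP request to each roster page instead of an inline sample.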

DATA PREPARATION

The raw data does not include a gender attribute, so I built a classifier that predicts a fighter's gender from their name.

BUILDING THE GENDER CLASSIFIER

I built a brute-force search algorithm in Python that predicts a fighter's gender from their name. The algorithm uses historical names from the U.S. national database www.datagov.org and determines gender from the relative proportion of males and females recorded for each name. The classifier attained 96% precision and 70% recall.
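The proportion rule can be sketched as follows. This is an illustration, not the project's code: it uses a dictionary lookup instead of the brute-force search described above, and `NAME_COUNTS`, `predict_gender`, and the `threshold` parameter are hypothetical, with made-up counts.

```python
# Hypothetical counts per first name: name -> (male_count, female_count).
NAME_COUNTS = {
    "james": (5000, 20),
    "amanda": (10, 4000),
    "jordan": (600, 400),
}


def predict_gender(full_name, counts=NAME_COUNTS, threshold=0.5):
    """Return "M", "F", or None when the name is unknown or too ambiguous.

    The prediction follows the larger share of historical records for the
    first name; `threshold` is the minimum share required to commit.
    """
    parts = full_name.strip().split()
    if not parts:
        return None
    first = parts[0].lower()
    if first not in counts:
        return None
    male, female = counts[first]
    total = male + female
    if total == 0:
        return None
    male_share = male / total
    if male_share > threshold:
        return "M"
    if male_share < 1 - threshold:
        return "F"
    return None  # share falls between the two cut-offs: no prediction
```

Raising `threshold` trades recall for precision: more ambiguous names are left unclassified, but the predictions that remain are more reliable.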

The women's featherweight division is the only set of fighters whose gender is already known, so I used it as the labeled dataset to evaluate the classifier. On the names in that division, the classifier reached 96% precision and 70% recall. The classifier could be improved by training a machine learning model on fighter attributes such as name, weight, height, and reach. For the purposes of this project, however, 96% precision and 70% recall are good enough to move forward.
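For reference, precision and recall for one class can be computed as in the sketch below. The labels are invented for the example and do not reproduce the 96% and 70% figures; `precision_recall` is a hypothetical helper, and a prediction of `None` (no guess) is counted as a miss.

```python
def precision_recall(y_true, y_pred, positive="F"):
    """Precision and recall for the `positive` label.

    precision = TP / (TP + FP): how many predicted positives are correct.
    recall    = TP / (TP + FN): how many actual positives were found.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up labels: four women and two men, with one unclassified name (None).
y_true = ["F", "F", "F", "F", "M", "M"]
y_pred = ["F", "F", None, "M", "F", "M"]
precision, recall = precision_recall(y_true, y_pred)
```

Here two of the three "F" predictions are correct and two of the four women are found, so precision is 2/3 and recall is 1/2.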

bennjordan/UFC-Data-Analysis
