Skip to content

DevanshDaxini/Sports-EV-Bot

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

207 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Sports EV Bot

A multi-sport prop betting analysis engine. Uses XGBoost regression models trained on historical data to project player stats, then compares those projections against PrizePicks lines and FanDuel odds to find positive expected value (+EV) opportunities.

Supports NBA (17 stat models) and Tennis (7 market models).


How It Works

The system runs three independent analysis layers that can be used alone or combined:

1. AI Scanner

Trains XGBoost models on historical player data to predict stat lines. For NBA, it uses nba_api game logs with 100+ engineered features (rolling averages, opponent-allowed stats by position, usage rate, rest days, home/away splits). For Tennis, it uses Jeff Sackmann's open match data with 150+ features (surface, rank, H2H, recent form, fatigue).

Each model outputs a projected stat value. The scanner compares this projection to the PrizePicks line and shows the delta.

2. Odds Scanner

Fetches live odds from FanDuel via the-odds-api and compares them to PrizePicks lines. The analyzer removes the vig from FanDuel two-way markets to calculate the true implied probability, then adjusts for any line difference between platforms using logarithmic scaling. Dynamic per-stat thresholds control how large a line discrepancy is allowed before filtering it out.

Output: a ranked list of plays sorted by implied win percentage.

3. Super Scanner

Combines layers 1 and 2. Finds plays where both the AI projection and the FanDuel-implied probability agree on the same side (Over or Under). A combined confidence score weights the math edge and AI edge equally, adjusted by stat volatility. Only plays where both signals agree are surfaced.


Project Structure

sports_ev_bot/
├── main.py                          # entry point, sport selection menu
├── src/
│   ├── cli/
│   │   ├── nba_cli.py               # NBA menu (super/odds/ai scanner)
│   │   └── tennis_cli.py            # Tennis menu (scanner + setup tools)
│   ├── core/
│   │   ├── analyzers/
│   │   │   └── analyzer.py          # PropsAnalyzer: vig removal, line diff adjustment
│   │   ├── odds_providers/
│   │   │   ├── fanduel.py           # FanDuel client (the-odds-api)
│   │   │   └── prizepicks.py        # PrizePicks client (partner API)
│   │   ├── config.py                # shared cross-sport config
│   │   ├── utils.py                 # shared utilities
│   │   └── visualizer.py            # accuracy plot generation
│   └── sports/
│       ├── nba/
│       │   ├── builder.py           # download game logs via nba_api
│       │   ├── features.py          # engineer 100+ training features
│       │   ├── train.py             # train 17 XGBoost models
│       │   ├── scanner.py           # AI scanner, player scout, game scanning
│       │   ├── injuries.py          # live injury scraper (ESPN + CBS)
│       │   ├── grader.py            # backtest grading
│       │   ├── config.py            # NBA-specific constants
│       │   └── mappings.py          # stat name normalization maps
│       └── tennis/
│           ├── builder.py           # download ATP/WTA data from Sackmann GitHub
│           ├── features.py          # engineer 150+ training features
│           ├── train.py             # train 7 XGBoost models
│           ├── scanner.py           # tennis scanner + scout
│           ├── rankings.py          # live ATP/WTA ranking lookups
│           ├── config.py            # tennis-specific constants
│           └── mappings.py          # stat name normalization maps
├── data/
│   ├── nba/
│   │   ├── raw/                     # raw game logs (gitignored)
│   │   └── processed/               # engineered features (gitignored)
│   └── tennis/
│       ├── raw/                     # ATP/WTA match CSVs
│       ├── processed/               # training dataset (gitignored, 1.2GB)
│       ├── rankings_cache/          # cached ATP/WTA rankings
│       └── projections/             # scan output
├── models/
│   ├── nba/                         # XGBoost .json models (gitignored)
│   └── tennis/                      # XGBoost .json models (gitignored)
└── output/                          # scan results, CSVs (gitignored)

NBA Models

17 XGBoost regression models. Trained on ~3 seasons of game logs.

Target MAE R2 Directional Accuracy
PTS 4.75 0.469 74.4%
FGM 1.83 0.421 73.6%
PA 5.19 0.534 75.5%
PR 5.62 0.477 74.8%
PRA 6.09 0.524 76.2%
FGA 2.79 0.576 79.5%
FG3A 1.56 0.519 75.2%
FTA 1.75 0.359 71.5%
FG3M 0.96 0.281 62.2%
RA 2.67 0.430 73.4%
FTM 1.46 0.351 62.0%
REB 1.95 0.411 71.4%
AST 1.42 0.473 72.8%
STL 0.77 0.077 71.2%
TOV 0.94 0.261 62.0%
SB 0.97 0.121 52.6%
BLK 0.57 0.192 35.0%

MAE = Mean Absolute Error. R2 = coefficient of determination. Directional Accuracy = percentage of games where the model correctly predicted Over vs Under relative to the PrizePicks line.

Key Features Used

  • Rolling averages (season, last 10, last 5 games)
  • Opponent-allowed stats filtered by position (G/F/C)
  • Usage rate and missing usage from injured teammates
  • Schedule density (back-to-back, 4-in-6, days rest)
  • Home/away splits
  • Feature leakage prevention: training excludes the target game's stats from all rolling windows

Tennis Models

7 XGBoost regression models. Trained on ~1M ATP/WTA matches.

Target MAE R2 Directional Accuracy
Total Sets 0.30 0.696 85.0%
Total Games 2.99 0.697 84.0%
Games Won 2.03 0.658 80.7%
Aces 2.45 0.476 76.4%
Break Pts Won 1.11 0.463 76.1%
Double Faults 1.61 0.282 71.0%
Total Tiebreaks 0.43 0.211 67.7%

Key Features Used

  • Surface type (hard, clay, grass, carpet)
  • ATP/WTA ranking with fuzzy name matching
  • Head-to-head record
  • Recent form (last 5/10/20 matches)
  • Slam vs non-slam (best-of-5 vs best-of-3)
  • Fatigue: days since last match, matches in last 7/14/30 days

Line Difference Handling

When PrizePicks and FanDuel have different lines for the same player/stat, the system adjusts the implied probability using logarithmic scaling:

adjustment = factor * log(1 + |line_diff|) / log(2)

Each stat has its own adjustment factor (e.g. 3.5% per point for PTS, 6.0% per steal for STL) and its own maximum allowed line difference (e.g. 4.0 for PTS, 2.0 for BLK). This prevents unrealistic win percentages on large discrepancies while still capturing real edges.

If PrizePicks has a lower line than FanDuel, only the Over side is shown. If higher, only the Under. If lines match, both sides are shown.


Setup

Requirements

Installation

git clone https://github.com/DevanshDaxini/nba_ev_bot.git
cd nba_ev_bot
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Create a .env file:

ODDS_API_KEY=your_key_here

First-time Setup (NBA)

Run these in order from the main menu:

  1. Build Data -- downloads ~3 seasons of game logs via nba_api
  2. Engineer Features -- computes 100+ features per player-game
  3. Train Models -- trains 17 XGBoost models (~5 min)

First-time Setup (Tennis)

  1. Build Data -- downloads ATP/WTA match history from Sackmann GitHub
  2. Engineer Features -- computes 150+ features per match (~5 min)
  3. Train Models -- trains 7 XGBoost models (~3 min)

Usage

python main.py

Select a sport, then choose a tool:

  • Super Scanner -- finds plays where math odds and AI projection agree
  • Odds Scanner -- pure FanDuel vs PrizePicks line comparison
  • AI Scanner -- standalone AI predictions with player scouting

The player scout shows per-stat projections, PrizePicks lines (including goblin/demon alt lines marked with (G)/(D)), FanDuel-implied win percentages, and Over/Under recommendations.


API Usage

  • PrizePicks: partner API, free, no key needed. Cached for 10 minutes.
  • FanDuel (via the-odds-api): requires API key, costs credits per call. Cached for 10 minutes. The Odds Scanner auto-detects the active game slate date to minimize unnecessary calls.
  • nba_api: free, no key needed. Used for game schedules and historical data. Features a custom network resilience layer with 30-second timeouts and automatic 5-attempt rapid retries to handle frequent stats.nba.com connection blackholing.
  • ESPN/CBS Sports: scraped for live injury reports. No key needed.

Disclaimer

This software is for educational and research purposes only. Sports betting involves significant financial risk. No guarantee of profit. Not responsible for financial losses.

About

Multi-sport prop betting engine. XGBoost models + FanDuel odds + PrizePicks lines. NBA (17 models) and Tennis (7 models).

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages