This project provides a command-line interface (CLI) to manage match data, allowing you to populate data from a JSON file, scrape data from external sources, and run both tasks in sequence.
To keep dependencies isolated, create and activate a virtual environment:
python3 -m venv venv
source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
Install the required dependencies:
pip install -r requirements.txt
Run the CLI with the following command:
python cli.py
The CLI provides the following commands:
- `--populate` or `-p`: Populate match data from a JSON file.
- `--scrape` or `-s`: Scrape match data from external sources.
- `--run` or `-r`: Run both tasks in sequence.
- `--fetch-recent` or `-fr`: Fetch recent matches for the selected region.
And the following options:
- `--help`: Display help information for the command.
- `--fetch-all` or `-fa`: Fetch all matches for the selected region.
- `--all-leagues` or `-al`: Fetch all leagues for the selected region.
- `--playwright` or `-pw`: Use Playwright to scrape data. This option is available for the `scrape`, `run`, and `fetch-recent` commands. Remember to install the Playwright dependencies before using this option:
playwright install
To run a command, use the following syntax:
python cli.py <command>
If you don't provide an option, the CLI prompts you to select a region and league.
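To make the command and option layout above concrete, here is a minimal sketch of how such flags could be declared with Python's standard `argparse` module. It is not the project's actual `cli.py`: the `build_parser` function, the help strings, and the choice to treat the four commands as mutually exclusive are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="cli.py", description="Manage match data.")

    # Assumption: only one command may be given per invocation.
    commands = parser.add_mutually_exclusive_group()
    commands.add_argument("-p", "--populate", action="store_true",
                          help="Populate match data from a JSON file.")
    commands.add_argument("-s", "--scrape", action="store_true",
                          help="Scrape match data from external sources.")
    commands.add_argument("-r", "--run", action="store_true",
                          help="Run both tasks in sequence.")
    commands.add_argument("-fr", "--fetch-recent", action="store_true",
                          help="Fetch recent matches for the selected region.")

    # Options that modify the selected command.
    parser.add_argument("-fa", "--fetch-all", action="store_true",
                        help="Fetch all matches for the selected region.")
    parser.add_argument("-al", "--all-leagues", action="store_true",
                        help="Fetch all leagues for the selected region.")
    parser.add_argument("-pw", "--playwright", action="store_true",
                        help="Use Playwright to scrape data.")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is deterministic.
args = build_parser().parse_args(["--scrape", "--playwright"])
print(args.scrape, args.playwright)
```

With this layout, `python cli.py --scrape --playwright` and `python cli.py -s -pw` would be equivalent, and `--help` is provided by `argparse` automatically.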
The scrape command scrapes match data from external sources and writes it to the matches folder in this structure:
matches
├── England-Premier-League-2024-2025
│ ├── November
│ │ ├── matches.json
│ │ ├── raw_html_<match_id>.html
│ │ ├── match_centre_event_type.json
│ │ ├── match_centre_data_<match_id>.json
│ │ ├── formation_id_name_mapppings.json
│ │ └── ...
│ ├── December
│ │ └── ...
│ └── ...
├── England-League-One-2024-2025
│ └── ...
├── tournament_url_mapping.json
├── all_regions.json
└── ...
The `all_regions`, `formation_id_name_mapppings`, and `match_centre_event_type` files do not follow a league-specific structure, because their content is the same for all leagues and regions.
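As a rough illustration of working with this layout, the sketch below walks a `matches` folder and locates every `matches.json` file by league and month. The helper names `iter_match_files` and `load_matches` are hypothetical and not part of the project, and the inner schema of `matches.json` is not assumed here.

```python
import json
from pathlib import Path
from typing import Any, Iterator, Tuple


def iter_match_files(root: Path) -> Iterator[Tuple[str, str, Path]]:
    """Yield (league_folder, month_folder, path) for each <league>/<month>/matches.json."""
    for path in sorted(root.glob("*/*/matches.json")):
        yield path.parent.parent.name, path.parent.name, path


def load_matches(path: Path) -> Any:
    """Parse one matches.json file and return whatever JSON value it contains."""
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


# Top-level files such as all_regions.json are skipped by the two-level glob pattern.
for league, month, path in iter_match_files(Path("matches")):
    print(league, month, path)
```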
The populate command populates match data from a JSON file and writes it to the matches.db file.
To use a different database, set the DATABASE_URI environment variable.
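The fallback behaviour described above could look roughly like the following sketch. The `resolve_database_uri` helper is hypothetical, and the `sqlite:///matches.db` default is an assumed URI form; the format the project actually expects for `DATABASE_URI` may differ.

```python
import os
from typing import Mapping, Optional

# Assumption: the default points at a local SQLite file named matches.db.
DEFAULT_DATABASE_URI = "sqlite:///matches.db"


def resolve_database_uri(env: Optional[Mapping[str, str]] = None) -> str:
    """Return DATABASE_URI from the environment, or the default when unset or empty."""
    env = os.environ if env is None else env
    return env.get("DATABASE_URI") or DEFAULT_DATABASE_URI


print(resolve_database_uri())
```

On Linux or macOS, the variable can be set for a single run by prefixing the command, for example `DATABASE_URI=<your-uri> python cli.py --populate`.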
