Author: Sylvester Wanina
Cybershujaa ID: CS-DA01-25087
Date: 2 February 2026
This script scrapes hockey team statistics from ScrapeThisSite, parses the HTML table using BeautifulSoup, loads the data into a Pandas DataFrame, and exports it to a CSV file.
Make sure you have Python 3 installed, then install the required libraries:
pip install requests beautifulsoup4 pandas

- Fetch the page – Sends an HTTP GET request to the target URL.
- Parse HTML – Uses BeautifulSoup to parse the page's HTML content.
- Extract table data – Locates the HTML table, reads the column headers, and iterates over each row to collect the data.
- Load into DataFrame – Stores the extracted data in a Pandas DataFrame for easy manipulation and viewing.
- Export to CSV – Saves the final DataFrame to hockey_teams_data.csv in the working directory.
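
The steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of web_scraping.py: the URL is an assumption (this README does not state which ScrapeThisSite page is used), the function names are hypothetical, and the sketch assumes the data lives in the first table on the page.

```python
import pandas as pd
import requests
from bs4 import BeautifulSoup

# Assumed URL: the hockey teams page on ScrapeThisSite (not stated in this README).
URL = "https://www.scrapethissite.com/pages/forms/"


def fetch_html(url: str) -> str:
    """Send an HTTP GET request and return the response body as text."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()  # raise an exception on 4xx/5xx responses
    return response.text


def parse_table(html: str) -> pd.DataFrame:
    """Parse the first <table> in the HTML into a DataFrame."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:
        raise ValueError("no <table> element found in the page")

    # Column headers come from the <th> cells.
    headers = [th.get_text(strip=True) for th in table.find_all("th")]

    # Keep only rows that contain <td> cells; the header row has none.
    rows = []
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:
            rows.append(cells)

    return pd.DataFrame(rows, columns=headers)


def main() -> None:
    """Fetch, parse, preview, and export the table."""
    df = parse_table(fetch_html(URL))
    print(df.columns.tolist())
    print(df.head())
    df.to_csv("hockey_teams_data.csv", index=False)
```

Nothing here runs on import; calling `main()` performs the network request and writes the CSV. Passing `index=False` keeps the DataFrame's row index out of the exported file.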
Run the script directly with Python:
python web_scraping.py

On success, you will see the column names and the first few rows printed to the console, and a file named hockey_teams_data.csv will be created in the same directory.
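
One optional way to confirm the export is to read the file back and count its rows. The sketch below uses only the standard library; `summarize_csv` is a hypothetical helper, not part of the script, and it assumes the CSV has a single header row and UTF-8 encoding.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (column_names, data_row_count) for a CSV file with a header row."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])           # first line holds the column names
        row_count = sum(1 for _ in reader)  # remaining lines are data rows
    return header, row_count


csv_path = Path("hockey_teams_data.csv")
if csv_path.exists():
    columns, n_rows = summarize_csv(csv_path)
    print(f"{len(columns)} columns, {n_rows} data rows")
    print(columns)
else:
    print(f"{csv_path} not found; run web_scraping.py first")
```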
| File | Description |
|---|---|
| hockey_teams_data.csv | All scraped hockey team records in CSV format |
.
├── web_scraping.py # Main scraping script
├── hockey_teams_data.csv # Output file (generated on run)
└── README.md # Project documentation
- The target site is a sandbox designed for practicing web scraping, so no special permissions are required.
- The script skips the header row when extracting table rows to avoid including column names as data.
- Empty rows are filtered out automatically.
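
To illustrate the last two notes, here is a small sketch of header skipping and empty-row filtering on inline markup. The sample HTML and the `extract_rows` name are made up for illustration; the actual script may implement these checks differently.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, not copied from the target site.
SAMPLE_HTML = """
<table>
  <tr><th>Team Name</th><th>Year</th><th>Wins</th></tr>
  <tr><td>Team A</td><td>1990</td><td>44</td></tr>
  <tr><td></td><td></td><td></td></tr>
  <tr><td>Team B</td><td>1990</td><td>31</td></tr>
</table>
"""


def extract_rows(html):
    """Return data rows as lists of strings, without the header or blank rows."""
    soup = BeautifulSoup(html, "html.parser")
    all_rows = soup.find("table").find_all("tr")

    data = []
    for tr in all_rows[1:]:  # index 0 is the header row, so start at 1
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if any(cells):       # false for rows with no cells or only blank cells
            data.append(cells)
    return data


rows = extract_rows(SAMPLE_HTML)
print(rows)  # [['Team A', '1990', '44'], ['Team B', '1990', '31']]
```

The `any(cells)` check drops a row only when every cell is blank, so a row with some missing values is still kept.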