This repository contains scripts and tests for loading and processing data.
.github/workflows: Contains the GitHub Actions workflow files. These are used to automate tasks such as running tests and updating data.scripts: Contains Python scripts for loading and processing data.test: Contains Python unit tests for the scripts in thescriptsdirectory.requirements.txt: Lists the Python dependencies required for the scripts in this repository.README.md: This file.
The scripts directory contains the following scripts:
load_data.py: This script loads data from a specified URL and saves it to a local file.derived_data.py: This script processes the raw data loaded byload_data.py.
The test directory contains unit tests for the scripts in the scripts directory. These tests can be run using the Python unittest module.
The .github/workflows directory contains a GitHub Actions workflow that automates the process of updating data. This workflow runs on a schedule and can also be triggered manually.
To run the scripts locally, you will need to have Python installed on your machine. You can install the required dependencies with:
pip install -r requirements.txtYou can then run the scripts with:
python scripts/load_data.py
python scripts/derived_data.pyYou can run the tests with:
python -m unittest test/test_raw_data.py
python -m unittest test/test_derived_data.py