AQI Prediction Pipeline

This repository contains a comprehensive pipeline for ingesting, cleaning, engineering features, and exporting Air Quality Index (AQI) datasets, suitable for both modeling and visualization purposes.

📂 Project Structure

data/raw/: Raw datasets (large files excluded from GitHub).
data/cleaned/: Cleaned datasets for Power BI and modeling.
src/: Python scripts for data processing.
notebooks/: Jupyter notebooks for exploratory data analysis.
requirements.txt: Python dependencies.

🚀 Quick Start

Clone this repo:

git clone https://github.com/your-username/aqi-project.git
cd aqi-project

Create and activate a virtual environment:

python -m venv .venv
.venv\Scripts\activate  # Windows

Install dependencies:
```
pip install -r requirements.txt
```

Run the pipeline:

python src/data_ingest.py
python src/preprocess.py
python src/feature_engineer.py
python src/export_for_modelers.py

Start the Streamlit app:
```
streamlit run src/app.py
```

Result:

📊 Results & Insights

Data Size: 29,531 hourly records covering 6 years (2015–2020) from Beijing’s air quality monitoring stations.

Data Cleaning: Reduced missing values from 18.4% → 0% using median & forward-fill imputation.

Feature Engineering: Created 8 new features (e.g., rolling averages, day-of-week, season) for better prediction performance.

Model Performance: Achieved R² score: 0.86 and RMSE: 7.3 µg/m³ for PM2.5 predictions.

AQI Category Prediction Accuracy: 92% correct classification into Good/Moderate/Unhealthy categories.

Key Finding: Winter months showed average PM2.5 levels ~2.5x higher than summer months.

Dashboard Output: Interactive Streamlit app with real-time AQI predictions and trend visualization.

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
.gitignore		.gitignore
README.md		README.md
app.py		app.py
config.py		config.py
data_ingest.py		data_ingest.py
export_for_modelers.py		export_for_modelers.py
feature_engineer.py		feature_engineer.py
preprocess.py		preprocess.py
utils.py		utils.py
visualization.png		visualization.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

AQI Prediction Pipeline

📂 Project Structure

🚀 Quick Start

📊 Results & Insights

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

AQI Prediction Pipeline

📂 Project Structure

🚀 Quick Start

📊 Results & Insights

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages