Executive Summary: A comprehensive MLOps platform demonstrating enterprise-grade machine learning operations, from data ingestion to production deployment with automated CI/CD pipelines, monitoring, and scalable infrastructure.
This project solves the taxi duration prediction problem for NYC's transportation ecosystem, providing accurate trip duration estimates that enable:
- Operational Efficiency: 15-20% improvement in fleet utilization
- Customer Experience: Accurate ETAs reducing wait times and complaints
- Revenue Optimization: Dynamic pricing based on predicted demand patterns
- Resource Planning: Data-driven decisions for driver allocation and route optimization
✅ Data Engineering Pipeline
- Automated data ingestion from NYC TLC Trip Records
- Data validation, cleaning, and feature engineering at scale
- Configurable data processing with quality checks
✅ ML Model Development & Training
- Multi-algorithm comparison (Linear Regression, Random Forest, XGBoost, LightGBM)
- Automated hyperparameter tuning and model selection
- Comprehensive model evaluation with statistical significance testing
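One way to make the "statistical significance" claim concrete is a paired bootstrap over per-trip absolute errors. The sketch below is illustrative only; the function name and bootstrap parameters are assumptions, not the project's actual test.

```python
# Illustrative: is model B's MAE improvement over model A robust to resampling?
# err_a / err_b are per-trip absolute errors from the same holdout trips.
import numpy as np

def paired_bootstrap_mae_diff(err_a, err_b, n_boot=2000, seed=0):
    """Return (mean MAE difference, 95% CI) for err_a - err_b."""
    rng = np.random.default_rng(seed)
    err_a, err_b = np.asarray(err_a), np.asarray(err_b)
    n = len(err_a)
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, n)          # resample trips with replacement
        diffs[i] = err_a[idx].mean() - err_b[idx].mean()
    lo, hi = np.percentile(diffs, [2.5, 97.5])
    return err_a.mean() - err_b.mean(), (lo, hi)
```

If the confidence interval excludes zero, the improvement is unlikely to be resampling noise.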
✅ Experiment Tracking & Model Registry
- MLflow integration for experiment management
- Model versioning, artifact storage, and metadata tracking
- Automated model promotion based on performance metrics
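Automated promotion can be as simple as a gate like the following sketch (the 2% margin and function name are assumptions, not the project's actual policy); in MLflow, a `True` result would drive a model-version stage transition.

```python
# Hypothetical promotion gate: promote the challenger only if it beats the
# current production model's MAE by a relative margin.
def should_promote(candidate_mae: float, production_mae: float,
                   min_rel_improvement: float = 0.02) -> bool:
    """True if the candidate improves MAE by at least min_rel_improvement (2%)."""
    return candidate_mae <= production_mae * (1.0 - min_rel_improvement)
```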
✅ Production Deployment Infrastructure
- Option 1: Traditional VM deployment (EC2) with Docker containerization
- Option 2: Serverless architecture (AWS Lambda) for cost optimization
- Option 3: Container orchestration ready (ECS/Fargate)
✅ CI/CD & DevOps Integration
- GitHub Actions workflows for automated testing and deployment
- Infrastructure as Code (IaC) principles
- Multi-environment promotion (dev → staging → production)
✅ API Development & Documentation
- FastAPI with automatic OpenAPI documentation
- RESTful endpoints with proper error handling
- Request/response validation and monitoring
- Dataset: NYC TLC Yellow Taxi Trip Records
- Volume: 1M+ records processed monthly
- Features: 15+ engineered features including temporal, geospatial, and categorical
- Processing Time: <5 minutes for full dataset refresh
- Primary Metric: Mean Absolute Error (MAE)
- Baseline: Simple linear regression
- Best Model: XGBoost with hyperparameter optimization
- Validation: Time-series cross-validation with 3-month holdout
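The validation scheme above can be sketched with scikit-learn's `TimeSeriesSplit`, which only ever trains on folds that precede the test fold. The data here is synthetic; the real pipeline uses the engineered trip features.

```python
# Time-ordered cross-validation with MAE: each fold trains on the past only.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))                 # stand-in for engineered features
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=500)

maes = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    maes.append(mean_absolute_error(y[test_idx], model.predict(X[test_idx])))

print("MAE per fold:", np.round(maes, 3))
```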
```
NYC TLC Data Source
        │
        ▼
Data Ingestion Pipeline
        │
        ▼
Feature Engineering
        │
        ▼
Model Training & Evaluation
        │
        ▼
MLflow Experiment Tracking
        │
        ▼
Model Registry
        │
        ▼
Model Deployment
     ┌──────────────┼──────────────┐
     ▼              ▼              ▼
EC2             Lambda         Docker
Deployment      Deployment     Container
     │              │              │
     ▼              ▼              ▼
FastAPI         Serverless     CI/CD
Server          API            Pipeline
     └──────────────┼──────────────┘
                    ▼
Production Predictions
        │
        ▼
Monitoring & Analytics
```
```
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│   Data Source   │────▶│  Feature Engine  │────▶│   ML Training   │
│  (NYC TLC API)  │     │  (Pandas +       │     │  (MLflow +      │
│                 │     │   Custom Logic)  │     │   Multi-Algo)   │
└─────────────────┘     └──────────────────┘     └─────────────────┘
                                                          │
                                                          ▼
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│   Predictions   │◀────│  FastAPI Server  │◀────│  Model Registry │
│   (JSON/REST)   │     │   (Production)   │     │    (MLflow)     │
└─────────────────┘     └──────────────────┘     └─────────────────┘
```
| Category | Technology | Purpose |
|---|---|---|
| ML Framework | Scikit-learn, XGBoost, LightGBM | Model training and evaluation |
| Data Processing | Pandas, NumPy | Data manipulation and feature engineering |
| Experiment Tracking | MLflow | Model versioning, metrics tracking, registry |
| Feature Engineering | Custom Pipeline + DictVectorizer | Automated feature transformation |
| Category | Technology | Purpose |
|---|---|---|
| API Framework | FastAPI | High-performance REST API development |
| API Documentation | OpenAPI/Swagger | Automatic API documentation |
| Data Validation | Pydantic | Request/response schema validation |
| ASGI Server | Uvicorn | Production ASGI server |
| Category | Technology | Purpose |
|---|---|---|
| Containerization | Docker, Docker Compose | Application packaging and orchestration |
| CI/CD | GitHub Actions | Automated testing and deployment |
| Cloud Deployment | AWS Lambda, EC2 | Serverless and traditional hosting |
| Infrastructure | AWS CLI, Boto3 | Cloud resource management |
| Category | Technology | Purpose |
|---|---|---|
| Package Management | UV (Python) | Fast dependency management |
| Testing | PyTest | Unit and integration testing |
| Code Coverage | Codecov | Test coverage analysis and reporting |
| Code Formatting | Ruff | Fast Python linter and formatter |
| Security Scanning | Bandit, Safety | Static security analysis and vulnerability detection |
| Container Security | Trivy | Container image vulnerability scanning |
| Logging | Loguru | Structured application logging |
| Configuration | Pydantic Settings | Environment-based configuration |
| Code Quality | Type Hints, Dataclasses | Code maintainability and safety |
| Category | Technology | Purpose |
|---|---|---|
| Metrics Collection | Prometheus | Scrapes and stores time-series metrics (request rate, latency, errors) |
| Visualization | Grafana | Auto-provisioned dashboards: API Health + Model Performance |
| Alerting | Prometheus Alert Rules | 5 rules: high error rate, p95 latency, service down, prediction errors, duration drift |
| Drift Detection | Evidently | Compares production input distributions against training data, HTML report |
| Error Tracking | Structured logging (Loguru) | Production error monitoring with rotation |
| Experiment Tracking | MLflow | Model performance and versioning |
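The centralized metrics-registry pattern (see `src/metrics.py` in the project layout) can be sketched with `prometheus_client`, the standard Python client. Metric names and histogram buckets below are illustrative assumptions.

```python
# A small, centralized Prometheus metrics registry: counters and histograms
# are defined once and imported wherever predictions are served.
from prometheus_client import Counter, Histogram, CollectorRegistry, generate_latest

registry = CollectorRegistry()
PREDICTIONS_TOTAL = Counter(
    "predictions_total", "Number of predictions served", registry=registry)
PREDICTED_DURATION = Histogram(
    "predicted_duration_minutes", "Distribution of predicted durations",
    buckets=(5, 10, 20, 30, 60), registry=registry)

def record_prediction(duration_minutes: float) -> None:
    PREDICTIONS_TOTAL.inc()
    PREDICTED_DURATION.observe(duration_minutes)

record_prediction(14.2)
print(generate_latest(registry).decode())  # Prometheus text exposition format
```

A `/metrics` endpoint serving `generate_latest(registry)` is what Prometheus scrapes.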
- Python 3.12+
- Docker & Docker Compose
- UV package manager (see the UV installation docs)
```bash
git clone https://github.com/AhmadHammad21/Taxi-Duration-Prediction.git
cd Taxi-Duration-Prediction
uv sync
```

```bash
# Downloads NYC TLC data, runs feature engineering, trains models, logs to MLflow
uv run python -m src.main
```

Trained model artifact saved to `src/artifacts/`. MLflow experiments visible at http://localhost:5000 (after step 3).
```bash
docker-compose up --build
```

| Service | URL |
|---|---|
| FastAPI + Swagger | http://localhost:8000/docs |
| MLflow UI | http://localhost:5000 |
| Prometheus | http://localhost:9090/alerts |
| Grafana (admin/admin) | http://localhost:3000 |
```bash
curl -X POST http://localhost:8000/api/v1/predict \
  -H "Content-Type: application/json" \
  -d '{"PULocationID": "132", "DOLocationID": "161"}'
```

After sending 50+ requests to `/predict`:

```bash
uv run python -m src.monitoring.drift_report
# Report saved to reports/drift_report.html
```

To stop all services:

```bash
docker-compose down
```

Use Case: Full control, persistent MLflow server, easier debugging
```bash
docker build -t taxi-prediction-api .
docker run -p 8000:8000 taxi-prediction-api
```

See DEPLOYMENT.md for full EC2 setup with security groups and GitHub Actions wiring.
Use Case: Variable traffic, cost optimization. See DEPLOYMENT.md for Lambda + ECR deployment instructions.
This project demonstrates production-ready MLOps practices with automated workflows supporting multiple deployment strategies:
- Trigger: Push to `main` branch
- Pipeline: Build → Test → Deploy → Monitor
- Target: High-throughput production workloads
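A trimmed sketch of what the main-branch workflow can look like. Job names, action versions, and the deploy step are assumptions, not the repository's actual workflow files.

```yaml
# Hypothetical .github/workflows/deploy.yml: test on every push to main,
# then build the container for deployment.
name: deploy
on:
  push:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v4
      - run: uv sync
      - run: uv run pytest

  deploy:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: docker build -t taxi-prediction-api .
      # Push to a registry and restart the service (details in DEPLOYMENT.md)
```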
- Trigger: Automated on code changes
- Pipeline: Package → Deploy → Scale → Monitor
- Target: Cost-optimized, variable workloads
- Model versioning and lineage tracking
- A/B testing capabilities
- Performance monitoring and drift detection
- Auto-generated OpenAPI documentation
- Request/response validation
- Real-time performance metrics
- Live predictions/s, total predictions, avg predicted duration
- Predicted duration distribution over time
- Auto-provisioned on `docker-compose up`, no manual setup
- Compares production input distributions against training data
- Per-feature drift scores using Wasserstein distance
- Flags when the model is seeing data it wasn't trained on
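The per-feature scoring above can be sketched as follows, using the Wasserstein distance named in the bullet. This is a toy illustration with synthetic data; the project's actual reports come from Evidently.

```python
# Per-feature drift score: Wasserstein distance between the training
# distribution and recent production inputs. Larger = more drift.
import numpy as np
from scipy.stats import wasserstein_distance

def drift_scores(train_cols: dict, prod_cols: dict) -> dict:
    return {f: wasserstein_distance(train_cols[f], prod_cols[f])
            for f in train_cols}

rng = np.random.default_rng(0)
train      = {"trip_distance": rng.exponential(3.0, 5000)}
prod_same  = {"trip_distance": rng.exponential(3.0, 5000)}  # no drift
prod_shift = {"trip_distance": rng.exponential(6.0, 5000)}  # shifted demand

print(drift_scores(train, prod_same))
print(drift_scores(train, prod_shift))
```

A per-feature threshold on these scores is what turns "the model is seeing unfamiliar data" into an actionable alert.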
Built following software engineering best practices and MLOps principles for scalability and maintainability:
```
taxi-duration-prediction/
├── src/                     # Core MLOps Platform
│   ├── config/              # Centralized Configuration + prometheus.yml
│   ├── data_pulling/        # Data Engineering Pipeline
│   ├── features/            # Feature Engineering & Preprocessing
│   ├── training/            # ML Model Training & Evaluation
│   ├── inference/           # Production Inference Engine
│   ├── monitoring/          # Drift Detection & Prediction Logger
│   ├── routes/              # RESTful API Endpoints
│   ├── schemas/             # Data Validation & Type Safety
│   ├── metrics.py           # Centralized Prometheus Metrics Registry
│   └── utils/               # Shared Utilities & Helpers
├── grafana/
│   ├── provisioning/        # Auto-provisioned datasource & dashboard config
│   └── dashboards/          # API Health + Model Performance JSON dashboards
├── prometheus/
│   └── alerts.yml           # Alert rules (error rate, latency, service down, drift)
├── tests/                   # Comprehensive Test Suite
├── .github/workflows/       # CI/CD Automation
├── docker-compose.yml       # Multi-Service Orchestration (FastAPI, MLflow, Prometheus, Grafana)
└── pyproject.toml           # Modern Dependency Management (uv)
```
- Microservices Architecture: Loosely coupled, independently deployable components
- Configuration Management: Centralized settings for multi-environment deployment
- API-First Design: RESTful interfaces with comprehensive documentation
- Test-Driven Development: Unit, integration, and end-to-end testing
- Infrastructure as Code: Reproducible deployments across environments
- Data Pipeline: Automated download and ingestion from NYC TLC
- Feature Engineering: Preprocessing and transformation pipeline
- ML Training Pipeline: Multi-model training with MLflow experiment tracking
- Inference Engine: Production-ready prediction service
- REST API: FastAPI with Swagger documentation
- Quality Assurance: Unit, integration, and performance tests (Locust)
- Logging Infrastructure: Structured logging with Loguru
- CI/CD Automation: GitHub Actions → Lambda + EC2 workflows, security scanning
- Containerization: Docker and Docker Compose
- Cloud Deployment: EC2 and AWS Lambda serverless options
- Monitoring Stack: Prometheus + Grafana with auto-provisioned dashboards
- Alerting Rules: High error rate, latency p95, service down, prediction errors
- Data Drift Detection: Evidently reports comparing production inputs vs training data
- Data Version Control: DVC for data lineage and reproducibility
- Automated Retraining: Drift-triggered scheduled retraining pipeline
- A/B Testing Framework: Canary deployments and traffic splitting
- Container Orchestration: ECS + Fargate or Kubernetes
- Feature Store: Centralized feature management (Feast)
- Model Explainability: SHAP/LIME integration
- Infrastructure as Code: Terraform for AWS resources
Data Source: NYC Taxi & Limousine Commission Trip Record Data
License: MIT License - see LICENSE file for details
Usage: Educational and demonstration purposes showcasing MLOps capabilities
This project demonstrates comprehensive MLOps expertise suitable for enterprise-scale machine learning operations and production deployment scenarios.