Executive Summary: A comprehensive MLOps platform demonstrating enterprise-grade machine learning operations, from data ingestion to production deployment with automated CI/CD pipelines, monitoring, and scalable infrastructure.
This project solves the taxi duration prediction problem for NYC's transportation ecosystem, providing accurate trip duration estimates that enable:
- Operational Efficiency: an estimated 15-20% improvement in fleet utilization
- Customer Experience: Accurate ETAs reducing wait times and complaints
- Revenue Optimization: Dynamic pricing based on predicted demand patterns
- Resource Planning: Data-driven decisions for driver allocation and route optimization
✅ Data Engineering Pipeline
- Automated data ingestion from NYC TLC Trip Records
- Data validation, cleaning, and feature engineering at scale
- Configurable data processing with quality checks
✅ ML Model Development & Training
- Multi-algorithm comparison (Linear Regression, Random Forest, XGBoost, LightGBM)
- Automated hyperparameter tuning and model selection
- Comprehensive model evaluation with statistical significance testing
✅ Experiment Tracking & Model Registry
- MLflow integration for experiment management
- Model versioning, artifact storage, and metadata tracking
- Automated model promotion based on performance metrics
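Automated promotion is driven by MLflow metrics; the exact gate lives in the training pipeline, but the decision rule can be sketched as follows (the function name and the 2% relative-improvement threshold are illustrative assumptions, not the repo's actual values):

```python
def should_promote(candidate_mae: float, production_mae: float,
                   min_rel_improvement: float = 0.02) -> bool:
    """Promote the candidate only if it beats the current production
    model's MAE by at least min_rel_improvement (relative)."""
    return candidate_mae < production_mae * (1.0 - min_rel_improvement)

# A candidate must improve MAE by >= 2% to displace the production model;
# improvements within that margin are treated as noise.
promote = should_promote(candidate_mae=5.0, production_mae=5.5)
```

Gating on a relative margin rather than a raw "is it lower" check avoids churning the registry on statistically insignificant wins.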
✅ Production Deployment Infrastructure
- Option 1: Traditional VM deployment (EC2) with Docker containerization
- Option 2: Serverless architecture (AWS Lambda) for cost optimization
- Option 3: Container orchestration ready (ECS/Fargate)
✅ CI/CD & DevOps Integration
- GitHub Actions workflows for automated testing and deployment
- Infrastructure as Code (IaC) principles
- Multi-environment promotion (dev → staging → production)
✅ API Development & Documentation
- FastAPI with automatic OpenAPI documentation
- RESTful endpoints with proper error handling
- Request/response validation and monitoring
- Dataset: NYC TLC Yellow Taxi Trip Records
- Volume: 1M+ records processed monthly
- Features: 15+ engineered features including temporal, geospatial, and categorical
- Processing Time: <5 minutes for full dataset refresh
- Primary Metric: Mean Absolute Error (MAE)
- Baseline: Simple linear regression
- Best Model: XGBoost with hyperparameter optimization
- Validation: Time-series cross-validation with 3-month holdout
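Time-series cross-validation keeps every training fold strictly earlier in time than its validation fold, so the model is never scored on data it could not have seen. A dependency-free sketch of the expanding-window scheme (scikit-learn's `TimeSeriesSplit` implements the same idea):

```python
def expanding_window_splits(n_samples: int, n_splits: int = 3):
    """Yield (train_indices, test_indices) pairs where every training
    index precedes every test index -- no future leakage into training."""
    fold = n_samples // (n_splits + 1)
    for k in range(1, n_splits + 1):
        yield list(range(0, k * fold)), list(range(k * fold, (k + 1) * fold))

# Each successive fold trains on a longer history and validates on the
# next contiguous block of time.
for train, test in expanding_window_splits(12, n_splits=3):
    assert max(train) < min(test)  # chronology preserved in every fold
```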
```
         📊 NYC TLC Data Source
                  │
                  ▼
       🔄 Data Ingestion Pipeline
                  │
                  ▼
          🔧 Feature Engineering
                  │
                  ▼
      🎯 Model Training & Evaluation
                  │
                  ▼
      📋 MLflow Experiment Tracking
                  │
                  ▼
            📦 Model Registry
                  │
                  ▼
           🚀 Model Deployment
            ┌─────┼─────┐
            │     │     │
            ▼     ▼     ▼
      🖥️ EC2    ☁️ Lambda   🐳 Docker
     Deployment Deployment  Container
            │     │     │
            ▼     ▼     ▼
     🌐 FastAPI ⚡ Serverless 🔄 CI/CD
       Server       API      Pipeline
            │     │     │
            └─────┼─────┘
                  │
                  ▼
      📊 Production Predictions
                  │
                  ▼
      📈 Monitoring & Analytics
```
```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Data Source   │───▶│  Feature Engine  │───▶│   ML Training   │
│  (NYC TLC API)  │    │    (Pandas +     │    │    (MLflow +    │
│                 │    │   Custom Logic)  │    │   Multi-Algo)   │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                                                        │
                                                        ▼
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Predictions   │◀───│  FastAPI Server  │◀───│ Model Registry  │
│   (JSON/REST)   │    │   (Production)   │    │    (MLflow)     │
└─────────────────┘    └──────────────────┘    └─────────────────┘
```
| Category | Technology | Purpose |
|---|---|---|
| ML Framework | Scikit-learn, XGBoost, LightGBM | Model training and evaluation |
| Data Processing | Pandas, NumPy | Data manipulation and feature engineering |
| Experiment Tracking | MLflow | Model versioning, metrics tracking, registry |
| Feature Engineering | Custom Pipeline + DictVectorizer | Automated feature transformation |
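The `DictVectorizer` step turns per-trip feature dicts into numeric vectors, one-hot expanding string-valued fields while passing numeric fields through. A dependency-free sketch of that behavior (the real pipeline uses scikit-learn's `DictVectorizer`; the feature names below are illustrative):

```python
def fit_vocab(records: list[dict]) -> dict[str, int]:
    """Build the feature vocabulary: string fields become one-hot
    'key=value' columns, numeric fields keep their key as the column."""
    names = sorted({f"{k}={v}" if isinstance(v, str) else k
                    for rec in records for k, v in rec.items()})
    return {name: i for i, name in enumerate(names)}

def transform(records: list[dict], vocab: dict[str, int]) -> list[list[float]]:
    """Encode each record as a dense row: 1.0 for matching one-hot
    columns, the raw value for numeric columns, 0.0 elsewhere."""
    rows = []
    for rec in records:
        row = [0.0] * len(vocab)
        for k, v in rec.items():
            name = f"{k}={v}" if isinstance(v, str) else k
            if name in vocab:
                row[vocab[name]] = 1.0 if isinstance(v, str) else float(v)
        rows.append(row)
    return rows
```

Unseen categories at inference time simply fall outside the vocabulary and encode to all-zeros, which is also how `DictVectorizer` handles them.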
| Category | Technology | Purpose |
|---|---|---|
| API Framework | FastAPI | High-performance REST API development |
| API Documentation | OpenAPI/Swagger | Automatic API documentation |
| Data Validation | Pydantic | Request/response schema validation |
| ASGI Server | Uvicorn | Production ASGI server |
| Category | Technology | Purpose |
|---|---|---|
| Containerization | Docker, Docker Compose | Application packaging and orchestration |
| CI/CD | GitHub Actions | Automated testing and deployment |
| Cloud Deployment | AWS Lambda, EC2 | Serverless and traditional hosting |
| Infrastructure | AWS CLI, Boto3 | Cloud resource management |
| Category | Technology | Purpose |
|---|---|---|
| Package Management | UV (Python) | Fast dependency management |
| Testing | PyTest | Unit and integration testing |
| Code Coverage | Codecov | Test coverage analysis and reporting |
| Code Formatting | Ruff | Fast Python linter and formatter |
| Security Scanning | Bandit, Safety | Static security analysis and vulnerability detection |
| Container Security | Trivy | Container image vulnerability scanning |
| Logging | Loguru | Structured application logging |
| Configuration | Pydantic Settings | Environment-based configuration |
| Code Quality | Type Hints, Dataclasses | Code maintainability and safety |
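Environment-based configuration follows the twelve-factor pattern: every setting has a sane default plus an environment-variable override. A stdlib-only sketch of that pattern (field names and defaults are illustrative assumptions; the project itself uses Pydantic Settings, which automates exactly this):

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class Settings:
    mlflow_tracking_uri: str
    model_stage: str

def load_settings() -> Settings:
    """Read configuration from the environment, falling back to defaults
    suitable for local docker-compose development."""
    return Settings(
        mlflow_tracking_uri=os.environ.get("MLFLOW_TRACKING_URI",
                                           "http://localhost:5000"),
        model_stage=os.environ.get("MODEL_STAGE", "Production"),
    )
```

The frozen dataclass mirrors Pydantic Settings' immutability: configuration is resolved once at startup and cannot be mutated mid-request.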
| Category | Technology | Purpose |
|---|---|---|
| Metrics Collection | Prometheus | Scrapes and stores time-series metrics (request rate, latency, errors) |
| Visualization | Grafana | Auto-provisioned dashboards: API Health + Model Performance |
| Alerting | Prometheus Alert Rules | 5 rules — high error rate, p95 latency, service down, prediction errors, duration drift |
| Drift Detection | Evidently | Compares production input distributions against training data, HTML report |
| Error Tracking | Structured logging (Loguru) | Production error monitoring with rotation |
| Experiment Tracking | MLflow | Model performance and versioning |
- Python 3.12+
- Docker & Docker Compose
- UV package manager (see the UV installation guide)
```bash
git clone https://github.com/AhmadHammad21/Taxi-Duration-Prediction.git
cd Taxi-Duration-Prediction
uv sync
```

```bash
# Downloads NYC TLC data, runs feature engineering, trains models, logs to MLflow
uv run python -m src.main
```

Trained model artifact saved to `src/artifacts/`. MLflow experiments are visible at http://localhost:5000 (after step 3).

```bash
docker-compose up --build
```

| Service | URL |
|---|---|
| FastAPI + Swagger | http://localhost:8000/docs |
| MLflow UI | http://localhost:5000 |
| Prometheus | http://localhost:9090/alerts |
| Grafana (admin/admin) | http://localhost:3000 |
```bash
curl -X POST http://localhost:8000/api/v1/predict \
  -H "Content-Type: application/json" \
  -d '{"PULocationID": "132", "DOLocationID": "161"}'
```

```bash
# After sending 50+ requests to /predict:
uv run python -m src.monitoring.drift_report
# Report saved to reports/drift_report.html
```

To stop all services:

```bash
docker-compose down
```

Use Case: Full control, persistent MLflow server, easier debugging

```bash
docker build -t taxi-prediction-api .
docker run -p 8000:8000 taxi-prediction-api
```

See DEPLOYMENT.md for full EC2 setup with security groups and GitHub Actions wiring.

Use Case: Variable traffic, cost optimization. See DEPLOYMENT.md for Lambda + ECR deployment instructions.
This project demonstrates production-ready MLOps practices with automated workflows supporting multiple deployment strategies:
- Trigger: Push to `main` branch
- Pipeline: Build → Test → Deploy → Monitor
- Target: High-throughput production workloads
- Trigger: Automated on code changes
- Pipeline: Package → Deploy → Scale → Monitor
- Target: Cost-optimized, variable workloads
- Model versioning and lineage tracking
- A/B testing capabilities
- Performance monitoring and drift detection
- Auto-generated OpenAPI documentation
- Request/response validation
- Real-time performance metrics
- Live predictions/s, total predictions, avg predicted duration
- Predicted duration distribution over time
- Auto-provisioned on `docker-compose up` (no manual setup)
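The p95-latency alert fires on a quantile over recent request durations. Prometheus computes this server-side from histogram buckets via `histogram_quantile`; a minimal in-process sketch of the same signal (the window size and the nearest-rank quantile method are assumptions for illustration):

```python
from collections import deque

class LatencyTracker:
    """Keep the last N request latencies and expose p95 -- the signal
    the latency alert rule fires on."""

    def __init__(self, window: int = 1000):
        self.samples: deque[float] = deque(maxlen=window)

    def observe(self, seconds: float) -> None:
        self.samples.append(seconds)

    def p95(self) -> float:
        if not self.samples:
            return 0.0
        # Nearest-rank quantile over the sliding window
        ordered = sorted(self.samples)
        return ordered[int(0.95 * (len(ordered) - 1))]
```

A bounded `deque` keeps memory constant no matter how long the service runs, at the cost of only seeing the most recent window.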
- Compares production input distributions against training data
- Per-feature drift scores using Wasserstein distance
- Flags when the model is seeing data it wasn't trained on
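Evidently handles the report generation, but the per-feature score itself is the 1-D Wasserstein distance between the training and production samples. For equal-size samples that reduces to the mean absolute gap between the sorted values of each sample (a dependency-free illustration; the threshold value is an assumption, not the repo's configured one):

```python
def wasserstein_1d(a: list[float], b: list[float]) -> float:
    """Empirical 1-D Wasserstein distance for equal-size samples:
    the mean absolute gap between the sorted values of each sample."""
    assert len(a) == len(b), "this shortcut requires equal sample sizes"
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)

def drifted_features(train: dict, prod: dict, threshold: float = 0.5) -> dict:
    """Score every feature and return only those whose production
    distribution has moved further than threshold from training."""
    scores = {f: wasserstein_1d(train[f], prod[f]) for f in train}
    return {f: s for f, s in scores.items() if s > threshold}
```

Unlike a mean-shift check, the Wasserstein distance also catches changes in spread and shape, which is why a model can drift even when the feature averages look stable.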
Built following software engineering best practices and MLOps principles for scalability and maintainability:
```
taxi-duration-prediction/
├── src/                  # 💻 Core MLOps Platform
│   ├── config/           # ⚙️ Centralized Configuration + prometheus.yml
│   ├── data_pulling/     # 📊 Data Engineering Pipeline
│   ├── features/         # 🔧 Feature Engineering & Preprocessing
│   ├── training/         # 🎯 ML Model Training & Evaluation
│   ├── inference/        # 🚀 Production Inference Engine
│   ├── monitoring/       # 📈 Drift Detection & Prediction Logger
│   ├── routes/           # 🌐 RESTful API Endpoints
│   ├── schemas/          # 📝 Data Validation & Type Safety
│   ├── metrics.py        # 📊 Centralized Prometheus Metrics Registry
│   └── utils/            # 🔧 Shared Utilities & Helpers
├── grafana/
│   ├── provisioning/     # 🔌 Auto-provisioned datasource & dashboard config
│   └── dashboards/       # 📊 API Health + Model Performance JSON dashboards
├── prometheus/
│   └── alerts.yml        # 🚨 Alert rules (error rate, latency, service down, drift)
├── tests/                # ✅ Comprehensive Test Suite
├── .github/workflows/    # 🔄 CI/CD Automation
├── docker-compose.yml    # 🐳 Multi-Service Orchestration (FastAPI, MLflow, Prometheus, Grafana)
└── pyproject.toml        # 📦 Modern Dependency Management (uv)
```
- Microservices Architecture: Loosely coupled, independently deployable components
- Configuration Management: Centralized settings for multi-environment deployment
- API-First Design: RESTful interfaces with comprehensive documentation
- Test-Driven Development: Unit, integration, and end-to-end testing
- Infrastructure as Code: Reproducible deployments across environments
- Data Pipeline: Automated download and ingestion from NYC TLC
- Feature Engineering: Preprocessing and transformation pipeline
- ML Training Pipeline: Multi-model training with MLflow experiment tracking
- Inference Engine: Production-ready prediction service
- REST API: FastAPI with Swagger documentation
- Quality Assurance: Unit, integration, and performance tests (Locust)
- Logging Infrastructure: Structured logging with Loguru
- CI/CD Automation: GitHub Actions — Lambda + EC2 workflows, security scanning
- Containerization: Docker and Docker Compose
- Cloud Deployment: EC2 and AWS Lambda serverless options
- Monitoring Stack: Prometheus + Grafana with auto-provisioned dashboards
- Alerting Rules: High error rate, latency p95, service down, prediction errors
- Data Drift Detection: Evidently reports comparing production inputs vs training data
- Data Version Control: DVC for data lineage and reproducibility
- Automated Retraining: Drift-triggered scheduled retraining pipeline
- A/B Testing Framework: Canary deployments and traffic splitting
- Container Orchestration: ECS + Fargate or Kubernetes
- Feature Store: Centralized feature management (Feast)
- Model Explainability: SHAP/LIME integration
- Infrastructure as Code: Terraform for AWS resources
Data Source: NYC Taxi & Limousine Commission Trip Record Data
License: MIT License - see LICENSE file for details
Usage: Educational and demonstration purposes showcasing MLOps capabilities
This project demonstrates comprehensive MLOps expertise suitable for enterprise-scale machine learning operations and production deployment scenarios.