
🛡️ DriftGuard

Autonomous Model Drift Detection & MLOps Recovery System

🇬🇧 English | 🇯🇵 日本語


Built with React · TypeScript · Python · Flask · Tailwind CSS — MIT License

DriftGuard is an enterprise-grade MLOps dashboard designed to monitor, detect, and remediate machine learning model degradation in production using advanced statistical methods like Population Stability Index (PSI), Kolmogorov-Smirnov testing, and Kullback-Leibler Divergence. It replaces opaque model failure with quantifiable health metrics and automated retraining strategies.

Report Bug · Request Feature


💡 Project Concept

In production environments, ML models don't fail with an error stack trace; they fail silently as data distributions shift (Data Drift) or relationships change (Concept Drift).

DriftGuard solves this by providing a continuous monitoring layer that:

  1. Quantifies Drift: Uses statistical methods like Population Stability Index (PSI) to measure distribution shifts.
  2. Visualizes Impact: Correlates drift scores with estimated accuracy drops.
  3. Prescribes Action: Automates the cost-benefit analysis of retraining models versus letting them run.

Core MLOps Principles

  • Observability First: Dashboard-centric view of model health (Health Score).
  • Statistical Rigor: Reliance on proven metrics (PSI, Kolmogorov-Smirnov test, KL Divergence) rather than simple distinct counts.
  • Actionable Insights: Recommendations are linked to business value (Revenue at Risk vs. Retraining Cost).

🚀 Key Features

📊 Live Drift Monitoring

  • Real-time Health Score: A composite metric (0-100) derived from drift severity across all features.
  • Dynamic Metrics: Tracks "Total Predictions", "Average Drift Score", and "Estimated Accuracy" live.
  • Feature-Level Diagnostics: Identifies exactly which features (e.g., Income, Age, Debt Ratio) are causing the model to degrade.
  • Deep Dive Analysis: Drill down into specific features (click "Analyze") to view histograms, PSI/KS/KL metrics, and descriptive statistics comparing training vs. production data.
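The composite Health Score above is derived from per-feature drift severity. As a rough illustration of how such a score could be assembled (the penalty weighting here is an assumption for demonstration, not DriftGuard's actual formula):

```python
def health_score(feature_psi, warn=0.1, critical=0.2):
    """Map per-feature PSI values to a 0-100 health score.

    Illustrative only: each feature contributes a penalty that grows with
    its PSI; the thresholds mirror the common PSI rules of thumb.
    """
    if not feature_psi:
        return 100.0
    penalties = []
    for psi in feature_psi.values():
        if psi < warn:
            penalties.append(0.0)  # no significant drift
        elif psi < critical:
            penalties.append(50.0 * (psi - warn) / (critical - warn))
        else:
            penalties.append(min(100.0, 50.0 + 100.0 * (psi - critical)))
    return round(100.0 - sum(penalties) / len(penalties), 1)
```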

🔐 Secure Access

  • Authentication: Secure Login and Signup flow with JWT-ready structure (currently demo mode).
  • Role-Based Access: Foundations for Admin vs. Viewer roles.

🧠 Intelligent Retraining Recommendations

  • Cost-Benefit Engine: Automatically calculates whether it is profitable to retrain the model based on current revenue loss vs. compute costs.
  • Automated Scheduling: One-click scheduling for retraining jobs when thresholds are breached.
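At its core, the cost-benefit decision reduces to comparing projected revenue loss against the cost of retraining. A minimal sketch of that comparison (the function name, horizon, and inputs are illustrative assumptions, not the project's API):

```python
def should_retrain(revenue_at_risk_per_day, retraining_cost, horizon_days=30):
    """Illustrative cost-benefit check: recommend retraining when the
    projected revenue loss over the horizon exceeds the one-off cost."""
    projected_loss = revenue_at_risk_per_day * horizon_days
    return projected_loss > retraining_cost, projected_loss
```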

⚡ Forecasting & Trends

  • Historical Analysis: View drift trends over 30/60/90 days to identify slow-burning degradation.
  • Interactive Reports: Export comprehensive drift reports (.csv, .pdf) for compliance and auditing.

🔔 Smart Alerting

  • Configurable Rules: Set conditional alerts (e.g., "If Income PSI > 0.2 for 6 hours").
  • Multi-Channel Notification: Integration logic for Slack, Email, and PagerDuty (simulated).
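A rule like "If Income PSI > 0.2 for 6 hours" can be modeled as a threshold check over a sliding window of metric samples. A hypothetical sketch (the `AlertRule` shape is an assumption, not DriftGuard's actual schema):

```python
from dataclasses import dataclass


@dataclass
class AlertRule:
    feature: str
    metric: str          # e.g. "psi"
    threshold: float
    duration_hours: int


def rule_breached(rule, history):
    """history: list of (hours_ago, value) samples, newest first.

    Fires only when every sample inside the duration window exceeds
    the threshold, so a single spike does not trigger an alert."""
    window = [value for age, value in history if age <= rule.duration_hours]
    return bool(window) and all(value > rule.threshold for value in window)
```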

💻 Code Spotlight

DriftGuard uses multiple statistical methods to detect distributional shifts: Population Stability Index (PSI), Kolmogorov-Smirnov (KS) test, and Kullback-Leibler (KL) Divergence. Below is a condensed, illustrative view of the detection logic in `backend/drift_detection.py` (bodies simplified for readability):

```python
# backend/drift_detection.py (condensed)
import numpy as np
from scipy import stats


def _bucket_proportions(expected_array, actual_array, buckets, bucket_type):
    """Histogram both samples on bins derived from the expected (training) data."""
    if bucket_type == 'quantiles':
        breakpoints = np.percentile(expected_array, np.arange(0, buckets + 1) / buckets * 100)
    else:  # equal-width bins
        breakpoints = np.linspace(np.min(expected_array), np.max(expected_array), buckets + 1)
    expected_prop = np.histogram(expected_array, bins=breakpoints)[0] / len(expected_array)
    actual_prop = np.histogram(actual_array, bins=breakpoints)[0] / len(actual_array)
    # Clamp proportions to avoid division by zero / log(0) in empty buckets
    return np.clip(expected_prop, 1e-6, None), np.clip(actual_prop, 1e-6, None)


def calculate_psi(expected_array, actual_array, buckets=10, bucket_type='quantiles'):
    """Population Stability Index (PSI) to measure data drift.
    PSI < 0.1: no significant drift | PSI < 0.2: moderate | PSI >= 0.2: significant.
    """
    expected_prop, actual_prop = _bucket_proportions(expected_array, actual_array, buckets, bucket_type)
    return np.sum((actual_prop - expected_prop) * np.log(actual_prop / expected_prop))


def calculate_ks(expected_array, actual_array):
    """Kolmogorov-Smirnov statistic (and p-value) for distribution comparison."""
    return stats.ks_2samp(expected_array, actual_array)


def calculate_kl(expected_array, actual_array, buckets=10, bucket_type='quantiles'):
    """Kullback-Leibler divergence D(actual || expected) for shift measurement."""
    expected_prop, actual_prop = _bucket_proportions(expected_array, actual_array, buckets, bucket_type)
    return np.sum(actual_prop * np.log(actual_prop / expected_prop))
```
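As a quick, self-contained sanity check on these metrics (using NumPy/SciPy directly rather than the project's helpers), the snippet below shows how a mean shift in a feature like Income registers on the KS statistic:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
training = rng.normal(loc=50_000, scale=10_000, size=5_000)    # e.g. Income at training time
production = rng.normal(loc=58_000, scale=10_000, size=5_000)  # drifted production data

ks_stat, p_value = stats.ks_2samp(training, production)
print(f"KS statistic: {ks_stat:.3f}, p-value: {p_value:.2e}")
# A 0.8-sigma mean shift yields a KS statistic well above typical alert thresholds.
```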

---

## 🏗️ Demo Scenarios

The project includes a robust simulation engine `demo_scenarios.py` to demonstrate various production states:

| Scenario | Description | Effect |
| :--- | :--- | :--- |
| **1. Baseline (Healthy)** | Normal distribution matching training data. | Health Score: ~98/100 |
| **2. Sudden Drift (Attack)** | Simulates a sudden shift in high-importance features. | Health Score: ~45/100 |
| **3. Gradual Decay** | Slowly introduces noise over time. | Health Score: ~80/100 |
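Conceptually, the Gradual Decay scenario can be modeled as blending a growing shift and noise into the baseline distribution as time advances. A minimal sketch of that idea (an illustration only, not the actual `demo_scenarios.py` generator):

```python
import numpy as np


def gradual_decay(baseline, day, total_days=90, max_shift=0.5, seed=0):
    """Blend a growing mean shift and noise into baseline samples as `day` advances.

    At day 0 the output equals the baseline; by `total_days` the mean has
    shifted by `max_shift` standard deviations (illustrative parameters).
    """
    rng = np.random.default_rng(seed)
    progress = min(day / total_days, 1.0)
    shift = max_shift * progress * np.std(baseline)
    noise = rng.normal(0, progress * np.std(baseline) * 0.2, size=len(baseline))
    return baseline + shift + noise
```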

Run a scenario using:
```bash
python demo_scenarios.py --scenario 2
```

📁 Project Structure

```
DriftGuard/
├── backend/               # Flask API & ML logic
│   ├── app.py             # Main application entry point
│   ├── drift_detection.py # Core math for PSI/KS/KL drift metrics
│   ├── data_generator.py  # Synthetic data generation
│   └── service.py         # Business logic layer
├── frontend/              # React dashboard
│   ├── src/
│   │   ├── components/    # Dashboard, Trends, Alerts, etc.
│   │   ├── App.tsx        # Main routing & layout
│   │   └── types.ts       # TypeScript definitions
│   └── tailwind.config.js
├── data/                  # Local CSV storage for demo
└── demo_scenarios.py      # CLI tool for drift simulation
```

🏁 Getting Started

Prerequisites

  • Python 3.9+
  • Node.js 16+

1. Backend Setup

```bash
cd backend

# Create and activate a virtual environment (optional)
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows

# Install dependencies
pip install flask flask-cors pandas numpy

# Run the API
python app.py
```

Server runs on http://localhost:5000

2. Frontend Setup

```bash
cd frontend

# Install dependencies
npm install

# Start the development server
npm run dev
```

Dashboard runs on http://localhost:5173


🤝 Contributing

Contributions to improve drift detection algorithms or add new visualization widgets are welcome.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/NewMetric)
  3. Commit your Changes (git commit -m 'Add KL Divergence metric')
  4. Push to the Branch (git push origin feature/NewMetric)
  5. Open a Pull Request

📄 License

Distributed under the MIT License. See LICENSE for more information.


👤 Maintainer

Shafayat Saad - MLOps Engineer
