A professional, AI-powered Credit Risk Assessment tool designed for financial institutions. This application leverages AutoGluon for state-of-the-art tabular prediction and provides a modern, glassmorphic user interface built with Streamlit.
- About the Project
- Key Features
- Tech Stack
- Model Performance
- Installation & Setup
- Usage
- Docker Support
- Project Structure
Credit Risk AI streamlines the loan approval process by predicting the probability of default based on client data. It splits the workflow into a robust FastAPI backend for inference and a polished Streamlit frontend for user interaction.
Note: This repository includes the code to train the models but does not include pre-trained models due to their large size. You must run the training script locally before launching the application.
- State-of-the-Art AI: Uses AutoGluon's `WeightedEnsemble_L3` to maximize predictive accuracy.
- Professional UI: A coherent finance-themed design with "Glassmorphism" aesthetics and dark mode.
- Real-time Risk Assessment: Instant probability calculations with visual gauge charts.
- PDF Reporting: Automatically generates downloadable PDF reports for each assessment.
- History & Analytics: Tracks past assessments and provides visual analytics dashboards.
- MLflow Integration: Tracks model training experiments and metrics (Production stage ready).
| Category | Technologies |
|---|---|
| Machine Learning | AutoGluon, Scikit-learn |
| Backend | FastAPI, Uvicorn |
| Frontend | Streamlit, Plotly |
| Experiment Tracking | MLflow |
| Monitoring | Prometheus, Grafana |
| Containerization | Docker, Docker Compose |
| Utilities | Pandas, NumPy, ReportLab (PDF) |
The model was trained on the `UCI_Syncora_Synthetic.csv` dataset. The following metrics were achieved by the best-performing model (`WeightedEnsemble_L3`):
| Metric | Score |
|---|---|
| Accuracy | 0.9254 |
| Precision | 0.8869 |
| Recall | 0.7544 |
| F1-Score | 0.8153 |
| ROC-AUC | 0.9528 |
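As a quick sanity check on the table above: the F1-score is the harmonic mean of precision and recall, and the reported values are internally consistent. A minimal pure-Python verification:

```python
# F1 is the harmonic mean of precision and recall:
#   F1 = 2 * P * R / (P + R)
precision = 0.8869
recall = 0.7544

f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # → 0.8153, matching the reported F1-Score
```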
Experience the modern glassmorphic interface designed for financial professionals.
Input client data for instant prediction
Visualize risk distribution and trends
Track past assessments and status
Real-time system health metrics
Download sample PDF reports generated by the system:
- COMPREHENSIVE RISK ASSESSMENT HISTORY - A complete portfolio overview including executive summary, risk distribution, and detailed logs of all past assessments.
- CREDIT RISK ASSESSMENT REPORT - A single-client detailed report with risk probability, expected loss calculation, and SHAP-based decision explanation.
1. Clone the Repository

   ```bash
   git clone https://github.com/your-username/CreditRiskAI.git
   cd CreditRiskAI
   ```

2. Install Dependencies

   ```bash
   pip install -r requirements.txt
   ```

3. Train the Model (Crucial Step)

   Since models are not stored in the repo, you must generate them locally:

   ```bash
   python src/credit_risk_autogluon.py
   ```

   This will create a `models/autogluon_models/` directory containing the trained artifacts.
To run the full application, you need to start both the backend API and the frontend interface.
1. Start the API Server
```bash
uvicorn src.api:app --host 127.0.0.1 --port 8000
```

The API will be available at http://127.0.0.1:8000
2. Start the Frontend App Open a new terminal and run:
```bash
streamlit run src/app.py
```

You can also run the entire application stack using Docker.
- Build and Run

  ```bash
  docker-compose up --build
  ```
This command will start:

- FastAPI API: http://localhost:8000
- Streamlit Frontend: http://localhost:8501
- Grafana Dashboard: http://localhost:3000
- MLflow Server: http://localhost:5000
- Prometheus: http://localhost:9090
Note: The local directory is mounted to the container, so code changes will be reflected immediately (hot-reload enabled).
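The actual `docker-compose.yml` in the repository defines these services; as a rough sketch (service name, paths, and command are assumptions based on the list above), the API service typically looks like:

```yaml
# Hypothetical excerpt - see docker-compose.yml in the repo for the real definition
services:
  api:
    build: .
    command: uvicorn src.api:app --host 0.0.0.0 --port 8000 --reload
    ports:
      - "8000:8000"
    volumes:
      - .:/app   # local mount enables the hot-reload described above
```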
```
CreditRiskAI/
├── src/                 # Python source code (API, App, Training)
├── data/                # Datasets and local storage
├── models/              # Trained model artifacts
├── config/              # Monitoring configurations (Prometheus)
├── tests/               # Unit and integration tests
├── outputs/             # Generated metrics and visualizations
├── Dockerfile           # Container definition
├── docker-compose.yml   # Multi-service orchestration
└── requirements.txt     # Python dependencies
```
This project is built with production-grade monitoring and experiment tracking.
MLflow serves as the central nervous system for our model development. Instead of wondering which parameters led to which result, MLflow provides a reproducible timeline of every experiment.
- Automated Logging: Every time you run the training script, MLflow captures the exact version of the code, data, and hyperparameters used.
- Comparative Analysis: You can visualize how the ROC-AUC improves across different runs or "presets".
- Model Governance: The registry allows us to tag specific models as "Production" or "Staging", enabling safe and structured deployment cycles.
Access the MLflow UI at http://localhost:5000 to explore the "Science" behind the predictions.
In a real-world scenario, a model's prediction is only useful if the service is up and healthy.
- Prometheus: Acts as a time-series database that "scrapes" our FastAPI service every few seconds. It tracks how many people are using the `/predict` endpoint and how long the AI takes to respond.
- Grafana: Transforms these raw numbers into a beautiful, high-level dashboard. It allows DevOps engineers to spot trends (like a sudden spike in high-risk predictions or API latency) before they become problems.
Access Prometheus at http://localhost:9090 and Grafana at http://localhost:3000 (default login: `admin` / `admin`).
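The scraping described above is driven by a Prometheus configuration (the real file lives under `config/`; the job name, interval, and target below are assumptions), which typically looks like:

```yaml
# Hypothetical excerpt - see the Prometheus config under config/ for the real file
scrape_configs:
  - job_name: "credit-risk-api"
    scrape_interval: 5s
    static_configs:
      - targets: ["api:8000"]   # FastAPI service exposing /metrics
```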
This project isn't just a simple script; it is a Full-Lifecycle AI System:
- AutoGluon handles the heavy lifting of model selection, ensuring we always have the most accurate ensemble.
- FastAPI & Streamlit provide a high-performance, modern interface that is familiar to both data scientists and end-users.
- MLflow ensures scientific rigour and reproducibility.
- Prometheus & Grafana provide the industrial-strength monitoring required for enterprise-grade deployments.
- Docker guarantees that "it works on my machine" translates to "it works everywhere".
To deliver or present this project, follow this 3-step sequence:
- Step 1: The Brain: Run the training script to evaluate the dataset and save the best model.
python src/credit_risk_autogluon.py
- Step 2: The Body: Use Docker Compose to spin up the API, Frontend, and 3 monitoring services.
docker-compose up --build
- Step 3: The Dashboard: Open your browser and explore the ecosystem:
- Predict & Interaction: Streamlit Frontend
- Experiment Tracking: MLflow UI
- Operational Health: Grafana Dashboards
- API Exploration: FastAPI Interactive Docs
Endpoint: `POST /predict`
Payload Example:
```json
{
  "LIMIT_BAL": 50000,
  "SEX": 1,
  "EDUCATION": 2,
  "MARRIAGE": 1,
  "AGE": 30,
  "PAY_0": 0,
  "PAY_2": 0,
  "PAY_3": 0,
  "PAY_4": 0,
  "PAY_5": 0,
  "PAY_6": 0,
  "BILL_AMT1": 5000,
  "BILL_AMT2": 5000,
  "BILL_AMT3": 5000,
  "BILL_AMT4": 5000,
  "BILL_AMT5": 5000,
  "BILL_AMT6": 5000,
  "PAY_AMT1": 2000,
  "PAY_AMT2": 2000,
  "PAY_AMT3": 2000,
  "PAY_AMT4": 2000,
  "PAY_AMT5": 2000,
  "PAY_AMT6": 2000
}
```
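With the API running locally, the payload above can be submitted from Python using only the standard library. The response schema is not shown here; check the interactive docs at `/docs` for the actual shape.

```python
import json
import urllib.request

payload = {
    "LIMIT_BAL": 50000, "SEX": 1, "EDUCATION": 2, "MARRIAGE": 1, "AGE": 30,
    "PAY_0": 0, "PAY_2": 0, "PAY_3": 0, "PAY_4": 0, "PAY_5": 0, "PAY_6": 0,
    "BILL_AMT1": 5000, "BILL_AMT2": 5000, "BILL_AMT3": 5000,
    "BILL_AMT4": 5000, "BILL_AMT5": 5000, "BILL_AMT6": 5000,
    "PAY_AMT1": 2000, "PAY_AMT2": 2000, "PAY_AMT3": 2000,
    "PAY_AMT4": 2000, "PAY_AMT5": 2000, "PAY_AMT6": 2000,
}

def assess(payload: dict, url: str = "http://127.0.0.1:8000/predict") -> dict:
    """POST the client record to the running API and return the parsed JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# result = assess(payload)  # requires the API server to be running
```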
