Loan Prediction Approval — MLOps Project

ENSAE Paris — Mise en production course

End-to-end MLOps pipeline for predicting loan approval:

data processing
hyperparameter tuning across three model families
MLflow experiment tracking
FastAPI deployment
Kubernetes orchestration on SSPCloud
GitOps automation via ArgoCD
Prometheus/Grafana monitoring
SHAP explanations
drift-triggered automatic retraining.

Live URLs

Service	URL pattern
Web UI	`https://loan-api-user-kbourbon.user.lab.sspcloud.fr`
Swagger UI	`https://loan-api-user-kbourbon.user.lab.sspcloud.fr/docs`
Prometheus metrics	`https://loan-api-user-kbourbon.user.lab.sspcloud.fr/metrics`
Grafana dashboard	`https://grafana-loan-user-kbourbon.user.lab.sspcloud.fr` (admin / admin)

Your username is the part of your SSPCloud namespace that follows the `user-` prefix (e.g. namespace user-johndoe → username johndoe).
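
As a small illustration, the URL patterns above can be derived from a namespace as follows. The helper name `service_urls` is hypothetical and not part of the repository; it only assumes the `user-<username>` namespace convention and the URL patterns listed in the table.

```python
def service_urls(namespace: str) -> dict:
    """Build the service URLs from an SSPCloud namespace such as 'user-johndoe'."""
    if not namespace.startswith("user-"):
        raise ValueError(f"expected a namespace like 'user-<username>', got {namespace!r}")
    api = f"https://loan-api-{namespace}.user.lab.sspcloud.fr"
    return {
        "username": namespace.removeprefix("user-"),  # text after the 'user-' prefix
        "web_ui": api,
        "swagger": f"{api}/docs",
        "metrics": f"{api}/metrics",
        "grafana": f"https://grafana-loan-{namespace}.user.lab.sspcloud.fr",
    }


urls = service_urls("user-johndoe")
print(urls["username"])  # johndoe
print(urls["swagger"])   # https://loan-api-user-johndoe.user.lab.sspcloud.fr/docs
```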

Reproduce from scratch

This section is for anyone who wants to clone the repo and get the exact same results.

Prerequisites

Tool	Version	Install
Python	3.13	https://www.python.org/downloads/
uv	latest	`curl -LsSf https://astral.sh/uv/install.sh \| sh`
Docker + Docker Compose	any recent	https://docs.docker.com/get-docker/
git	any	—
Access to an MLflow service on SSPCloud	—	—
Access to a MinIO S3 bucket on SSPCloud to store the data	—	—

Optional (Kubernetes deployment only):

kubectl configured against an SSPCloud cluster

Step 0 - Pre-requisite services

Start an MLflow service (on SSPCloud, for example) and note the following variables, which are shown when the service is created:

MLFLOW_TRACKING_USERNAME
MLFLOW_TRACKING_PASSWORD
MLFLOW_TRACKING_URI

MLFLOW_TRACKING_URI is the HTTP link displayed when the service is created.

Step 1 — Clone and install

git clone https://github.com/kellybourbon2/Loan-prediction-approval.git
cd Loan-prediction-approval
uv sync                        # installs exact locked dependencies (uv.lock)
uv run pre-commit install      # enables ruff lint+format on every commit

uv sync reads uv.lock — every dependency is pinned, so you get the identical environment.

Step 2 — Dataset

The model trains on the Kaggle Playground Series S4E10 — Loan Approval Prediction dataset.

Download train.csv from https://www.kaggle.com/competitions/playground-series-s4e10/data
Upload it to your S3 bucket at the root: s3://username/<your-bucket>/train.csv

The data loader reads it directly from S3 at training time — no local copy needed.

Step 3 — Environment variables

cp .env.example .env

Edit .env with your credentials:

#S3 setting
AWS_ACCESS_KEY_ID=<your_key>
AWS_SECRET_ACCESS_KEY=<your_secret>
AWS_SESSION_TOKEN=<your_token>         # leave empty if not using SSPCloud temp tokens
AWS_S3_ENDPOINT=minio.lab.sspcloud.fr
AWS_BUCKET_NAME=<your_data_bucket>          # the path to the bucket where train.csv is stored 

#mlflow setting
MLFLOW_TRACKING_USERNAME=<your_mlflow_username>
MLFLOW_TRACKING_URI=<your_mlflow_tracking_uri>
MLFLOW_TRACKING_PASSWORD=<your_mlflow_password>

For the MLflow variables, use the values you copied in Step 0.

These variables are loaded automatically by data_load.py via python-dotenv.
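
Before training, it can help to check that nothing is missing from the environment. The sketch below is not taken from the repository: `REQUIRED_VARS` and `missing_vars` are hypothetical names, and `AWS_SESSION_TOKEN` is left out on the assumption that it is optional, as the comment in the `.env` template suggests. If you rely on python-dotenv, call its `load_dotenv()` first so that `.env` is read into `os.environ`.

```python
import os
from typing import List, Mapping, Optional

# Variables listed in the .env template above (session token assumed optional).
REQUIRED_VARS = [
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_S3_ENDPOINT",
    "AWS_BUCKET_NAME",
    "MLFLOW_TRACKING_USERNAME",
    "MLFLOW_TRACKING_PASSWORD",
    "MLFLOW_TRACKING_URI",
]


def missing_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
example_env = {name: "placeholder" for name in REQUIRED_VARS}
example_env["MLFLOW_TRACKING_URI"] = ""  # simulate a value left blank
print(missing_vars(example_env))  # ['MLFLOW_TRACKING_URI']
```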

Step 4 — Train the model

uv run python src/main.py

What happens:

Step	Detail
Load	`train.csv` downloaded from S3
Preprocess	`DataPreprocessor`: drop unused columns, bin age, binary-encode credit default, StandardScaler + OneHotEncoder
Split	3-way: 80% training / 10% calibration / 10% evaluation — no leakage between the three
Tune	Hyperopt TPE search (`MAX_EVALS=10`) over XGBoost, CatBoost, RandomForest simultaneously — best model wins
Train	Best model retrained on full training set
Calibrate	`CalibratedClassifierCV(method='isotonic', cv='prefit')` fitted on the calibration split — well-calibrated probabilities
Evaluate	Accuracy, F1, Recall, Precision + confusion matrix on the eval split (never seen before)
Log	All metrics, params, confusion matrix PNG artifact → MLflow experiment `Loan Prediction Approval Experiments`
Register	Model registered in MLflow Registry as `@challenger`
Promote	Promoted to `@champion` only if F1 ≥ 0.5 and F1 > current champion (regression guard)
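
The promotion rule in the last row can be sketched as a small predicate. This is an illustration of the stated rule, not the code in `registry.py`: the function name `should_promote` is hypothetical, and the behaviour when no champion exists yet (promote if the threshold is met) is an assumption.

```python
from typing import Optional

F1_PROMOTION_THRESHOLD = 0.5  # default documented for config.py


def should_promote(
    challenger_f1: float,
    champion_f1: Optional[float],
    threshold: float = F1_PROMOTION_THRESHOLD,
) -> bool:
    """Promote only if F1 >= threshold and F1 is strictly better than the champion."""
    if challenger_f1 < threshold:
        return False
    if champion_f1 is None:
        # Assumption: with no current champion, meeting the threshold is enough.
        return True
    return challenger_f1 > champion_f1  # regression guard


print(should_promote(0.80, 0.75))  # True
print(should_promote(0.80, 0.85))  # False
print(should_promote(0.40, None))  # False
```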

To inspect runs after training, open the link stored in your MLFLOW_TRACKING_URI variable.

All metrics are listed under Model Training > "Loan Prediction Approval Experiments".

Step 5 — Run the API locally

The API loads the @champion model from MLflow at startup.

uv run uvicorn src.api.app:app

By default, the API is served on port 8000 of your local machine. You can view the app by opening the following link: http://127.0.0.1:8000

You can also query the model directly. To do so, open a new bash terminal (without closing the previous one) and paste:

curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{
    "person_age": 30,
    "person_income": 60000,
    "person_home_ownership": "RENT",
    "person_emp_length": 5.0,
    "loan_intent": "PERSONAL",
    "loan_amnt": 10000,
    "loan_percent_income": 0.17,
    "cb_person_default_on_file": "N",
    "cb_person_cred_hist_length": 4
  }'
# → {"loan_status":1,"approved":true,"probability":0.9733}

Once you are done querying the API, you can stop the application by pressing "Ctrl + C" in the terminal where uvicorn is running.
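
While the API is running, the same `/predict` request can also be sent from Python. The sketch below uses only the standard library; the helper names `build_predict_request` and `send` are hypothetical, and the payload is the one from the curl example. `send` is defined but not called, since it needs the local API to be up.

```python
import json
import urllib.request

# Same payload as the curl example above.
application = {
    "person_age": 30,
    "person_income": 60000,
    "person_home_ownership": "RENT",
    "person_emp_length": 5.0,
    "loan_intent": "PERSONAL",
    "loan_amnt": 10000,
    "loan_percent_income": 0.17,
    "cb_person_default_on_file": "N",
    "cb_person_cred_hist_length": 4,
}


def build_predict_request(base_url: str, payload: dict) -> urllib.request.Request:
    """Prepare a JSON POST request for the /predict endpoint."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response (requires a running API)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = build_predict_request("http://localhost:8000", application)
print(request.get_method(), request.full_url)  # POST http://localhost:8000/predict
```

With uvicorn running, `send(request)` should return a dictionary shaped like the response shown after the curl command.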

Step 6 — Run the full stack locally (API + Prometheus + Grafana)

The docker-compose.yml manifest runs the full stack (API + Prometheus + Grafana) locally. It pulls the Prometheus and Grafana images and builds the local API image, creating three containers in which the API, Prometheus and Grafana run independently.

Since Docker is not available on SSPCloud, open a local VSCode on a machine with Docker installed and run:

docker compose up

Open the following links to visualise each service:

Service	URL	Credentials
API	http://localhost:8000	—
Prometheus	http://localhost:9090	—
Grafana	http://localhost:3000	admin / admin

Step 7 — Run the tests

Once the API is running, you can run the following integration tests in another terminal (while the API is still running):

INTEGRATION_API_URL=http://localhost:8000 \
  uv run pytest unit_tests/test_integration.py -v

File	Tests	What is covered
`test_preprocessing.py`	6	`DataPreprocessor`: clean, feature engineering, split, encoding
`test_api.py`	14	`/predict`, `/predict/batch`, `/explain` — mocked model
`test_integration.py`	8	Real HTTP calls — health, predict, batch, explain, metrics, no traceback leak

Repository structure

├── src/
│   ├── api/
│   │   ├── app.py            # FastAPI app (predict, batch, explain, health, metrics)
│   │   ├── schemas.py        # Pydantic input/output schemas
│   │   ├── metrics.py        # Prometheus metrics definitions
│   │   └── logger.py         # Structured prediction logger → S3 sync
│   ├── model/
│   │   ├── train.py          # Model training wrapper
│   │   ├── tune.py           # Hyperopt objective + model builder (with early stopping)
│   │   ├── evaluate.py       # Metrics + confusion matrix → MLflow
│   │   ├── registry.py       # MLflow registry: register, promote, load champion
│   │   └── search_space.py   # Hyperopt search space (XGBoost, CatBoost, RF)
│   ├── data_processing/
│   │   ├── preprocessing.py  # DataPreprocessor (clean → engineer → scale → encode)
│   │   └── data_load.py      # S3 data loading via s3fs
│   ├── main.py               # Full training entrypoint
│   └── drift_analysis.py     # KS test + PSI drift detection
├── .github/workflows/
│   ├── ci.yml                # Ruff + unit tests + integration tests + build and push api image
│   ├── retrain.yml           # Manual/scheduled retraining (every Monday 2am UTC) --> triggers ci 
│   └── drift_check.yml       # Daily drift check → triggers retrain if drift detected
│
├── monitoring/
│   └── grafana/
│       ├── dashboards/       # Dashboard JSON (auto-provisioned)
│       └── provisioning/     # Datasources, dashboards, alerting rules
├── unit_tests/
├── Dockerfile
├── docker-compose.yml
├── pyproject.toml            # Python project + ruff config
├── uv.lock                   # Pinned dependency lockfile
└── config.py                 # All training constants (CV_FOLDS, MAX_EVALS, thresholds…)

API reference

Method	Path	Description
`GET`	`/` or `/ui/`	Web UI — loan assessment form
`GET`	`/health`	Returns 200 if model loaded, 503 otherwise
`POST`	`/predict`	Single loan prediction
`POST`	`/predict/batch`	Batch prediction (max 500 per request)
`POST`	`/explain`	SHAP feature contributions for one application
`GET`	`/metrics`	Prometheus metrics endpoint
`GET`	`/docs`	Swagger UI
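
Because `/predict/batch` accepts at most 500 applications per request, a client with a larger list has to split it first. A minimal sketch of that splitting, assuming nothing about the batch request schema itself (the helper name `chunk` is hypothetical):

```python
from typing import Iterator, List, Sequence

MAX_BATCH_SIZE = 500  # limit stated in the API reference above


def chunk(applications: Sequence[dict], size: int = MAX_BATCH_SIZE) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` applications."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(applications), size):
        yield list(applications[start:start + size])


# 1200 placeholder applications are split into three requests.
applications = [{"loan_amnt": 1000 + i} for i in range(1200)]
batches = list(chunk(applications))
print([len(batch) for batch in batches])  # [500, 500, 200]
```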

SHAP explanations

SHAP values are computed without the external shap library (incompatible with Python 3.13):

XGBoost: get_booster().predict(dmat, pred_contribs=True)
CatBoost: get_feature_importance(type="ShapValues", data=pool)
RandomForest: global feature importances weighted by prediction deviation

Positive SHAP values push toward approval, negative toward rejection.
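
For XGBoost, `pred_contribs=True` returns one contribution per feature plus a final bias column, and each row sums to the raw margin of the prediction. The sketch below shows how such a row can be turned into a ranked explanation. It is not the code behind `/explain`: the function name, the feature subset and the numbers are made up, and converting the margin with a sigmoid assumes a binary logistic objective. The probability returned by the API may differ because the model is calibrated afterwards.

```python
import math
from typing import List, Sequence, Tuple


def summarize_contributions(
    feature_names: Sequence[str], contribs: Sequence[float]
) -> Tuple[List[Tuple[str, float]], float]:
    """Rank per-feature contributions and convert the summed margin to a probability.

    `contribs` holds one value per feature followed by the bias term.
    """
    if len(contribs) != len(feature_names) + 1:
        raise ValueError("expected one contribution per feature plus a bias term")
    *values, bias = contribs
    margin = sum(values) + bias                       # raw score in log-odds
    probability = 1.0 / (1.0 + math.exp(-margin))     # sigmoid, assumes logistic objective
    ranked = sorted(zip(feature_names, values), key=lambda pair: abs(pair[1]), reverse=True)
    return ranked, probability


names = ["loan_percent_income", "person_income", "person_emp_length"]
row = [-0.8, 0.5, 0.1, 0.2]  # made-up contributions; last entry is the bias
ranked, probability = summarize_contributions(names, row)
print(ranked[0])              # ('loan_percent_income', -0.8)
print(round(probability, 3))  # 0.5
```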

CI/CD

Workflow	Trigger	What it does
`ci.yml`	Push touching `src/`, `Dockerfile`, `pyproject.toml`, `uv.lock` on main branch	Run unit tests + build Docker image + push to Docker Hub
`retrain.yml`	Manual or every Monday 2am UTC	Full retraining of the model (runs src/main.py) + MLflow registry update
`drift_check.yml`	Daily 8am UTC	Download `predictions.jsonl` from S3 → KS + PSI analysis → trigger `retrain.yml` if drift detected
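
To make the two drift statistics concrete, here is a standard-library sketch of PSI and the two-sample KS statistic. It is not the implementation in `drift_analysis.py`: the binning scheme, the `eps` floor for empty bins and the function names are assumptions. PSI is computed as the sum over bins of (actual share − expected share) × ln(actual share / expected share); the KS statistic is the largest gap between the two empirical distribution functions.

```python
import math
from bisect import bisect_right
from typing import List, Sequence


def psi(expected: Sequence[float], actual: Sequence[float], bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index using equal-width bins fitted on `expected`."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant data

    def shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for value in values:
            index = int((value - lo) / width)
            counts[min(max(index, 0), bins - 1)] += 1  # clip outliers into edge bins
        return [max(count / len(values), eps) for count in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def ks_statistic(sample_a: Sequence[float], sample_b: Sequence[float]) -> float:
    """Largest absolute difference between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b)) for x in a + b
    )


reference = [float(i) for i in range(100)]   # stand-in for training-time values
shifted = [x + 30.0 for x in reference]      # stand-in for drifted production values
print(psi(reference, reference))             # 0.0
print(psi(reference, shifted) > 0.25)        # True
print(round(ks_statistic(reference, shifted), 2))  # 0.3
```

A PSI above roughly 0.25 is a common rule of thumb for significant drift; the threshold actually used by the workflow is not stated here.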

Note: the CD workflows (cd, retrain and drift_check) are run by a GitHub Actions bot using the automatically generated GITHUB_TOKEN. We chose this approach to keep deployments durable: even if a user account is removed from GitHub, deployments are still handled by the bot.

Required GitHub Actions configuration

Go to Settings → Secrets and variables → Actions and add:

Name	Type	Value
`DOCKERHUB_TOKEN`	Secret	Docker Hub access token
`AWS_ACCESS_KEY_ID`	Secret	S3 credentials (for retrain + drift check)
`AWS_SECRET_ACCESS_KEY`	Secret	—
`AWS_SESSION_TOKEN`	Secret	—
`AWS_S3_ENDPOINT`	Secret	e.g. `minio.lab.sspcloud.fr`
`AWS_BUCKET_NAME`	Secret	—
`DOCKERHUB_USERNAME`	Variable	Docker Hub username
`API_URL`	Variable	Deployed API base URL — enables integration tests and post-deploy healthcheck

Make sure you have created a DOCKERHUB_TOKEN with "Read" scope, and that your Docker image is public.

Kubernetes deployment

The deployment of the application is handled by a separate GitOps repository: https://github.com/kellybourbon2/Loan-prediction-approval-deployment

If you want to recreate the Kubernetes cluster from scratch, download the deployment/ folder of that repository and follow the steps below.

Warning: You cannot orchestrate the Kubernetes cluster unless you chose the "Admin" role when creating your SSPCloud VSCode service.

The goal here is to create three Kubernetes pods so that the API can be reached from any machine:

one pod running the official Prometheus image
one pod running the official Grafana image
one pod running our loan-api image, which was built and pushed to Docker Hub earlier.

1. Create a secret YAML manifest at the root of the project:

cp secret.example.yaml secret.yaml

Edit secret.yaml with your credentials. These are the same credentials you entered in your .env file earlier. The secret will be named "loan-api-secret".

2. Apply this secret to your Kubernetes cluster:

kubectl apply -f ./secret.yaml

3. Adapt the Kubernetes manifests in the "deployment" folder by replacing every occurrence of user-kbourbon with your own Kubernetes username. In the deployment.yaml file, also replace "kellybrbn/loan-api" with your own Docker image path.

> Note that you can find your Kubernetes username in your environment variables by running:
```bash
env | grep ^KUBERNETES_NAMESPACE
```

4. Apply the YAML manifests to the Kubernetes cluster:

```bash
kubectl apply -f deployment/
```

You can monitor the pods by running:

kubectl get pods -w

If everything goes well, you should see three pods: one for loan-api, one for prometheus and one for grafana. Once all three pods show the status "Running 1/1", they are ready and the application should be exposed at https://loan-api-kubernetes-username.user.lab.sspcloud.fr (replace kubernetes-username with your own Kubernetes username).

Architecture of CI/CD

flowchart LR
    A[src/ change] -->|push to main| B[CI workflow]
    B --> C[lint + tests]
    C --> D[build Docker image]
    D -->|push sha tag| E[Docker Hub]
    D -->|PAT push| F[GitOps repo\ndeployment.yaml]
    F -->|detects change| G[ArgoCD]
    G -->|sync| H[Kubernetes cluster\nSSP Cloud]

    H -->|prediction logs| I[(S3 + MLflow)]
    I -->|daily| J[Drift check]
    J -->|drift detected| K[Retrain]
    K -->|triggers| B

Monitoring

Prometheus metrics

Metric	Type	Description
`loan_predictions_total{result}`	Counter	Approved / rejected counts
`loan_prediction_probability`	Histogram	Distribution of approval probabilities
`loan_prediction_errors_total`	Counter	Prediction errors
`loan_approval_rate`	Gauge	Rolling approval rate (last 100 predictions)
`loan_request_income`	Histogram	Applicant income (drift monitoring)
`loan_request_amount`	Histogram	Loan amount (drift monitoring)
`loan_request_lti_ratio`	Histogram	Loan-to-income ratio (drift monitoring)
`loan_batch_size`	Histogram	Batch request sizes
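
The rolling value behind `loan_approval_rate` can be pictured as a fixed-size window over the most recent outcomes. The sketch below is an assumption about the general technique, not the code in `metrics.py`; the class name is hypothetical, and in the real service the returned value would presumably be written to the Prometheus gauge after each prediction.

```python
from collections import deque


class RollingApprovalRate:
    """Approval rate over the last `window` predictions."""

    def __init__(self, window: int = 100) -> None:
        self._outcomes = deque(maxlen=window)  # oldest outcomes fall out automatically

    def update(self, approved: bool) -> float:
        """Record one prediction outcome and return the current rate."""
        self._outcomes.append(bool(approved))
        return sum(self._outcomes) / len(self._outcomes)


tracker = RollingApprovalRate(window=100)
for i in range(150):
    rate = tracker.update(i >= 100)  # first 100 rejected, next 50 approved
print(rate)  # 0.5
```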

Grafana alerts (auto-provisioned)

Alert	Condition	Severity
High Prediction Error Rate	> 5 errors in 5 min	Critical
Abnormally Low Approval Rate	< 10% for 5 min	Warning
High API Latency	p95 > 2s for 3 min	Warning

Configuration reference

All constants are in config.py:

Variable	Default	Description
`CV_FOLDS`	5	Stratified K-Fold folds during hyperparameter search
`MAX_EVALS`	10	Hyperopt iterations (increase for better results, slower training)
`RANDOM_STATE`	42	Seed for all random operations — guarantees reproducibility
`TEST_SIZE`	0.2	Holdout fraction (split into calibration + eval)
`F1_PROMOTION_THRESHOLD`	0.5	Minimum F1 required to promote a challenger to @champion
`MLFLOW_MODEL_NAME`	`loan-approval-model`	Model name in the MLflow Registry
`MLFLOW_MODEL_NAME`	`Loan Prediction Approval Experiments`	Name of the MLflow experiment
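
To show how `TEST_SIZE` and `RANDOM_STATE` relate to the 80 / 10 / 10 split described in Step 4, here is a standard-library sketch. It is an illustration only: the function name is hypothetical, the split here is a plain shuffle rather than whatever stratification the project applies, and halving the holdout into calibration and evaluation parts is inferred from the table above.

```python
import random
from typing import List, Tuple

RANDOM_STATE = 42  # default documented for config.py
TEST_SIZE = 0.2    # holdout fraction, later halved


def three_way_split(
    n_rows: int, test_size: float = TEST_SIZE, seed: int = RANDOM_STATE
) -> Tuple[List[int], List[int], List[int]]:
    """Return disjoint row indices for training, calibration and evaluation."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)     # seeded, so the split is repeatable
    n_holdout = round(n_rows * test_size)
    n_calibration = n_holdout // 2
    holdout, train = indices[:n_holdout], indices[n_holdout:]
    return train, holdout[:n_calibration], holdout[n_calibration:]


train, calibration, evaluation = three_way_split(1000)
print(len(train), len(calibration), len(evaluation))  # 800 100 100
```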
