malek-harbaoui/Loan-Approval-Predictor

🏦 Loan Approval Predictor

A professional Machine Learning system designed to predict bank loan approval decisions, featuring an interactive web interface built with Streamlit.


🎯 Project Objective

The goal of this project is to build an intelligent system capable of automatically predicting whether a loan application will be approved or rejected, based on:

  • Demographic information (age)
  • Financial data (annual income, requested loan amount, monthly deductions)
  • Product type and decision process
  • Historical banking decisions

This project simulates a real-world retail banking use case with production-oriented practices.


✨ Key Features

🤖 Machine Learning

  • 6 machine learning algorithms implemented and compared
    • Decision Tree
    • Random Forest
    • Extra Trees
    • XGBoost
    • LightGBM
    • CatBoost
  • Automated hyperparameter optimization
  • Stratified cross-validation
  • Comprehensive evaluation metrics:
    • Accuracy
    • Precision
    • Recall
    • F1-score
    • AUC-ROC
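
As a rough illustration of how stratified cross-validation and the metrics above fit together, here is a minimal sketch using scikit-learn. The synthetic dataset, the `DecisionTreeClassifier` settings, and the fold count are assumptions chosen for the example; they are not taken from the project's training scripts.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the loan data (not the project's dataset).
X, y = make_classification(
    n_samples=500, n_features=8, weights=[0.7, 0.3], random_state=42
)

# Stratified folds keep the class ratio roughly equal in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

scoring = ["accuracy", "precision", "recall", "f1", "roc_auc"]
scores = cross_validate(
    DecisionTreeClassifier(max_depth=4, random_state=42),
    X,
    y,
    cv=cv,
    scoring=scoring,
)

# Mean score per metric across the five folds.
mean_scores = {name: float(scores[f"test_{name}"].mean()) for name in scoring}
for name, value in mean_scores.items():
    print(f"{name:>9}: {value:.3f}")
```

Any of the six listed algorithms could presumably be dropped in place of the decision tree, as long as it exposes the scikit-learn estimator interface.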

📊 Data Analysis & Feature Engineering

  • Automated Exploratory Data Analysis (EDA)
  • Advanced visualizations (20+ charts)
  • Feature engineering:
    • Financial ratios
    • Feature interactions
  • Outlier detection and data cleaning
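
The README does not say which ratios or outlier rule the project uses, so the following is only a sketch of what such a step might look like. The column names come from the feature list in this README; the sample values, the two ratio definitions (`loan_to_income`, `deduction_to_income`), and the 1.5 × IQR rule are assumptions made for illustration.

```python
from statistics import quantiles

# Hypothetical sample rows: column names follow the README, values are made up.
applications = [
    {"Revenus Annuel": 24000.0, "Montant Sollicité": 12000.0, "Retenus Mensuel": 300.0},
    {"Revenus Annuel": 30000.0, "Montant Sollicité": 9000.0, "Retenus Mensuel": 250.0},
    {"Revenus Annuel": 36000.0, "Montant Sollicité": 20000.0, "Retenus Mensuel": 400.0},
    {"Revenus Annuel": 42000.0, "Montant Sollicité": 15000.0, "Retenus Mensuel": 500.0},
    {"Revenus Annuel": 48000.0, "Montant Sollicité": 30000.0, "Retenus Mensuel": 600.0},
    {"Revenus Annuel": 400000.0, "Montant Sollicité": 25000.0, "Retenus Mensuel": 700.0},
]


def add_financial_ratios(row):
    """Return a copy of the row with two illustrative ratio features added."""
    income = row["Revenus Annuel"]
    enriched = dict(row)
    # Requested amount relative to yearly income.
    enriched["loan_to_income"] = row["Montant Sollicité"] / income if income else None
    # Share of yearly income already committed to monthly deductions.
    enriched["deduction_to_income"] = (
        12 * row["Retenus Mensuel"] / income if income else None
    )
    return enriched


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences at k times the interquartile range."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


enriched = [add_financial_ratios(row) for row in applications]
low, high = iqr_bounds([row["Revenus Annuel"] for row in enriched])
outliers = [row for row in enriched if not low <= row["Revenus Annuel"] <= high]
print(f"{len(outliers)} income outlier(s) outside [{low:.0f}, {high:.0f}]")
```

Whether flagged rows should be dropped, capped, or kept is a modelling decision; this sketch only identifies them.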

🌐 Web Application (Streamlit)

  • Interactive dashboard for data exploration
  • Real-time loan approval prediction
  • Dynamic visualizations using Plotly
  • One-click model comparison

🛠️ Software Engineering

  • Modular and maintainable architecture
  • Unit testing with pytest
  • Fully documented codebase
  • Optimized VS Code configuration

📦 Installation

Prerequisites

  • Python 3.12+
  • pip
  • Git (optional)

Quick Setup

# 1. Clone the repository
git clone https://github.com/malek-harbaoui/Loan-Approval-Predictor.git
cd Loan-Approval-Predictor

# 2. Create a virtual environment
python -m venv venv

# 3. Activate the environment
# Windows
venv\Scripts\activate
# Linux / macOS
source venv/bin/activate

# 4. Install dependencies
pip install -r requirements.txt

# 5. Verify installation
python -c "import pandas, sklearn, xgboost; print('Installation OK')"

🚀 Usage

📊 Full ML Pipeline

# 1. Exploratory Data Analysis
python scripts/run_eda.py

# 2. Data preprocessing & feature engineering
python scripts/run_preprocessing.py

# 3. Train machine learning models
python scripts/train_models.py

# 4. Generate performance reports
python scripts/generate_report.py

🌐 Launch the Web Application

streamlit run app.py

Then open your browser at: 👉 http://localhost:8501

🎯 Programmatic Usage

from src.data.data_loader import DataLoader
from src.models.boosting_models import XGBoostModel
from sklearn.model_selection import train_test_split

# Load data
loader = DataLoader()
df = loader.load_retail_data()

# Prepare features and target
X = df.drop("Décision Finale Binaire", axis=1)
y = df["Décision Finale Binaire"]

# Stratify so both splits keep the same approved/rejected ratio
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Train model
model = XGBoostModel()
model.build_model(n_estimators=200, learning_rate=0.3)
model.fit(X_train, y_train)

# Predict
predictions = model.predict(X_test)
probabilities = model.predict_proba(X_test)
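
If `predict_proba` follows the scikit-learn convention of returning one `[p_class_0, p_class_1]` row per sample (an assumption about the project's `XGBoostModel` wrapper, which this README does not document), the probabilities can be turned into decisions at a custom threshold. The `decide` helper and the meaning of class 1 as "approved" are hypothetical.

```python
def decide(probabilities, threshold=0.5):
    """Map each [p_class_0, p_class_1] row to 1 (approve) or 0 (reject)."""
    return [1 if row[1] >= threshold else 0 for row in probabilities]


# Made-up probability rows standing in for model.predict_proba(X_test).
sample = [[0.9, 0.1], [0.45, 0.55], [0.2, 0.8]]

default_decisions = decide(sample)       # threshold 0.5
strict_decisions = decide(sample, 0.7)   # approve only when fairly confident
print(default_decisions, strict_decisions)
```

Raising the threshold trades recall for precision, which may matter when a wrongly approved loan costs more than a wrongly rejected one.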

📁 Project Structure

Loan_Approval_Predictor/
│
├── app.py                    # 🌐 Streamlit application
├── requirements.txt          # 📦 Dependencies
├── config.yaml               # ⚙️ Global configuration
├── README.md                 # 📖 Documentation
│
├── data/
│   ├── raw/                  # Raw datasets
│   └── processed/            # Cleaned datasets
│
├── src/
│   ├── data/                 # Data processing
│   ├── models/               # ML models
│   ├── visualization/        # Charts & plots
│   └── evaluation/           # Metrics & evaluation
│
├── scripts/                  # Execution scripts
├── notebooks/                # Jupyter notebooks
├── tests/                    # Unit tests
├── models/                   # Saved models
└── reports/                  # Results & figures


📊 Expected Results

Model Performance

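| Model         | Accuracy | Precision | Recall | F1-Score | AUC   |
|---------------|----------|-----------|--------|----------|-------|
| XGBoost       | 0.924    | 0.918     | 0.931  | 0.924    | 0.957 |
| LightGBM      | 0.921    | 0.915     | 0.928  | 0.921    | 0.954 |
| CatBoost      | 0.918    | 0.912     | 0.925  | 0.918    | 0.951 |
| Random Forest | 0.915    | 0.909     | 0.922  | 0.915    | 0.948 |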

🔍 Most Important Features

  1. Retenus Mensuel - 35.2%
  2. Revenus Annuel - 28.7%
  3. Montant Sollicité - 18.4%
  4. Age - 9.3%
  5. Type CANEVAS - 4.8%

🎨 Streamlit Interface

Available Pages

  1. 🏠 Home

    • Dataset overview
    • Key statistics
    • Model performance indicators
  2. 📊 Data Exploration

    • Feature distributions
    • Bivariate analysis
    • Correlation matrix
    • Interactive visualizations
  3. 🎯 Prediction

    • User input form
    • Real-time loan approval prediction
    • Decision analysis
    • Explanation of influencing factors
  4. 📈 Model Results

    • Model performance comparison
    • Performance charts
    • Detailed evaluation metrics
    • Result visualizations

🛠️ Technologies Used

Core ML

  • scikit-learn
  • XGBoost
  • LightGBM
  • CatBoost

Data Processing

  • pandas
  • numpy

Visualization

  • matplotlib
  • seaborn
  • plotly

Web Application

  • Streamlit
  • Streamlit-Plotly

Development Tools

  • pytest
  • black
  • flake8

🔧 Global Configuration (config.yaml)

project:
  name: "Loan Approval Predictor"
  version: "1.0.0"
  random_seed: 42

data:
  test_size: 0.2
  target_column: "Décision Finale Binaire"

models:
  xgboost:
    n_estimators: 200
    learning_rate: 0.3
    max_depth: 4
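
The README does not show how the configuration is consumed. One plausible approach, sketched here with PyYAML (assumed to be installed; it is not listed under the technologies), is to parse the file and pass the `xgboost` block on as keyword arguments. The inline `CONFIG_TEXT` string mirrors part of the file above so the snippet runs without the repository.

```python
import yaml  # PyYAML: an assumption, not listed in this README

# Inline copy of part of config.yaml so the example is self-contained.
CONFIG_TEXT = """
project:
  random_seed: 42
data:
  test_size: 0.2
  target_column: "Décision Finale Binaire"
models:
  xgboost:
    n_estimators: 200
    learning_rate: 0.3
    max_depth: 4
"""

# safe_load builds plain dicts, lists, and scalars without executing arbitrary tags.
config = yaml.safe_load(CONFIG_TEXT)

xgb_params = config["models"]["xgboost"]
seed = config["project"]["random_seed"]
target_column = config["data"]["target_column"]
print(xgb_params, seed, target_column)
```

In the project itself the text would more likely come from `open("config.yaml", encoding="utf-8")` than from an inline string.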

🔐 Environment Variables

Create a .env file at the project root:

DATA_PATH=data/raw
MODEL_PATH=models
RANDOM_STATE=42
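
How the project reads these variables is not documented here. The sketch below shows one way to do it with the standard library only: parse simple `KEY=VALUE` lines, fall back to defaults, and let real environment variables take priority. The helper names `parse_env_lines` and `load_settings` are hypothetical.

```python
import os

# Defaults mirror the values shown in the .env example above.
DEFAULTS = {"DATA_PATH": "data/raw", "MODEL_PATH": "models", "RANDOM_STATE": "42"}


def parse_env_lines(lines):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def load_settings(env_file_lines=(), environ=None):
    """Merge defaults, .env values, and environment variables (highest priority)."""
    environ = os.environ if environ is None else environ
    merged = {**DEFAULTS, **parse_env_lines(env_file_lines)}
    for key in DEFAULTS:
        if key in environ:
            merged[key] = environ[key]
    # Environment values are strings, so convert the seed explicitly.
    merged["RANDOM_STATE"] = int(merged["RANDOM_STATE"])
    return merged


settings = load_settings(
    ["DATA_PATH=data/raw", "MODEL_PATH=models", "RANDOM_STATE=42"], environ={}
)
print(settings)
```

A dedicated package such as python-dotenv handles quoting and variable expansion that this minimal parser ignores.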

📊 Evaluation Metrics

The project uses several metrics to evaluate the models:

  • Accuracy - Overall correctness of predictions
  • Precision - Proportion of positive predictions that are correct
  • Recall - Ability to detect positive cases
  • F1-Score - Harmonic mean of Precision and Recall
  • AUC-ROC - Area under the ROC curve
  • Specificity - Ability to detect negative cases
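
To make the definitions concrete, the threshold-based metrics can be computed directly from confusion-matrix counts. This is a minimal sketch with made-up counts; `classification_metrics` is a hypothetical helper, not part of the project's `src/evaluation` module. AUC-ROC is left out because it needs predicted scores rather than counts.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute threshold-based metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    # Harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": specificity,
    }


# Made-up counts: 80 true approvals, 10 false approvals,
# 90 true rejections, 20 missed approvals.
metrics = classification_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name:>11}: {value:.3f}")
```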

🤝 Contributing

Contributions are welcome! To contribute to this project:

  1. Fork the repository
  2. Create a new branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

👥 Author

  • Malek Harbaoui - Main Developer

⭐ If this project helped you, please consider giving it a star!

About

ML system predicting loan approvals with 95%+ accuracy. Features 6 algorithms, Streamlit UI, and comprehensive data analysis.
