This repository hosts a Jupyter Notebook (credit-approval-ml-project.ipynb) that predicts credit card application approvals using machine learning.
- Project Description
- Dataset
- Prerequisites
- Installation
- Usage
- Notebook Structure
- Results Summary
- Contributing
This project classifies credit card applications as "good" or "bad" based on applicant information and credit history. It covers:
- Data Preprocessing: Handling missing values and feature engineering (e.g., Age, WorkingYears).
- Imbalanced Data Handling: Balancing classes using SMOTE.
- Modeling: Training a baseline Random Forest classifier.
- Hyperparameter Optimization: Tuning model parameters with Optuna.
- Evaluation: Measuring performance using Accuracy, Precision, Recall, and F1-score.
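The feature engineering step above can be sketched as follows. This is a minimal illustration, not the notebook's exact code: the column names `DAYS_BIRTH`, `DAYS_EMPLOYED`, and `OCCUPATION_TYPE`, the convention that day counts are negative offsets into the past, and the sample rows are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample rows. DAYS_BIRTH and DAYS_EMPLOYED are assumed to be
# day counts relative to the record date (negative = in the past).
applications = pd.DataFrame({
    "ID": [1, 2, 3],
    "DAYS_BIRTH": [-12005, -21474, -19110],
    "DAYS_EMPLOYED": [-4542, -1134, 365243],  # positive: assumed "not working" marker
    "OCCUPATION_TYPE": ["Managers", None, None],
})


def engineer_features(df: pd.DataFrame) -> pd.DataFrame:
    """Derive Age and WorkingYears and fill missing occupations."""
    out = df.copy()
    # Age in whole years (truncated, not rounded)
    out["Age"] = (-out["DAYS_BIRTH"] / 365.25).astype(int)
    # Positive day counts are treated as "no employment" and clipped to zero
    employed_days = (-out["DAYS_EMPLOYED"]).clip(lower=0)
    out["WorkingYears"] = (employed_days / 365.25).round(1)
    # One simple way to handle missing values: an explicit placeholder category
    out["OCCUPATION_TYPE"] = out["OCCUPATION_TYPE"].fillna("Unknown")
    return out.drop(columns=["DAYS_BIRTH", "DAYS_EMPLOYED"])


features = engineer_features(applications)
print(features)
```

Filling missing categories with a placeholder is only one option; dropping the column or imputing the most frequent value would be equally reasonable depending on how much data is missing.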
The dataset consists of two CSV files, which should be merged on the applicant ID:
- `application_record.csv`: Demographic and socio-economic features (e.g., `CODE_GENDER`, `AMT_INCOME_TOTAL`, `NAME_EDUCATION_TYPE`).
- `credit_record.csv`: Monthly repayment records, with the month of each record (`MONTHS_BALANCE`) and payment status codes.
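A minimal sketch of the merge, using small in-memory stand-ins for the two files. The key column name `ID`, the `STATUS` column name, and all sample values are assumptions for illustration; with the real data the frames would come from `pd.read_csv`.

```python
import pandas as pd

# Stand-ins for pd.read_csv("application_record.csv") and
# pd.read_csv("credit_record.csv"); the values are made up.
applications = pd.DataFrame({
    "ID": [100, 101, 102],
    "CODE_GENDER": ["F", "M", "F"],
    "AMT_INCOME_TOTAL": [112500.0, 270000.0, 90000.0],
})
credit = pd.DataFrame({
    "ID": [100, 100, 101, 103],
    "MONTHS_BALANCE": [0, -1, 0, 0],
    "STATUS": ["0", "C", "1", "X"],  # assumed name for the status-code column
})

# An inner join keeps only applicants that appear in both files.
# validate="one_to_many" raises if an applicant ID is duplicated on the left.
merged = applications.merge(credit, on="ID", how="inner", validate="one_to_many")
print(merged)
```

Because the credit file holds one row per applicant per month, the merged frame has several rows per applicant; it usually needs to be aggregated to one row per applicant before modeling.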
- Python 3.7+
- Jupyter Notebook or JupyterLab
- Required Python libraries:
  - pandas
  - numpy
  - scikit-learn
  - imbalanced-learn
  - matplotlib
  - seaborn
  - optuna
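To confirm that an environment meets these prerequisites before opening the notebook, a short check along these lines may help. It is a sketch using only the standard library; the helper name `missing_packages` is hypothetical.

```python
import sys
from importlib.util import find_spec

# pip package name -> import name. Two of them differ:
# scikit-learn imports as sklearn, imbalanced-learn as imblearn.
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "imbalanced-learn": "imblearn",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "optuna": "optuna",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if find_spec(module) is None]


python_ok = sys.version_info >= (3, 7)
missing = missing_packages()
print("Python 3.7+:", python_ok)
print("Missing packages:", ", ".join(missing) if missing else "none")
```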
- Clone the repository: `git clone https://github.com/<username>/credit-approval-ml-project.git`
- Install dependencies: `pip install -r requirements.txt`
- Launch Jupyter Notebook: `jupyter notebook`
- Open `credit-approval-ml-project.ipynb`.
- Execute the cells in order to follow data preprocessing, model training, and evaluation.
- Data Preprocessing
  - Load and merge CSV files
  - Analyze and clean missing values
  - Feature engineering (Age, WorkingYears)
- Imbalanced Data Handling
  - Encode categorical variables
  - Apply SMOTE for oversampling
- Modeling
  - Split data into training and testing sets
  - Train a baseline Random Forest model
  - Optimize hyperparameters with Optuna
- Evaluation
  - Compare models and report performance metrics
Below is the performance table for the Random Forest model optimized with Optuna:
| | precision | recall | f1-score | support |
|---|---|---|---|---|
| 0 | 0.97 | 0.98 | 0.98 | 85286 |
| 1 | 0.98 | 0.97 | 0.98 | 85110 |
| accuracy | | | 0.98 | 170396 |
| macro avg | 0.98 | 0.98 | 0.98 | 170396 |
| weighted avg | 0.98 | 0.98 | 0.98 | 170396 |
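For readers unfamiliar with the columns in this table, the sketch below shows how each figure is derived from confusion-matrix counts. The counts are hypothetical and chosen for easy arithmetic; they are not the notebook's results.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 for one class from its confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical binary confusion matrix (class 1 is the positive class)
tn, fp, fn, tp = 90, 10, 20, 80

# For class 0 the roles swap: its true positives are the tn count, etc.
per_class = {
    0: precision_recall_f1(tp=tn, fp=fn, fn=fp),
    1: precision_recall_f1(tp=tp, fp=fp, fn=fn),
}
support = {0: tn + fp, 1: tp + fn}  # number of true samples per class
total = sum(support.values())

accuracy = (tp + tn) / total
# Macro average: unweighted mean over classes
macro_f1 = sum(m[2] for m in per_class.values()) / len(per_class)
# Weighted average: mean over classes weighted by support
weighted_f1 = sum(per_class[c][2] * support[c] for c in per_class) / total

for label, (p, r, f) in per_class.items():
    print(f"class {label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
print(f"accuracy={accuracy:.2f} macro_f1={macro_f1:.2f} weighted_f1={weighted_f1:.2f}")
```

When the classes have nearly equal support, as in the table above, the macro and weighted averages come out almost identical.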
- Fork this repository.
- Create a new branch: `git checkout -b feature/your-feature`
- Commit your changes: `git commit -m "Add some feature"`
- Push to your branch: `git push origin feature/your-feature`
- Open a Pull Request.