selvet-elif/Credit_Approval_ML_Project

Credit Card Approval ML Project

This repository hosts a Jupyter Notebook (credit-approval-ml-project.ipynb) that predicts credit card application approvals using machine learning.

🔍 Project Description

This project classifies credit card applications as "good" or "bad" based on applicant information and credit history. It covers:

  • Data Preprocessing: Handling missing values and feature engineering (e.g., Age, WorkingYears).
  • Imbalanced Data Handling: Balancing classes using SMOTE.
  • Modeling: Training a baseline Random Forest classifier.
  • Hyperparameter Optimization: Tuning model parameters with Optuna.
  • Evaluation: Measuring performance using Accuracy, Precision, Recall, and F1-score.
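The feature engineering step above can be sketched as follows. This is a minimal example, not the notebook's code. It assumes that application_record.csv stores age and employment length as day offsets counted backwards from the application date (columns DAYS_BIRTH and DAYS_EMPLOYED, as in the public Kaggle version of this dataset), and that a positive DAYS_EMPLOYED marks an applicant who is not currently employed. The 365.25 divisor and the rounding are illustrative choices.

```python
import pandas as pd


def add_age_and_working_years(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with Age and WorkingYears columns, both in years."""
    out = df.copy()
    # DAYS_BIRTH is assumed to be a negative day offset, so negate it first.
    out["Age"] = (-out["DAYS_BIRTH"] / 365.25).round(1)
    # Positive DAYS_EMPLOYED is assumed to mean "not employed": clip to 0 days.
    employed_days = (-out["DAYS_EMPLOYED"]).clip(lower=0)
    out["WorkingYears"] = (employed_days / 365.25).round(1)
    return out


# Two made-up applicants: one employed, one flagged as not employed.
sample = pd.DataFrame(
    {"DAYS_BIRTH": [-14610, -7305], "DAYS_EMPLOYED": [-3652, 365243]}
)
features = add_age_and_working_years(sample)
print(features[["Age", "WorkingYears"]])
```

Clipping at zero keeps the "not employed" placeholder from turning into a large negative number of working years.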

📂 Dataset

The dataset consists of two CSV files, which should be merged on the applicant ID:

  1. application_record.csv
    • Demographic and socio-economic features (e.g., CODE_GENDER, AMT_INCOME_TOTAL, NAME_EDUCATION_TYPE).
  2. credit_record.csv
    • Monthly credit history per applicant: the month offset of each record (MONTHS_BALANCE) and its payment status code.
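A minimal sketch of the merge, using small in-memory frames in place of the real files (in practice each frame would come from pd.read_csv). The key column name ID and the status column name STATUS are assumptions based on the public Kaggle version of the dataset; the inner join is also an assumption, since the README does not say which join the notebook uses.

```python
import pandas as pd

# Tiny stand-ins for application_record.csv and credit_record.csv.
applications = pd.DataFrame(
    {
        "ID": [1, 2, 3],
        "CODE_GENDER": ["F", "M", "F"],
        "AMT_INCOME_TOTAL": [120000.0, 95000.0, 60000.0],
    }
)
credit = pd.DataFrame(
    {
        "ID": [1, 1, 2, 4],
        "MONTHS_BALANCE": [0, -1, 0, 0],
        "STATUS": ["0", "C", "1", "X"],
    }
)

# An inner join keeps only applicants present in both files and yields
# one row per (applicant, month) pair.
merged = applications.merge(credit, on="ID", how="inner")
print(merged)
```

In this toy data, applicant 3 has no credit history and ID 4 has no application record, so both drop out of the merged frame.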

⚙️ Prerequisites

  • Python 3.7+
  • Jupyter Notebook or JupyterLab
  • Required Python libraries:
    • pandas
    • numpy
    • scikit-learn
    • imbalanced-learn
    • matplotlib
    • seaborn
    • optuna
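To check which of the libraries above are already installed, something like the following can help. It is a small sketch using only the standard library; note that importlib.metadata requires Python 3.8 or newer, and the helper name installed_versions is made up for this example.

```python
from importlib import metadata  # standard library, Python 3.8+

# Distribution names as listed in the prerequisites above.
REQUIRED = [
    "pandas",
    "numpy",
    "scikit-learn",
    "imbalanced-learn",
    "matplotlib",
    "seaborn",
    "optuna",
]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(REQUIRED).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```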

🚀 Installation

  1. Clone the repository:
    git clone https://github.com/selvet-elif/Credit_Approval_ML_Project.git
  2. Install dependencies:
    pip install -r requirements.txt

▶️ Usage

  1. Launch Jupyter Notebook:
    jupyter notebook
  2. Open credit-approval-ml-project.ipynb.
  3. Execute cells in order to follow data preprocessing, model training, and evaluation.

📝 Notebook Structure

  1. Data Preprocessing
    • Load and merge CSV files
    • Analyze and clean missing values
    • Feature engineering (Age, WorkingYears)
  2. Imbalanced Data Handling
    • Encode categorical variables
    • Apply SMOTE for oversampling
  3. Modeling
    • Split data into training and testing sets
    • Train a baseline Random Forest model
    • Optimize hyperparameters with Optuna
  4. Evaluation
    • Compare models and report performance metrics

📊 Results Summary

Below is the performance table for the Random Forest model optimized with Optuna:

                  precision    recall  f1-score   support

               0       0.97      0.98      0.98     85286
               1       0.98      0.97      0.98     85110

        accuracy                           0.98    170396
       macro avg       0.98      0.98      0.98    170396
    weighted avg       0.98      0.98      0.98    170396
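For readers less familiar with the columns in this report: precision, recall, and F1 are computed per class from true-positive, false-positive, and false-negative counts, and support is the number of evaluated rows in each class (here 85286 + 85110 = 170396). The sketch below shows the formulas with hypothetical counts that are not taken from the notebook.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 for one class from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for a single class, for illustration only.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```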

🤝 Contributing

  1. Fork this repository.
  2. Create a new branch:
    git checkout -b feature/your-feature
  3. Commit your changes:
    git commit -m "Add some feature"
  4. Push to your branch:
    git push origin feature/your-feature
  5. Open a Pull Request.
