Skip to content

A machine learning project to detect fraudulent credit card transactions on an imbalanced dataset using models like Logistic Regression, SVM, and Random Forest, with a focus on high recall and precision for identifying fraud effectively.

Notifications You must be signed in to change notification settings

Gauri9977/Credit_Card_Fraud_Detection

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

4 Commits
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

Model Architecture or Output Sample

Credit Card Fraud Detection with Machine Learning

Fraudulent transactions cost billions annually. This project aims to accurately detect credit card fraud using machine learning techniques on a highly imbalanced dataset. With precision-driven modeling, robust evaluation, and insightful interpretation, this solution contributes toward safer digital transactions in the fintech world.


๐Ÿ” Overview

  • Dataset: Kaggle - Credit Card Fraud Detection
  • Goal: Identify fraudulent credit card transactions using supervised learning.
  • Total Records: 284,807 transactions, including 492 frauds (0.172%)
  • Core Focus: Tackling class imbalance, training robust models, and evaluating fraud detection effectiveness.

๐Ÿ“Œ Key Features

  • Thorough Exploratory Data Analysis (EDA)
  • Class imbalance handling using undersampling
  • Preprocessing pipeline for scalable feature transformation
  • Model building: Logistic Regression, Random Forest, and SVM
  • In-depth performance evaluation: Confusion Matrix, ROC Curve, AUC, Precision, Recall, F1-Score
  • Data visualization for insights and model interpretability

๐Ÿ› ๏ธ Tech Stack

Category Tools & Libraries
Language Python 3.x
Data Processing pandas, numpy
Visualization matplotlib, seaborn
ML Algorithms scikit-learn
Evaluation Tools classification_report, roc_auc_score

โš™๏ธ Workflow

1. Load and inspect dataset
2. Handle class imbalance (undersampling)
3. Preprocess data (scaling, feature analysis)
4. Train ML models (Logistic Regression, Random Forest, SVM)
5. Evaluate models using appropriate metrics
6. Visualize predictions and interpret model outputs

๐Ÿ“ˆ Performance Metrics

Metric Why It Matters
Accuracy Measures overall correctness
Precision Ensures fraud predictions are reliable
Recall Captures most actual frauds (critical metric)
F1-Score Balances precision and recall
ROC-AUC Measures discrimination capability at all thresholds

๐Ÿš€ Results Summary

  • High Recall achieved, critical for fraud detection
  • Applied undersampling to address class imbalance
  • Random Forest performed best in precision-recall balance
  • Clear, interpretable model evaluation with visual plots

๐Ÿ”ฎ Future Scope

  • Explore SMOTE/ADASYN for synthetic oversampling
  • Try advanced algorithms: XGBoost, LightGBM, CatBoost
  • Build and deploy a real-time detection system (API/Streamlit)
  • Design a live fraud alert dashboard for financial applications

๐Ÿ“‚ Dataset Reference

๐Ÿ“Ž Credit Card Fraud Detection Dataset on Kaggle


๐Ÿ’ฌ โ€œIn fraud detection, recall is king โ€” because missing a fraud costs more than a false alarm.โ€

About

A machine learning project to detect fraudulent credit card transactions on an imbalanced dataset using models like Logistic Regression, SVM, and Random Forest, with a focus on high recall and precision for identifying fraud effectively.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published