This application detects and prevents fraudulent activity in credit card rewards data. Using machine learning techniques, the system analyzes transaction data to identify patterns and anomalies indicative of fraud. It also includes components for training the model, serving predictions via an API, batch processing, and real-time processing with Kafka.
- System Architecture
- Project Structure
- Setup and Installation
- Model Training and Selection
- Usage
- Contributing
- License
Figure 1: System Workflow integrating Kafka, Dockerized models, and Flask APIs for real-time fraud detection.
Fraud-Detection-System/
├── api/
│ ├── app.py
│ ├── kafka_consumer.py
│ ├── model.pkl
│ ├── scaler.pkl
│ ├── requirements.txt
│ ├── Dockerfile.api
├── client/
│ ├── kafka_producer.py
│ ├── client_app.py
│ ├── requirements.txt
│ ├── new_transactions.csv
│ ├── Dockerfile.client
├── frontend/
│ ├── index.html
│ ├── styles.css
│ ├── script.js
│ ├── Dockerfile.frontend
├── dataset/
│ ├── creditcard_2023.csv
├── README.md
├── docker-compose.yml
├── model_training.py
├── requirements.training.txt
├── nginx.conf
git clone https://github.com/shipra-aeron/Fraud-Detection-System.git
cd Fraud-Detection-System
- Python 3.8+
- Docker and Docker Compose
- Kafka
For detailed steps to run the application locally without Docker, please refer to README_LOCAL.md.
For detailed steps to run the application using Docker, please refer to README_DOCKER.md.
For an in-depth explanation of the thought process behind model training and selection, please refer to MODEL_TRAINING_AND_SELECTION.md.
The Flask API provides several endpoints for making predictions and managing transactions.
POST /predict
- Description: Accepts transaction data and returns a prediction.
- Sample Request:
{ "transaction_id": "12345", "amount": 200.0, "time": 56789, "v1": 0.5, "v2": -1.2, "v3": 0.8 }
- Sample Response:
{ "prediction": "fraud", "confidence": 0.92 }
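The request and response above can be exercised from Python with only the standard library. This is a minimal sketch, not the project's own client: the URL `http://localhost:5000/predict` assumes the API runs on Flask's default development port, and the helper names `build_predict_request` and `predict` are hypothetical.

```python
import json
import urllib.request

# Assumption: the API is reachable on Flask's default development port.
API_URL = "http://localhost:5000/predict"


def build_predict_request(transaction, url=API_URL):
    """Build a POST request carrying one transaction as a JSON body."""
    body = json.dumps(transaction).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(transaction, url=API_URL, timeout=10):
    """Send the transaction to the API and return the decoded JSON response."""
    request = build_predict_request(transaction, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the API running, calling `predict(...)` with the sample request above should return a dictionary shaped like the sample response, with `prediction` and `confidence` keys.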
You can process batches of transactions by running the client_app.py script. This script reads transactions from new_transactions.csv and sends them to the API for predictions.
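The batch flow described above can be sketched as follows. This is an illustration under stated assumptions rather than the contents of client_app.py: the column name `transaction_id`, the numeric conversion of the remaining columns, and the helper names `read_transactions` and `process_batch` are all hypothetical. The `send` argument is any callable that submits one transaction to the API and returns its response.

```python
import csv


def read_transactions(csv_lines, id_field="transaction_id"):
    """Yield one dict per CSV row, converting non-ID fields to float when possible."""
    for row in csv.DictReader(csv_lines):
        record = {}
        for key, value in row.items():
            if key == id_field:
                record[key] = value  # keep identifiers as strings
                continue
            try:
                record[key] = float(value)
            except (TypeError, ValueError):
                record[key] = value  # leave non-numeric values untouched
        yield record


def process_batch(csv_lines, send):
    """Send every row through `send` and collect (transaction, result) pairs."""
    results = []
    for transaction in read_transactions(csv_lines):
        try:
            results.append((transaction, send(transaction)))
        except OSError as error:
            # Network failures (including urllib's URLError) are recorded
            # per row so that one bad request does not abort the whole batch.
            results.append((transaction, {"error": str(error)}))
    return results
```

For a real run, `csv_lines` would be an open file such as `open("new_transactions.csv", newline="")`, and `send` a function that posts to `/predict`.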
The system uses Kafka for real-time processing of transaction data. The Kafka producer (kafka_producer.py) sends messages to a Kafka topic, and the Kafka consumer (kafka_consumer.py) processes these messages and interacts with the API.
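The producer and consumer roles can be sketched as below. The README does not say which Kafka client library the project uses, so `kafka-python` is an assumption here, as are the topic name `transactions`, the broker address `localhost:9092`, and every helper name. Treat this as an outline of the message flow, not as the contents of kafka_producer.py or kafka_consumer.py.

```python
import json

# Assumptions: placeholder topic name and broker address.
TOPIC = "transactions"
BOOTSTRAP_SERVERS = "localhost:9092"


def serialize(transaction):
    """Encode a transaction dict as UTF-8 JSON bytes for Kafka."""
    return json.dumps(transaction).encode("utf-8")


def deserialize(raw):
    """Decode UTF-8 JSON bytes from Kafka back into a dict."""
    return json.loads(raw.decode("utf-8"))


def make_producer(bootstrap_servers=BOOTSTRAP_SERVERS):
    """Create a kafka-python producer that JSON-encodes message values."""
    # Imported lazily so the helpers above work without kafka-python installed.
    from kafka import KafkaProducer
    return KafkaProducer(
        bootstrap_servers=bootstrap_servers,
        value_serializer=serialize,
    )


def publish(producer, transactions, topic=TOPIC):
    """Send each transaction to the topic, flush, and return the count sent."""
    count = 0
    for transaction in transactions:
        producer.send(topic, value=transaction)
        count += 1
    producer.flush()  # block until buffered messages are delivered
    return count


def consume(messages, handle):
    """Decode each message's bytes value and yield the result of `handle`.

    `messages` is any iterable of objects with a `.value` attribute,
    such as a kafka-python KafkaConsumer created without a deserializer.
    """
    for message in messages:
        yield handle(deserialize(message.value))
```

In this sketch `handle` would be the function that forwards a decoded transaction to the API, which keeps the Kafka plumbing separate from the prediction call and lets either side be tested with stand-in objects.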
The frontend interface allows users to interact with the system via a web browser. The frontend is served using a simple HTTP server and communicates with the API for predictions.
Contributions are welcome! Please submit a pull request or open an issue to discuss any changes.
- Fork the repository.
- Create a new branch for your feature/bugfix.
- Submit a pull request.
Possible future enhancements:
- Unsupervised Learning: Explore clustering techniques for detecting novel fraud patterns.
- Geospatial Analysis: Integrate location data for enhanced fraud detection.
This project is licensed under the MIT License. See the LICENSE file for more details.