Malware Detection Full-Stack Project
A full-stack malware classification project built with machine learning and modern web technologies.
It classifies files as malicious or benign using a machine learning model trained on a custom malware dataset.
It also includes a frontend UI and a backend API so users can upload files and receive predictions in real time.
This repository contains:
- ✔️ Machine Learning model for malware detection
- ✔️ Full dataset (CSV format)
- ✔️ Training script (train.py)
- ✔️ Backend API server
- ✔️ Frontend interface
- ✔️ Confusion matrix and evaluation results
- ✔️ Setup scripts and project structure
Features:
- Binary classification (Malware vs Benign)
- Custom dataset with expanded features
- Machine Learning pipeline (preprocessing → training → prediction)
- Visualization of evaluation metrics
- REST API backend for prediction
- React/Node frontend for file upload + result display
- 100% open-source
Model used:
- XGBoost classifier (XGBClassifier)
- StandardScaler for normalization
- 80/20 train-test split
- Evaluation metrics:
  - Accuracy
  - Precision / Recall / F1-score
  - Confusion matrix (saved as confusion_matrix.png)
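To make the metrics above concrete, here is a minimal sketch of how accuracy, precision, recall and F1 follow from the four cells of a binary confusion matrix. The function name `classification_metrics` is hypothetical, the counts are made up for illustration (they are not results from this project), and treating malware as the positive class is an assumption.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Derive common metrics from binary confusion-matrix counts.

    Assumption: the positive class is "Malware", so
    tp = malware flagged as malware, fp = benign flagged as malware,
    fn = malware missed,             tn = benign passed as benign.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative counts only; not taken from confusion_matrix.png.
metrics = classification_metrics(tp=90, fp=10, fn=5, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For a malware detector, recall is often the number to watch, since a false negative means a malicious file was passed as benign.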
Training script: train.py
Project structure:
Malware_detection/
│
├── backend/                      # Backend API server
├── malware-frontend/             # Frontend application
├── confusion_matrix.png          # Model performance plot
├── malware_dataset_expanded.csv
├── malware_dataset_gmm_5000.csv
│
├── train.py                      # Model training script
├── run.sh                        # Script to run backend + frontend
├── package.json                  # Node project file
├── .gitignore
└── README.md
git clone https://github.com/Harry-Khatri/Malware_detection.git
cd Malware_detection
For Model Training
python3 train.py
For the Backend (from the repository root)
cd backend
npm install
npm start
For the Frontend (from the repository root, in a separate terminal)
cd malware-frontend
npm install
npm run dev
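With the backend running, a file can also be sent for prediction without the frontend. The sketch below uses only the Python standard library. The URL `http://localhost:5000/predict`, the form field name `file`, and the helper names `build_multipart` and `request_prediction` are all hypothetical; check the backend code for the actual route, port and expected field name.

```python
import os
import uuid
import urllib.request
from typing import Tuple


def build_multipart(field_name: str, filename: str, data: bytes) -> Tuple[bytes, str]:
    """Encode one file as a multipart/form-data body.

    Returns the raw body and the matching Content-Type header value.
    """
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        (f'Content-Disposition: form-data; name="{field_name}"; '
         f'filename="{filename}"\r\n').encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        data,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return body, f"multipart/form-data; boundary={boundary}"


def request_prediction(url: str, path: str) -> str:
    """POST a file to the (hypothetical) prediction endpoint and return the raw response."""
    with open(path, "rb") as fh:
        data = fh.read()
    # "file" is an assumed field name, not confirmed by this README.
    body, content_type = build_multipart("file", os.path.basename(path), data)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read().decode("utf-8")


# Example call (hypothetical URL and file name; requires the backend to be running):
# print(request_prediction("http://localhost:5000/predict", "sample.bin"))
```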