A machine learning–powered project for detecting malware files and malicious URLs.
This repository leverages Random Forest and Logistic Regression classifiers to accurately identify malicious patterns in files and web URLs.
📖 Blog Reference: Machine Learning for Malware Detection
Traditional signature-based antivirus software struggles with:
- Polymorphic malware that changes signatures
- New malware with no known signatures
- Automated attacks by low-skilled attackers
Machine learning allows detection based on behavior and features, even for previously unseen malware.
Objectives:
- Detect malware in executable files using PE Header analysis
- Detect malicious URLs using text-based ML methods
- Provide a terminal-based CLI interface for scanning
- Enable future enhancements like GUI and web integration
Datasets:
- PE Header Dataset: Kaggle - Malware PE Dataset
- Malicious URLs Dataset: Provided within the repo
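
To give a sense of what a PE header dataset contains, here is a minimal, dependency-free sketch that reads a handful of COFF file header fields with Python's `struct` module. The project itself uses `pefile`, which exposes the same fields under `pe.FILE_HEADER`. The function name, the chosen fields, and the synthetic header bytes below are assumptions made for illustration, not the repository's actual feature set.

```python
import struct


def read_pe_header_features(data: bytes) -> dict:
    """Read a few COFF file header fields from raw PE bytes (illustrative subset)."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The 20-byte COFF file header follows the 4-byte signature.
    (machine, n_sections, timestamp, _symtab, _n_symbols,
     opt_header_size, characteristics) = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4
    )
    return {
        "Machine": machine,
        "NumberOfSections": n_sections,
        "TimeDateStamp": timestamp,
        "SizeOfOptionalHeader": opt_header_size,
        "Characteristics": characteristics,
    }


def make_synthetic_header() -> bytes:
    """Build a tiny fake header (not a runnable executable) for the demo."""
    dos = bytearray(64)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 64)  # PE signature starts right after
    coff = struct.pack("<HHIIIHH", 0x8664, 3, 0, 0, 0, 240, 0x0022)
    return bytes(dos) + b"PE\0\0" + coff


features = read_pe_header_features(make_synthetic_header())
print(features)
```

Each dictionary like this would become one row of a feature table that a classifier can be trained on.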
Workflow:
- Extract features from PE headers using `pefile`
- Train Random Forest Classifier for file malware detection
- Clean & vectorize URLs using TF-IDF
- Train Logistic Regression for malicious URL detection
- Apply whitelist filtering to avoid false positives
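
The URL half of the workflow above can be sketched roughly as follows, using scikit-learn's `TfidfVectorizer` and `LogisticRegression`. This is a minimal sketch, not the repository's code: the cleaning rules, the character n-gram settings, the `WHITELIST` contents, the helper names, and the toy training URLs are all assumptions made for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical whitelist; the project's real list is not shown in this README.
WHITELIST = {"github.com", "wikipedia.org"}


def clean_url(url: str) -> str:
    """Lowercase and drop the scheme and a leading 'www.' (assumed cleaning steps)."""
    url = url.strip().lower()
    for prefix in ("http://", "https://"):
        if url.startswith(prefix):
            url = url[len(prefix):]
    if url.startswith("www."):
        url = url[4:]
    return url


def is_whitelisted(url: str) -> bool:
    """True if the host is a whitelisted domain or one of its subdomains."""
    host = clean_url(url).split("/", 1)[0].split(":", 1)[0]
    return host in WHITELIST or any(host.endswith("." + d) for d in WHITELIST)


# Toy data invented for this example (1 = malicious, 0 = benign).
urls = [
    "https://docs.example.org/guide/install",
    "https://shop.example.com/cart",
    "http://example.net/about-us",
    "https://news.example.org/2024/article",
    "http://login-verify.example.test/secure/update.php?id=1",
    "http://free-prize.example.test/claim.exe",
    "http://192.0.2.10/bank/login.php",
    "http://account-update.example.test/verify/password",
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# TF-IDF over character n-grams, followed by logistic regression.
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
model.fit([clean_url(u) for u in urls], labels)


def classify(url: str) -> int:
    """Return 0 (benign) or 1 (malicious); whitelisted hosts skip the model."""
    if is_whitelisted(url):
        return 0
    return int(model.predict([clean_url(url)])[0])


print(classify("https://github.com/some/repo"))
print(classify("http://secure-login.example.test/update.php"))
```

Checking the whitelist before calling the model means a trusted domain can never be flagged, which is one simple way to cut false positives. The trade-off is that malicious content hosted on a whitelisted domain would be missed.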
| Model | Accuracy | Precision | Recall |
|---|---|---|---|
| Random Forest (PE Headers) | 99.37% | 99.20% | 98.90% |
| Logistic Regression (URLs) | 98.46% | 99.18% | 96.25% |
Confusion Matrix PE Header Detector
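
For readers less familiar with these metrics, the sketch below shows how accuracy, precision, and recall are derived from the four cells of a binary confusion matrix. The counts are made up for illustration and are not the project's results. The function name is likewise an assumption.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        # Share of all samples classified correctly.
        "accuracy": (tp + tn) / total,
        # Of everything flagged as malicious, how much really was.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Of all truly malicious samples, how many were caught.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up counts, not taken from the table above.
metrics = binary_metrics(tp=90, fp=10, fn=5, tn=895)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

In a malware detector, a missed sample (false negative) is usually costlier than a false alarm, so recall deserves as much attention as accuracy.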

Install dependencies:

    pip install -r requirements.txt

Packages used:
- scikit-learn
- pandas
- numpy
- pefile
- joblib
- pyfiglet (for ASCII art in CLI)

Run locally:

    git clone https://github.com/HARSH74561/Malware-Detection-using-Machine-learning.git
    cd Malware-Detection-using-Machine-learning
    pip install -r requirements.txt
    python main.py

Run with Docker:

    git clone https://github.com/HARSH74561/Malware-Detection-using-Machine-learning.git
    cd Malware-Detection-using-Machine-learning
    docker build -t py-md .
    docker run -ti py-md

CLI features:
- Terminal-based interface
- ASCII art on startup
- Easy input for files and URLs
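
A terminal front end with these features could be structured roughly like the sketch below. It is a hypothetical outline, not the project's `main.py`: the function names, the prompt text, and the URL-versus-file heuristic are assumptions. Only `pyfiglet.figlet_format` is a real library call, and the sketch falls back to plain text if `pyfiglet` is not installed.

```python
import os


def banner(title: str) -> str:
    """Return ASCII art if pyfiglet is available, else a plain underlined title."""
    try:
        import pyfiglet  # optional dependency
        return pyfiglet.figlet_format(title)
    except ImportError:
        return f"{title}\n{'=' * len(title)}\n"


def detect_input_kind(text: str) -> str:
    """Guess whether the user entered a URL or a file path (simple heuristic)."""
    text = text.strip()
    if text.lower().startswith(("http://", "https://", "www.")):
        return "url"
    if os.path.isfile(text):
        return "file"
    return "unknown"


def run_cli(read_line=input, write=print) -> None:
    """Prompt until a blank line is entered; reports the detected input kind."""
    write(banner("Malware Detector"))
    while True:
        entry = read_line("File path or URL (blank to quit): ").strip()
        if not entry:
            break
        # A real tool would call the file or URL classifier here.
        write(f"{entry} -> {detect_input_kind(entry)}")
```

Calling `run_cli()` starts the interactive loop. Passing `read_line` and `write` as parameters keeps the loop easy to test without a real terminal.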

Future enhancements:
- Expand dataset for higher accuracy
- Create a GUI for Windows/Linux
- Enable real-time file scanning
- Add web-based interface for file/URL scanning

## 🤝 Contributing
- Fork the repo & create a branch:

      git checkout -b feature-branch

- Make your changes and commit:

      git add .
      git commit -m "Add new feature"

- Push and open a Pull Request:

      git push origin feature-branch

Developed by Harsh (GitHub: HARSH74561). Focused on cybersecurity, machine learning, and AI-powered tools.





