Rajwan-Sultan/Malware-Detection-System

🔍 PE Malware Detector

Analyze executable files with a powerful, AI-driven toolkit! Upload .exe or .dll files to get detailed reports, interactive chat, and comprehensive malware insights.

Website: https://malware-detection-system-eri5bvbecmkv63vrs6labt.streamlit.app/

🌟 What Makes This Special?

| Feature | Description |
| --- | --- |
| 🧠 AI-Powered Analysis | User-friendly reports powered by OpenAI’s gpt-4o-mini |
| 📄 Executable Support | Analyze .exe and .dll files with ease |
| 💬 Seamless Chat | Ask questions in a ChatGPT-like interface |
| ⚡ Multi-Module Design | Feature exploration, file comparison, and threat summaries |
| 🔒 Secure Storage | Save analyses locally in llm_analyses.json |
| 🌐 Antivirus Integration | Real-time antivirus scan results |
| 🤖 ML Predictions | Classify files as malicious or legitimate |

🚀 Quick Start

1️⃣ Get Your API Keys

  • OpenAI API key (OPENAI_API_KEY) for AI narratives and chat
  • Antivirus API key (VT_API_KEY, used by virustotal_analysis.py) for antivirus scans

2️⃣ Install Dependencies

pip install -r requirements.txt

3️⃣ Launch the App

streamlit run route.py

4️⃣ Start Analyzing!

  • 🔑 Add API keys to .env
  • 📤 Upload .exe or .dll files
  • 💭 Explore AI reports, visualizations, and comparisons
  • 🎉 Ask questions via the interactive chat
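Before launching, it can help to confirm that both keys are actually present in .env. The following is a minimal sketch, assuming .env contains plain KEY=VALUE lines and that the variable names are OPENAI_API_KEY and VT_API_KEY as listed under Troubleshooting. The helpers `parse_env_text` and `missing_keys` are hypothetical and not part of this repository; the app itself may load the file differently.

```python
from pathlib import Path

# Key names taken from the Troubleshooting section of this README.
REQUIRED_KEYS = ("OPENAI_API_KEY", "VT_API_KEY")


def parse_env_text(text):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text(encoding="utf-8") if env_path.exists() else ""
    print("Missing keys:", missing_keys(parse_env_text(text)))
```

Running the script from the project root prints an empty list when both keys are set.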

🎯 Perfect For

| Use Case | Example |
| --- | --- |
| 🔍 Security Researchers | "Is this file malicious based on Antivirus?" |
| 💻 Developers | "Compare the entropy of two executables" |
| 🛡️ Analysts | "Summarize the threat level of this file" |
| 📋 Enthusiasts | "What are the suspicious imports in this DLL?" |

🛠️ How It Works

graph LR
    A[Upload File] --> B[Feature Extraction]
    B --> C[ML Prediction]
    B --> D[Antivirus Scan]
    C --> E[AI Narrative]
    D --> E
    E --> F[Store in JSON]
    E --> G[Visualizations]
    E --> H[File Comparison]
    F --> I[Chat Interface]
    I --> E
| Step | Description |
| --- | --- |
| 📄 File Upload | Upload .exe or .dll via Streamlit |
| 🔪 Feature Extraction | Extracts hashes, entropy, and imports using pefile |
| 🧮 ML Prediction | Classifies files using .joblib models |
| 🌐 Antivirus Scan | Fetches antivirus results |
| 🤖 AI Narrative | Generates reports with gpt-4o-mini |
| 💾 Storage | Saves analyses in llm_analyses.json |
| 📊 Visualizations | Displays feature plots |
| 🔍 File Comparison | Compares multiple files |
| 💬 Chat | Interactive Q&A with context |
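To make the feature-extraction step more concrete, here is a minimal sketch of two of the features named above: file hashes and entropy. It uses only the Python standard library and works on the whole file's bytes. The project itself uses pefile, which would additionally give per-section values and the import table; the function names `file_hashes` and `shannon_entropy` are illustrative assumptions, not the repository's API.

```python
import hashlib
import math
from collections import Counter


def file_hashes(data: bytes) -> dict:
    """Return hex digests that identify the file contents."""
    return {
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, from 0.0 (constant) to 8.0 (uniform)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    sample = bytes(range(256)) * 4  # stand-in for uploaded file bytes
    print(file_hashes(sample)["sha256"])
    print(round(shannon_entropy(sample), 3))
```

High entropy (close to 8 bits per byte) often indicates compressed or packed content, which is one reason entropy is a common input to malware classifiers.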

🎛️ Features & Configuration

🔧 Customizable Settings
  • AI Model: gpt-4o-mini for fast, accurate narratives
  • ML Models: stored in models/ as .joblib files
  • Storage: llm_analyses.json for persistent analysis
  • APIs: OpenAI and Antivirus for enhanced insights
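As an illustration of the persistent storage setting, the sketch below reads and updates a JSON file such as llm_analyses.json. It assumes, purely for the example, that analyses are stored as one JSON object keyed by file hash; the real schema used by the app is not documented here, and `load_analyses` and `save_analysis` are hypothetical helpers.

```python
import json
from pathlib import Path


def load_analyses(path):
    """Return stored analyses as a dict; empty if the file is missing or invalid."""
    p = Path(path)
    if not p.exists():
        return {}
    try:
        data = json.loads(p.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return {}  # treat a corrupt file as empty instead of crashing
    return data if isinstance(data, dict) else {}


def save_analysis(path, file_hash, analysis):
    """Add or replace one analysis and write the file back."""
    analyses = load_analyses(path)
    analyses[file_hash] = analysis
    # Write to a temporary file first, then rename, so a crash mid-write
    # does not leave a half-written JSON file behind.
    tmp = Path(str(path) + ".tmp")
    tmp.write_text(json.dumps(analyses, indent=2), encoding="utf-8")
    tmp.replace(path)
    return analyses
```

Treating an unreadable file as empty mirrors the "delete and recreate" advice in Troubleshooting, without requiring a manual step.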

🎨 User Experience

| Feature | Description |
| --- | --- |
| Modular Interface | Navigate between home, analysis, and comparison pages |
| Seamless Chat | Real-time Q&A with context retention |
| Styled UI | Clean, card-based design with emojis |
| Error Handling | User-friendly error messages |

📁 Project Structure

| File | Description |
| --- | --- |
| __pycache__/ | Python cache files (ignored) |
| dataset/ | Training/testing datasets |
| legitimate/ | Legitimate files for analysis |
| models/ | ML models (.joblib) |
| plots/ | Generated visualizations |
| test_models/ | Testing ML models |
| zip_file/ | Zipped data files |
| app.py | Alternative app entry point |
| data.csv | Feature data or results |
| feature_explorer.py | Feature visualization module |
| file_comparison.py | File comparison module |
| home.py | Landing page |
| llm.py | AI analysis and chat |
| malware_detection.py | ML detection logic |
| malware_prediction.ipynb | Development notebook |
| modification.py | File modification logic |
| requirements.txt | Dependencies |
| route.py | Main app with routing |
| temp.tex | Temporary file (ignored) |
| test.ipynb | Testing notebook |
| test.py | Testing script |
| threat_summary.py | Threat summary module |
| virustotal_analysis.py | Antivirus integration |
| .env | API keys (not tracked) |
| .gitignore | Git ignore file |

🐛 Troubleshooting

🔑 API Key Issues
  • Verify OPENAI_API_KEY and VT_API_KEY in .env
  • Check the status of your OpenAI or antivirus account
  • Regenerate keys if needed

📄 File Upload Problems
  • Ensure the file is a valid .exe or .dll
  • Check file size (under 10MB recommended)
  • Verify pefile compatibility

💾 Storage Issues
  • Check write permissions: chmod -R u+rw .
  • Ensure disk space is available
  • Delete and recreate llm_analyses.json

🐌 Slow Performance
  • Clear browser cache
  • Restart Streamlit: streamlit run route.py
  • Test with smaller files
  • Check your internet connection for API calls

🤖 Model Issues
  • Verify .joblib files in models/
  • Update the model path in route.py
  • Test with test.py or malware_prediction.ipynb
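For the file upload checks, a quick pre-check can rule out files that are too large or are not PE files at all. This is a minimal sketch using only the standard library: it verifies the DOS "MZ" magic and the PE signature located via the 4-byte offset stored at 0x3C, which is how the PE format is laid out. The function names and the exact 10MB threshold are assumptions for illustration; a file that passes may still fail full parsing in pefile.

```python
import struct

MAX_SIZE_BYTES = 10 * 1024 * 1024  # the README recommends staying under 10MB


def looks_like_pe(data: bytes) -> bool:
    """Check the DOS 'MZ' magic and the PE signature it points to."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # The little-endian 32-bit field at 0x3C gives the PE header offset.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


def check_upload(data: bytes) -> list:
    """Return a list of human-readable problems; empty means the checks passed."""
    problems = []
    if len(data) > MAX_SIZE_BYTES:
        problems.append("file is larger than the recommended 10MB")
    if not looks_like_pe(data):
        problems.append("file does not have a valid PE header")
    return problems
```

Calling `check_upload(uploaded_bytes)` before analysis lets the app show a specific message instead of a generic parsing error.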

🚀 Pro Tips

💡 Better Questions, Better Answers
  • Be specific
  • Use context
  • Follow up

🎯 Optimize Performance
  • Test with small files first
  • Keep models/ and dataset/ organized
  • Clear llm_analyses.json for large datasets
  • Use test.ipynb for debugging

🔒 Privacy & Security

✅ Safe
  • Files processed locally
  • Only chat queries sent to OpenAI
  • Analyses stored in llm_analyses.json
  • Temporary files (temp_llm.exe, temp.tex) auto-deleted

⚠️ Be Aware
  • Keep .env out of Git
  • Use non-sensitive files for testing
  • Clear llm_analyses.json for sensitive data
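The automatic deletion of temporary files can be sketched as follows. This is not the project's actual code: it is a minimal pattern, assuming an uploaded file's bytes must be written to disk for a path-based analysis function, and `analyze_via_temp_file` and its `analyze` callback are hypothetical names.

```python
import os
import tempfile


def analyze_via_temp_file(data: bytes, analyze, suffix=".exe"):
    """Write data to a temp file, run analyze(path), and always delete the file."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        return analyze(path)
    finally:
        # Runs even if analyze() raises, so no sample is left on disk.
        if os.path.exists(path):
            os.remove(path)
```

For example, `analyze_via_temp_file(uploaded_bytes, os.path.getsize)` returns the file size and leaves no file behind.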

🆙 Future Enhancements

🎯 Coming Soon
  • Support for additional file types
  • Enhanced visualization options
  • Batch file processing
  • Advanced comparison metrics
  • Custom ML model training

🤝 Contributing

Love this project? Here's how to help:

  • 🌟 Star this repository
  • 🐛 Report bugs or issues
  • 💡 Suggest new features
  • 🔧 Submit pull requests
  • 📢 Share with the security community

📞 Support

Need help? Try these:

  • 📚 Documentation: See troubleshooting above
  • 🐛 Bug Reports: Open a GitHub issue
  • 💬 Questions: Start a discussion
  • 📧 Direct Contact: sultanrajwan@gmail.com

📜 License

⭐ Star this repo • 🐛 Report Bug • 💡 Request Feature
