Siddhi259/ThreatSense

🛡 ThreatSense

Emotion-Aware Social Engineering Attack Detection System

Detecting cyber threats through psychological manipulation patterns in digital messages


Final Year Project | Computer Science & Engineering | 2025–2026


📌 What is ThreatSense?

ThreatSense detects social engineering cyber attacks (phishing emails, SMS scams, fraudulent messages) by analyzing psychological manipulation patterns in text. Unlike traditional tools that rely on URL blacklists or malware signatures, ThreatSense looks for the human psychology being exploited: fear, urgency, authority, greed, and trust.

Signature-free by design: because detection keys on emotional manipulation cues rather than known URLs or malware signatures, a previously unseen (zero-day) campaign can still be flagged when it uses triggers from the keyword dictionary.


🎯 Key Features

| Feature | Description |
| --- | --- |
| 🧠 5-Category Emotion Engine | Detects Fear, Urgency, Authority, Greed, Trust |
| 📐 Weighted Risk Formula | (fear×3) + (urgency×2) + (authority×2) + (greed×1) + (trust×1) |
| 🤖 ML Ensemble | Random Forest + Logistic Regression (93.3% test-set accuracy) |
| 🔤 Pure Python NLP | Stopword removal + stemming, no external NLP libraries |
| 🌐 Web Dashboard | 4-tab Streamlit interface: Scan, Batch, History, Stats |
| 💻 CLI Tool | 8 operating modes |
| 📂 Batch Analyzer | Process CSV files of messages |
| 📜 HTML Reports | Self-contained downloadable analysis reports |
| 🗂 Analysis Logger | Persistent JSON history of all scans |
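The emotion engine and the pure-Python NLP steps listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: the keyword lists, the stopword set, and the suffix rules are hypothetical stand-ins for whatever data/keywords.json and src/nlp_enhancer.py actually contain.

```python
import re

# Hypothetical keyword dictionary; the real one lives in data/keywords.json.
KEYWORDS = {
    "fear": {"block", "legal", "suspend"},
    "urgency": {"immediate", "now", "urgent"},
    "authority": {"bank", "police", "government"},
    "greed": {"prize", "win", "reward"},
    "trust": {"dear", "friend", "verify"},
}
STOPWORDS = {"your", "is", "or", "the", "a", "an", "to", "of"}  # assumed list
SUFFIXES = ("ly", "ed", "ing", "s")  # assumed suffix-stripping rules


def preprocess(text):
    """Lowercase, replace non-letters with spaces, and split into tokens."""
    return re.sub(r"[^a-z\s]", " ", text.lower()).split()


def stem(token):
    """Strip the first matching suffix, keeping at least 3 characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def detect_emotions(text):
    """Return {category: sorted list of matched stems} for one message."""
    stems = {stem(t) for t in preprocess(text) if t not in STOPWORDS}
    return {cat: sorted(stems & words) for cat, words in KEYWORDS.items()}


matches = detect_emotions(
    "Your bank account is blocked. Act immediately or face legal action."
)
print(matches["fear"])       # ['block', 'legal']
print(matches["urgency"])    # ['immediate']
print(matches["authority"])  # ['bank']
```

Matching on stems lets one dictionary entry such as "block" cover "blocked" and "blocking", which keeps the keyword file small without pulling in NLTK or spaCy.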

📁 Project Structure

threatsense/
├── main.py                    ← CLI entry point (8 modes)
├── app.py                     ← Streamlit web UI (4 tabs)
├── requirements.txt
├── .gitignore
│
├── src/
│   ├── preprocessor.py        ← Lowercase, remove symbols, tokenize
│   ├── nlp_enhancer.py        ← Stopword removal + suffix stemming
│   ├── emotion_detector.py    ← 5-category keyword emotion engine
│   ├── risk_scorer.py         ← Weighted risk score formula
│   ├── classifier.py          ← HIGH / MEDIUM / LOW + recommendation
│   ├── feature_extractor.py   ← 13-feature vector for ML
│   ├── ml_model.py            ← RF + LR ensemble (93.3% accuracy)
│   ├── batch_analyzer.py      ← CSV batch processing
│   ├── logger.py              ← Persistent JSON analysis log
│   └── report_generator.py    ← HTML report export
│
├── data/
│   ├── keywords.json          ← Emotion keyword dictionary + weights
│   ├── training_data.json     ← 60 labeled training samples
│   └── test_cases.csv         ← 10 labeled test messages
│
├── tests/
│   └── test_detector.py       ← 15 unit tests
│
└── docs/
    ├── patent_abstract.md     ← Patent draft with 5 claims
    └── viva_qa.md             ← 25 viva Q&As across 8 sections

🚀 Quick Start

# 1. Clone the repo
git clone https://github.com/YOUR-USERNAME/threatsense.git
cd threatsense

# 2. Install dependencies
pip install -r requirements.txt

# 3. Train the ML model (first time only)
python main.py --train

# 4. Run demo
python main.py --demo

# 5. Launch web UI
streamlit run app.py

🧠 How ThreatSense Works

User Message
     ↓
Preprocessor         → lowercase, remove symbols, tokenize
     ↓
NLP Enhancer         → stopword removal + suffix stemming
     ↓
Emotion Detection    → fear / urgency / authority / greed / trust
     ↓
Risk Scoring         → (fear×3) + (urgency×2) + (authority×2) + (greed×1) + (trust×1)
     ↓
Classification       → ≥7 HIGH  |  4–6 MEDIUM  |  ≤3 LOW
     ↓
ML Second Opinion    → RF + LR ensemble (93.3% accuracy)
     ↓
Alert + Explanation + Recommendation
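The risk scoring and classification steps of the pipeline above can be sketched as below. The weights and thresholds are taken from the diagram; the function names are hypothetical, and the README does not say whether each category term is a match count or a 0/1 flag, so this sketch assumes match counts.

```python
# Category weights from the risk formula in the pipeline diagram.
WEIGHTS = {"fear": 3, "urgency": 2, "authority": 2, "greed": 1, "trust": 1}


def risk_score(counts):
    """Weighted sum of per-category values (assumed here to be match counts)."""
    return sum(weight * counts.get(cat, 0) for cat, weight in WEIGHTS.items())


def classify(score):
    """Map a score to a risk level: >=7 HIGH, 4-6 MEDIUM, <=3 LOW."""
    if score >= 7:
        return "HIGH"
    if score >= 4:
        return "MEDIUM"
    return "LOW"


score = risk_score({"fear": 1, "urgency": 1})
print(score, classify(score))  # 5 MEDIUM
```

Because fear carries the largest weight, a single fear keyword plus one urgency keyword already reaches MEDIUM, while one greed or trust keyword on its own stays LOW.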

💻 CLI Commands

python main.py                          # Interactive mode
python main.py --demo                   # 5 built-in sample messages
python main.py --train                  # Train ML model
python main.py --evaluate               # ML accuracy report
python main.py --batch data/test_cases.csv   # Batch CSV analysis
python main.py --stats                  # Aggregate stats from log
python main.py --history 20             # Last 20 scans
python main.py --report "Your message"  # Analyze + save HTML report
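One way the eight modes above could be wired up with the standard library's argparse is sketched here. Only the flag names come from the command list; the helper names, the help strings, and the choice to make the modes mutually exclusive are assumptions, and main.py may be organized differently.

```python
import argparse


def build_parser():
    """Build a parser exposing the seven flags; no flag means interactive mode."""
    parser = argparse.ArgumentParser(prog="main.py", description="ThreatSense CLI (sketch)")
    mode = parser.add_mutually_exclusive_group()  # assumed: one mode per run
    mode.add_argument("--demo", action="store_true", help="run built-in sample messages")
    mode.add_argument("--train", action="store_true", help="train the ML model")
    mode.add_argument("--evaluate", action="store_true", help="print ML accuracy report")
    mode.add_argument("--batch", metavar="CSV_PATH", help="analyze a CSV of messages")
    mode.add_argument("--stats", action="store_true", help="aggregate stats from the log")
    mode.add_argument("--history", type=int, metavar="N", help="show the last N scans")
    mode.add_argument("--report", metavar="MESSAGE", help="analyze and save an HTML report")
    return parser


def selected_mode(args):
    """Return the name of the chosen mode, defaulting to 'interactive'."""
    for name in ("demo", "train", "evaluate", "stats"):
        if getattr(args, name):
            return name
    for name in ("batch", "history", "report"):
        if getattr(args, name) is not None:
            return name
    return "interactive"


args = build_parser().parse_args(["--history", "20"])
print(selected_mode(args), args.history)  # history 20
```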

🌐 Web UI Tabs

| Tab | Feature |
| --- | --- |
| 🔍 Scan | Single message analyzer with emotion badges and score chart |
| 📂 Batch | Upload CSV → analyze all messages with risk summary |
| 📜 History | All past scans with timestamps and risk levels |
| 📊 Stats | Charts: HIGH/MEDIUM/LOW breakdown, top detected emotions |
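The History and Stats tabs both read from the persistent JSON log. A minimal sketch of such a log follows, assuming a single JSON array file and hypothetical field names (timestamp, message, risk_level, emotions); the actual schema used by src/logger.py is not documented here.

```python
import json
from collections import Counter
from datetime import datetime, timezone
from pathlib import Path


def log_scan(log_path, message, risk_level, emotions):
    """Append one scan record to a JSON array file and return all records."""
    path = Path(log_path)
    entries = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    entries.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "message": message,
        "risk_level": risk_level,
        "emotions": list(emotions),
    })
    path.write_text(json.dumps(entries, indent=2), encoding="utf-8")
    return entries


def history(entries, n):
    """Return the most recent n records, newest first."""
    return entries[-n:][::-1] if n > 0 else []


def stats(entries):
    """Counts behind a Stats view: risk-level breakdown and top emotions."""
    levels = Counter(e["risk_level"] for e in entries)
    emotions = Counter(em for e in entries for em in e["emotions"])
    return {"levels": dict(levels), "top_emotions": emotions.most_common(3)}
```

Rewriting the whole file on each scan is simple and fine for a small log; an append-only format such as one JSON object per line would scale better if the history grows large.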

📊 Results

| Metric | Value |
| --- | --- |
| Unit tests | ✅ 15 / 15 passing |
| ML test-set accuracy | 93.3% |
| ML cross-validation | 88.3% ± 6.7% |
| HIGH risk precision | 1.00 (perfect) |
| HIGH risk recall | 1.00 (perfect) |
| Batch test accuracy | 90% (9/10) |
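For readers checking how figures like these are derived, the sketch below computes accuracy and per-class precision and recall from expected and predicted labels. The label lists are made up for illustration and are not the contents of data/test_cases.csv; the function names are hypothetical.

```python
def accuracy(expected, predicted):
    """Fraction of messages whose predicted level equals the expected level."""
    correct = sum(e == p for e, p in zip(expected, predicted))
    return correct / len(expected)


def precision_recall(expected, predicted, label):
    """Precision and recall for one risk level, e.g. 'HIGH'."""
    pairs = list(zip(expected, predicted))
    tp = sum(e == label and p == label for e, p in pairs)
    fp = sum(e != label and p == label for e, p in pairs)
    fn = sum(e == label and p != label for e, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Illustrative labels only: 10 messages, one MEDIUM misread as LOW.
expected = ["HIGH"] * 4 + ["MEDIUM"] * 3 + ["LOW"] * 3
predicted = ["HIGH"] * 4 + ["MEDIUM", "MEDIUM", "LOW"] + ["LOW"] * 3

print(accuracy(expected, predicted))                  # 0.9
print(precision_recall(expected, predicted, "HIGH"))  # (1.0, 1.0)
```

On a 10-message set, each misclassification moves accuracy by 10 percentage points, so small test sets give coarse estimates.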

🧪 Sample Output

🛡 ThreatSense

Message: Your bank account is blocked. Act immediately or face legal action.

🔴 RISK LEVEL : HIGH
📊 RISK SCORE : 13

🧠 Emotion Analysis:
   ✔ Fear       | matched: ['blocked', 'legal']
   ✔ Urgency    | matched: ['immediately']
   ✔ Authority  | matched: ['bank']
   ✗ Greed      | not detected

💡 Detected psychological manipulation via: Fear, Urgency, Authority.
🛑  DO NOT click any links or share personal information.
🤖 ML Second Opinion : HIGH  (86% confidence)  [✔ agrees]
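The explanation lines at the end of the sample output could be assembled from the detector's matches along these lines. This is a sketch with hypothetical names: only the HIGH recommendation text is taken from the sample above, and the MEDIUM and LOW texts are placeholders invented for illustration.

```python
# Only the HIGH text appears in the sample output; MEDIUM and LOW are placeholders.
RECOMMENDATIONS = {
    "HIGH": "DO NOT click any links or share personal information.",
    "MEDIUM": "Verify the sender through a separate channel before responding.",
    "LOW": "No strong manipulation cues found; stay cautious.",
}


def explain(matches, risk_level, ml_level=None):
    """Build human-readable explanation lines from per-category matches."""
    detected = [cat.capitalize() for cat, words in matches.items() if words]
    if detected:
        summary = "Detected psychological manipulation via: " + ", ".join(detected) + "."
    else:
        summary = "No psychological manipulation cues detected."
    lines = [summary, RECOMMENDATIONS[risk_level]]
    if ml_level is not None:
        verdict = "agrees" if ml_level == risk_level else "disagrees"
        lines.append(f"ML Second Opinion: {ml_level} [{verdict}]")
    return lines


matches = {
    "fear": ["blocked", "legal"],
    "urgency": ["immediately"],
    "authority": ["bank"],
    "greed": [],
}
for line in explain(matches, "HIGH", ml_level="HIGH"):
    print(line)
```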

📜 Patent

Title: ThreatSense: Emotion-Based Social Engineering Attack Detection System and Method

Domain: Cybersecurity + Behavioral Analysis + NLP

5 Key Claims:

  1. Emotion-category keyword matching with asymmetric weighted scoring
  2. Risk formula: (fear×3) + (urgency×2) + (authority×2) + (greed×1) + (trust×1)
  3. Zero-signature detection: no URL blacklists, no file scanning needed
  4. Explainable natural-language risk output
  5. Cross-platform (email, SMS, chat, voice-to-text)

See docs/patent_abstract.md


🎓 Viva Prep

docs/viva_qa.md: 25 Q&As across 8 sections covering architecture, NLP, ML, testing, future scope, and patent claims.


🛠 Tech Stack

| Layer | Technology |
| --- | --- |
| Language | Python 3.x |
| ML | scikit-learn (Random Forest, Logistic Regression) |
| NLP | Pure Python, no NLTK or spaCy needed |
| UI | Streamlit |
| Storage | JSON / CSV / Pickle |
| Testing | pytest |
| Reports | Self-contained HTML |

📄 License

MIT License (see LICENSE)


🛡 ThreatSense · Final Year Project · CSE 2025–2026
Built by Siddhi Dhus

About

🛡 ThreatSense: Detects social engineering cyber attacks by analyzing emotional manipulation patterns (fear, urgency, authority, greed) in messages. Rule-based scoring + ML ensemble (93.3% accuracy). Built with Python, scikit-learn & Streamlit. Third Year MiniProject, CSE 2025–26.
