An efficient, explainable detection system for Living Off The Land (LOTL) attacks that achieves high performance while being dramatically faster and cheaper than LLM-based solutions.
This project implements a multi-model ensemble approach to detect LOTL attacks in Windows Sysmon events. The system combines:
- Random Forest with comprehensive feature engineering (the only model enabled by default)
- Small Neural Network for deep pattern recognition
- LLM Reasoning Distillation for explainable predictions
The detector achieves ≥98% precision and ≥99% recall while running ~50x faster and being 1000x+ cheaper than Claude-Sonnet-4.5.
- 🎯 High Performance: Achieves 98%+ precision and 99%+ recall
- ⚡ Fast Inference: ~50x faster than LLM baseline
- 💰 Cost Effective: 1000x+ cheaper than Claude-Sonnet-4.5
- 🔍 Explainable: Provides human-readable explanations for each prediction
- 🧩 Modular Design: Clean separation of components
- Python 3.8+
- uv package manager
```bash
# Clone the repository
git clone https://github.com/YanivZimmer/lolt_detection
cd lolt_detection

# Create virtual environment and install dependencies
make setup
source .venv/bin/activate
```

Or manually:

```bash
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
uv pip install -e .
```

To train the models:

```bash
make train
```

This will:
- Load the dataset from `data.jsonl`
- Filter events where Claude and ground truth labels agree
- Extract comprehensive features (including Claude reasoning insights)
- Train models using 5-fold cross-validation for robust evaluation
- Train Random Forest and Neural Network models
- Optionally train the Disagreement Detector (V2 model)
- Save models to the `models/` directory
Options:
- `--use-augmentation`: Enable data augmentation for training
- `--use-disagreement-detector`: Train V2 model to detect label disagreements
- `--train-final-model`: Train final model on all data after k-fold evaluation
- `--n-splits`: Number of folds (default: 5)
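The reproducible k-fold evaluation described above can be sketched with scikit-learn. This is illustrative only (the function name `kfold_evaluate` and the variable names are assumptions, not the project's actual code), but it shows how a fixed seed makes the folds identical across runs:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import StratifiedKFold


def kfold_evaluate(X, y, n_splits=5, random_seed=42):
    """Evaluate a Random Forest with stratified, seed-reproducible folds."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=random_seed)
    scores = []
    for train_idx, test_idx in skf.split(X, y):
        clf = RandomForestClassifier(
            n_estimators=100, class_weight="balanced", random_state=random_seed
        )
        clf.fit(X[train_idx], y[train_idx])
        preds = clf.predict(X[test_idx])
        scores.append(
            (precision_score(y[test_idx], preds), recall_score(y[test_idx], preds))
        )
    return scores
```

Because `random_state` is fixed for both the splitter and the model, rerunning this function (or re-deriving the same splits in a Colab notebook) yields the same folds every time.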
Note: For Neural Network training with GPU, use the Colab notebook:
- Upload `train_neural_network.ipynb` to Google Colab
- Upload `data.jsonl` and the required Python files
- The notebook uses the same k-fold splits as the main training (reproducible)
- Run the notebook to train and download the model
- Place the downloaded model in the `models/` directory
Note: For LLM Distillation training:
- Upload `train_llm_distillation.ipynb` to Google Colab; this trains a lightweight LLM to distill Claude's reasoning
- Download the trained model and place it in the `models/llm_distillation/` directory
```bash
make serve
```

This starts a Chainlit web interface where you can:
- Paste Sysmon events as JSON
- Get predictions with explanations
- Test the detector interactively
```bash
make evaluate
```

This runs evaluation on the test set and saves results to `models/evaluation_results.json`.
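Internally, an evaluation step of this shape computes the target metrics and writes them to JSON. The sketch below is an assumption about the output schema (the real script may record additional fields):

```python
import json

from sklearn.metrics import f1_score, precision_score, recall_score


def evaluate_and_save(y_true, y_pred, out_path="models/evaluation_results.json"):
    """Compute precision/recall/F1 on the test set and persist them as JSON."""
    results = {
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred),
    }
    with open(out_path, "w") as f:
        json.dump(results, f, indent=2)
    return results
```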
```
lolt_detector/
├── data_loader.py               # Dataset loading and preprocessing
├── feature_extractor.py         # Comprehensive feature extraction
├── survivalism_features.py      # Survivalism/LOTL behavioral features
├── obfuscation_detector.py      # Obfuscation detection features
├── llm_distiller.py             # LLM reasoning distillation
├── models.py                    # Random Forest and Neural Network models
├── ensemble.py                  # Ensemble model and explainer
├── train.py                     # Main training script
├── app.py                       # Chainlit interactive demo
├── train_neural_network.ipynb   # Colab notebook for NN training
├── Makefile                     # Build automation
├── pyproject.toml               # Dependencies and project config
└── README.md                    # This file
```
```bash
python lotl_detector/train.py --dataset data.jsonl --output-dir models
```

Options:
- `--dataset`: Path to dataset file (default: `data.jsonl`)
- `--output-dir`: Directory to save models (default: `models`)
- `--n-splits`: Number of folds for cross-validation (default: 5)
- `--random-seed`: Random seed for reproducibility (default: 42)
- `--use-rf`: Use Random Forest (default: True)
- `--use-nn`: Use Neural Network (default: True)
- `--use-augmentation`: Enable data augmentation (default: False)
- `--use-disagreement-detector`: Train disagreement detector for analysis (default: False)
- `--train-final-model`: Train final model on all data (default: False)
```python
from ensemble import LOTLEnsemble
from data_loader import sanitize_event_for_inference

# Load model
ensemble = LOTLEnsemble()
ensemble.load('models')

# Prepare event (remove metadata)
event = sanitize_event_for_inference(your_event_dict)

# Get prediction with explanation
results = ensemble.predict_with_explanation([event])
prediction = results[0]['prediction']
explanation = results[0]['explanation']
confidence = results[0]['confidence']
```

The system extracts comprehensive features from Sysmon events:
- Survivalism Features: Native binary abuse, APT patterns, behavioral anomalies
- Obfuscation Features: Encoding patterns, entropy, command complexity
- Sysmon Features: Event IDs, integrity levels, user context, network activity
- Command-Line Features: Path analysis, pattern detection, operation types
- Text Embeddings: Lightweight sentence transformer embeddings (all-MiniLM-L6-v2)
- Claude Reasoning Insights: Features derived from Claude's reasoning patterns:
- Explorer launched by userinit.exe (lateral movement indicator)
- System accounts using admin tools
- Suspicious path operations
- System file modifications
- Compression/archiving operations (data staging)
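To illustrate the obfuscation features above, a minimal command-line entropy and encoding check could look like the following. This is a sketch, not the project's `obfuscation_detector.py`, which presumably computes many more signals; the function names and thresholds are assumptions:

```python
import math
import re
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Shannon entropy in bits per character; high values suggest encoded payloads."""
    if not s:
        return 0.0
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def obfuscation_features(command_line: str) -> dict:
    """Extract simple obfuscation indicators from a Sysmon CommandLine field."""
    return {
        "entropy": shannon_entropy(command_line),
        "length": len(command_line),
        # PowerShell's -EncodedCommand (and its -e/-enc shorthands) is a
        # classic LOTL obfuscation technique
        "has_encoded_flag": bool(
            re.search(r"-e(nc(odedcommand)?)?\b", command_line, re.I)
        ),
        # Long base64-looking tokens often carry encoded payloads
        "has_base64_blob": bool(re.search(r"[A-Za-z0-9+/=]{40,}", command_line)),
        # cmd.exe caret insertion (e.g. p^o^w^e^r^s^h^e^l^l) evades string matching
        "caret_count": command_line.count("^"),
    }
```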
- Random Forest: 100 trees with balanced class weights (the only model enabled by default)
- Neural Network: 2-layer MLP (32→16→2) with dropout
- Ensemble: Weighted voting based on prediction confidence
- Disagreement Detector (V2): Optional model to detect cases where Claude and ground truth disagree
- K-Fold Cross-Validation: All models are evaluated using 5-fold cross-validation for robust performance estimation
- Reproducible Splits: Same random seed ensures consistent folds across training runs
- Same Folds: Neural network notebook uses identical k-fold splits as main training
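Confidence-weighted voting of the kind described above could work along these lines. This is a hypothetical sketch (the actual weighting scheme in `ensemble.py` may differ): each model's class probabilities are scaled by how far its top probability sits from the 0.5 decision boundary.

```python
import numpy as np


def weighted_vote(model_probs):
    """Combine per-model class-probability arrays (each of shape
    [n_events, n_classes]), weighting each model's vote by its confidence:
    0 when the model is at chance (max prob 0.5), 1 when it is certain."""
    combined = np.zeros_like(model_probs[0])
    for probs in model_probs:
        confidence = np.abs(probs.max(axis=1) - 0.5) * 2
        combined += probs * confidence[:, None]
    return combined.argmax(axis=1)
```

The effect is that a barely-decided model (e.g. 0.55/0.45) contributes little, while a confident model dominates the final vote.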
Target metrics (on test set):
- Precision: ≥98%
- Recall: ≥99%
- Latency: <10ms per event (CPU)
- Cost: 1000x+ cheaper than Claude-Sonnet-4.5
Each line in `data.jsonl` is a JSON object:

```json
{
  "EventID": 1,
  "CommandLine": "cmd.exe /c dir C:\\Users",
  "Image": "C:\\Windows\\System32\\cmd.exe",
  "User": "CORP\\jsmith",
  "IntegrityLevel": "Medium",
  "_label": "benign",
  "claude-sonnet-4-5": {
    "predicted_label": "benign",
    "reason": "...",
    "confidence": "high"
  }
}
```

Important: The model only uses production-available fields to avoid label leakage (it excludes `_label`, `claude-sonnet-4-5`, `prompt`, etc.).
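A sanitization step like `sanitize_event_for_inference` presumably just drops those evaluation-only keys before the event reaches the feature extractor. A minimal sketch (the exact excluded-field list is an assumption based on the fields named above):

```python
# Fields present only in the labeled dataset, never in production telemetry.
# The exact set is an assumption inferred from the README's data format.
LEAKAGE_FIELDS = {"_label", "claude-sonnet-4-5", "prompt"}


def sanitize_event(event: dict) -> dict:
    """Return a copy of the event with evaluation-only fields removed."""
    return {k: v for k, v in event.items() if k not in LEAKAGE_FIELDS}
```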
- Dataset Size: Currently trained on ~250 events (small but sufficient)
- Windows Focus: Designed for Windows Sysmon events
- Feature Engineering: Relies on hand-crafted features (though comprehensive)
- Explainability: Explanations are heuristic-based, not from actual LLM distillation
- Scale Dataset: Collect more diverse LOTL attack examples
- Deep LLM Distillation: Actually train a small LLM on Claude's reasoning
- Temporal Features: Add sequence-based features for multi-event analysis
- Active Learning: Implement hard-negative mining
- Real-time Monitoring: Integrate with SIEM systems
This is a research project. For questions or improvements, please open an issue.
MIT License
- Based on research from "LOLWTC: A Deep Learning Approach for Detecting Living Off the Land Attacks"
- Survivalism features inspired by Barr-Smith et al. 2021
- Uses sentence-transformers for lightweight text embeddings