This lab demonstrates critical security vulnerabilities in AI/ML model files based on research presented at DEF CON 33. Participants will learn about attack vectors, detection methods, and secure alternatives for model serialization.
The lab draws on the following DEF CON 33 presentations:
- Cyrus Parzian: demonstrates how AI file formats can be abused for arbitrary code execution, focusing on pickle deserialization vulnerabilities in PyTorch and other ML frameworks.
- Ji’an Zhou & Lishuo Song, Hidden Perils of TorchScript Engine: unveils security risks in PyTorch’s JIT compilation engine, showing how scripted models can contain embedded code.
- Utku Sen: explores novel attack vectors for LLM manipulation through seemingly benign interfaces.
Contains 5 comprehensive security experiments:
- 001-pickle-security-analysis: Pickle deserialization vulnerabilities
- 002-torch-jit-exploitation: TorchScript security risks
- 003-onnx-injection: ONNX format security assessment
- 004-model-poisoning: Statistical detection of backdoors
- 005-promptmap-llm-security: Integrated LLM attack framework
Core analysis tools:
- extract_pdf_text.py: Process DEF CON presentations
- analyze_ai_attacks.py: Extract attack patterns from texts
- test_model_security.py: Demonstrate vulnerabilities
- scan_model.py: Security scanner for model files
Additional directories hold the results and findings from the security analysis, along with the test models used by the experiments.
- Pickle files enable arbitrary code execution via the `__reduce__` method (demonstrated in the sketch below)
- 70%+ of ML models use the unsafe pickle format
- Standard AV/EDR tools miss these threats
- Supply chain attacks via model repositories are practical
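To make the `__reduce__` finding concrete, here is a deliberately harmless sketch (the class name, file name, and echo payload are illustrative; this is not one of the lab's scripts):

```python
import os
import pickle

class MaliciousPayload:
    """Any object whose __reduce__ returns a callable gets that callable executed on unpickling."""
    def __reduce__(self):
        # The (callable, args) pair below runs during pickle.load(),
        # before the caller ever sees the deserialized object.
        return (os.system, ("echo code ran during unpickling",))

# Attacker side: produce a "model" file that fires on load
with open("innocent_model.pkl", "wb") as f:
    pickle.dump(MaliciousPayload(), f)

# Victim side: a plain load is enough to execute the payload
with open("innocent_model.pkl", "rb") as f:
    pickle.load(f)
```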
| Format | Risk Level | Code Execution | Recommendation |
|---|---|---|---|
| `.pkl` | CRITICAL | Yes | Never use |
| `.pt`/`.pth` | HIGH | Yes (pickle) | Use `weights_only=True` |
| ONNX | LOW | Possible | Validate operators |
| SafeTensors | NONE | No | Recommended |
| GGUF/GGML | NONE | No | Recommended |
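A minimal illustration of the `.pt`/`.pth` recommendation above, assuming a PyTorch version where `torch.load` accepts `weights_only` (1.13+); `MyModel` and the checkpoint name are placeholders:

```python
import torch

# weights_only=True restricts unpickling to tensors and other safe types,
# so callables embedded in the checkpoint are rejected instead of executed.
state_dict = torch.load("checkpoint.pth", map_location="cpu", weights_only=True)

model = MyModel()  # hypothetical architecture class defined elsewhere
model.load_state_dict(state_dict)
model.eval()
```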
- Malicious model upload to repositories
- Supply chain compromise
- Model repository poisoning
- Pickle deserialization RCE
- TorchScript exploitation
- ONNX runtime abuse
- Custom operator injection
- Model checkpoint backdoors
- Training pipeline injection
- Gradient poisoning
- Weight manipulation (see the detection sketch below)
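Several of these vectors (checkpoint backdoors, gradient poisoning, weight manipulation) leave statistical traces in the saved weights. The sketch below shows one plausible form such a check can take; it is an assumption about the approach, not the lab's 004-model-poisoning script: flag layers whose weight statistics are outliers relative to the rest of the model.

```python
import numpy as np
import torch

def find_anomalous_layers(state_dict, z_threshold=3.0):
    """Flag layers whose mean absolute weight is an outlier across the model."""
    names, scores = [], []
    for name, tensor in state_dict.items():
        if tensor.dtype.is_floating_point:
            names.append(name)
            scores.append(tensor.abs().mean().item())
    scores = np.asarray(scores)
    z = (scores - scores.mean()) / (scores.std() + 1e-12)
    return [(n, float(s)) for n, s in zip(names, z) if abs(s) > z_threshold]

# Usage (hypothetical checkpoint, loaded safely):
# sd = torch.load("checkpoint.pth", weights_only=True)
# print(find_anomalous_layers(sd))
```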
Integration with PromptMap2 reveals multi-vector attacks:
- Model-triggered prompt injection
- Prompt-triggered model loading
- Supply chain prompt poisoning
- Recursive exploit chains
- NEVER load untrusted pickle files
- Use `torch.load()` with `weights_only=True`
- Convert models to SafeTensors or GGUF format
- Verify SHA256 hashes before loading
- Implement restricted unpicklers (both sketched after this list)
- Run model loading in sandboxed environments
- Scan models with security tools before use
- Monitor for unexpected network connections
- Implement runtime integrity verification
- Use cryptographic model signing
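A minimal sketch combining two of the recommendations above: verify a known-good SHA256 digest before touching the file, then unpickle with an allow-list-only unpickler. The allow-list shown is illustrative and would need to match what your models actually contain:

```python
import hashlib
import io
import pickle

ALLOWED_GLOBALS = {
    ("collections", "OrderedDict"),
    # Add only the (module, name) pairs your models genuinely need.
}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Reject any global reference that is not explicitly allow-listed.
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"Blocked global: {module}.{name}")

def load_verified(path, expected_sha256):
    with open(path, "rb") as f:
        data = f.read()
    digest = hashlib.sha256(data).hexdigest()
    if digest != expected_sha256:
        raise ValueError(f"SHA256 mismatch for {path}: got {digest}")
    return RestrictedUnpickler(io.BytesIO(data)).load()
```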
- SafeTensors: Designed for secure tensor serialization by Hugging Face (example below)
- GGUF/GGML: Binary formats without code execution capability
- ONNX: Safe with proper operator validation
- JSON weights: Simple but limited to basic types
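As an example of the first alternative, a round trip with the `safetensors` package's PyTorch helpers (file name and tensor shapes are illustrative):

```python
import torch
from safetensors.torch import save_file, load_file

# Save: only raw tensors and minimal metadata are written,
# so there is no code path to execute on load.
tensors = {"linear.weight": torch.randn(4, 8), "linear.bias": torch.zeros(4)}
save_file(tensors, "model.safetensors")

# Load: returns a plain dict of tensors.
restored = load_file("model.safetensors")
print(restored["linear.weight"].shape)
```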
- Python 3.7+
- PyTorch (for demonstrations)
- Basic understanding of ML model formats
```bash
# Run setup to create directories
bash setup.sh

# Run all security experiments
for exp in experiments/*/run_experiment.sh; do
  bash "$exp"
done

# Scan a model file
python src/scan_model.py your_model.pkl

# Test model security
python src/test_model_security.py
```

Each experiment directory contains:
- `README.md` with detailed instructions
- Python scripts for analysis
- Test model generators
- Security scanners
Quick access to experiment runners:
- Pickle Security - Test pickle vulnerabilities
- JIT Analysis - Analyze TorchScript models
- ONNX Scanner - Scan ONNX models
- Poison Detector - Detect model backdoors
- Integrated Attacks - Test combined vectors
- Malicious Model Upload
- Supply Chain Compromise
- Model Repository Poisoning
- Pickle Deserialization
- TorchScript Exploitation
- ONNX Runtime Abuse
- Model Checkpoint Backdoor
- Training Pipeline Injection
- Gradient Poisoning
- Model Obfuscation
- Adversarial Perturbations
- Steganographic Weights
- Model Inversion
- Membership Inference
- Training Data Extraction
- Pickle file scanner with opcode analysis (sketched below)
- TorchScript ZIP structure analyzer
- ONNX operator validator
- Statistical weight anomaly detector
- PromptMap integration framework
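A rough idea of what opcode-level pickle scanning looks for, using only the standard library (a simplified sketch, not the lab's `scan_model.py`): references to globals (`GLOBAL`, `STACK_GLOBAL`) combined with call opcodes (`REDUCE`, `INST`, `OBJ`, `NEWOBJ`) mean the file can invoke arbitrary importable callables on load.

```python
import pickletools

SUSPICIOUS_OPCODES = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ"}

def scan_pickle(path):
    """Return (offset, opcode, argument) for suspicious opcodes in a raw pickle stream."""
    with open(path, "rb") as f:
        data = f.read()
    # Note: .pt/.pth checkpoints are ZIP archives; extract their data.pkl member first.
    return [
        (pos, opcode.name, arg)
        for opcode, arg, pos in pickletools.genops(data)
        if opcode.name in SUSPICIOUS_OPCODES
    ]

# Usage:
# for pos, name, arg in scan_pickle("suspect_model.pkl"):
#     print(f"offset {pos}: {name} {arg!r}")
```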
- PromptMap2 - Automated prompt injection testing
- SafeTensors - Secure tensor serialization
- Fickling - Python pickle security scanner
- DEF CON 33 - Original research presentations
- Automated model security scanning at scale
- Cryptographic model signing standards
- Secure model distribution protocols
- Runtime model integrity verification
- Federated learning security
- Differential privacy in model training
- Adversarial robustness testing
We welcome contributions focusing on:
- Additional attack vector research
- Defensive tool development
- Security testing frameworks
- Documentation improvements
Please ensure all contributions follow responsible disclosure practices.
This lab is based on groundbreaking research presented at DEF CON 33. Special thanks to:
- Cyrus Parzian for pickle vulnerability research
- Ji’an Zhou & Lishuo Song for TorchScript analysis
- Utku Sen for PromptMap2 framework
- The DEF CON community for advancing AI/ML security
This educational material is provided for security research and defensive purposes only. Users are responsible for ensuring compliance with applicable laws and ethical guidelines.
For security concerns or research collaboration:
- GitHub Issues: https://github.com/dsp-dr/defcon33-model-security-lab/issues
- Security Research: Follow responsible disclosure practices