Agent-Assisted Vulnerability Triage

An LLM agent (Claude Haiku) achieves 92% precision@10 on vulnerability triage, outperforming CVSS (82%) but underperforming EPSS (100%). The key finding: an agent + EPSS ensemble reaches 98%, well above the agent alone, suggesting the two carry complementary signal, although the ensemble still trails EPSS alone.
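For readers unfamiliar with the metric, precision@10 is the fraction of the ten highest-ranked vulnerabilities that turn out to be true positives. The sketch below is illustrative only: the function name `precision_at_k`, the placeholder identifiers, and the ground-truth set are hypothetical and not taken from this repository, and what counts as a positive label is an assumption. See EXPERIMENTAL_DESIGN.md for the definition actually used.

```python
from typing import Dict, Set


def precision_at_k(scores: Dict[str, float], positives: Set[str], k: int = 10) -> float:
    """Fraction of the k highest-scored items that appear in `positives`."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Sort by descending score; ties are broken by ID so the ranking is deterministic.
    ranked = sorted(scores, key=lambda cve: (-scores[cve], cve))
    top_k = ranked[:k]
    hits = sum(1 for cve in top_k if cve in positives)
    return hits / len(top_k) if top_k else 0.0


# Toy data (hypothetical IDs and labels, not project results).
toy_scores = {"CVE-A": 0.9, "CVE-B": 0.8, "CVE-C": 0.4, "CVE-D": 0.1}
toy_positives = {"CVE-A", "CVE-C"}

print(precision_at_k(toy_scores, toy_positives, k=2))  # 0.5
```

With k=2 the top-ranked items are CVE-A and CVE-B, of which only CVE-A is a positive, so the result is 0.5.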

Blog post: Can an AI Agent Triage Vulnerabilities Better Than EPSS?

Key Results

Method	Precision@10	vs CVSS
EPSS	100%	+18pp
Agent + EPSS ensemble	98%	+16pp
Agent + exploit enrichment	94%	+12pp
Agent (Claude Haiku)	92%	+10pp
CVSS ranking	82%	baseline
Random	14%	—
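The table does not say how the agent and EPSS scores are combined. One common way to build such an ensemble is rank averaging, sketched below. This is an assumption chosen for illustration, not necessarily the method used in this project (see FINDINGS.md for that); the helper names `to_ranks` and `rank_average` and all scores are hypothetical.

```python
from typing import Dict, List


def to_ranks(scores: Dict[str, float]) -> Dict[str, int]:
    """Map each ID to its rank (1 = highest score); ties are broken by ID."""
    ordered = sorted(scores, key=lambda cve: (-scores[cve], cve))
    return {cve: rank for rank, cve in enumerate(ordered, start=1)}


def rank_average(agent: Dict[str, float], epss: Dict[str, float]) -> List[str]:
    """Order the IDs scored by both methods by their mean rank (lower is better)."""
    common = set(agent) & set(epss)
    agent_ranks = to_ranks({c: agent[c] for c in common})
    epss_ranks = to_ranks({c: epss[c] for c in common})
    mean_rank = {c: (agent_ranks[c] + epss_ranks[c]) / 2 for c in common}
    return sorted(common, key=lambda c: (mean_rank[c], c))


# Toy scores (hypothetical, not project data).
agent_scores = {"CVE-A": 0.7, "CVE-B": 0.9, "CVE-C": 0.2}
epss_scores = {"CVE-A": 0.95, "CVE-B": 0.30, "CVE-C": 0.50}

ensemble_order = rank_average(agent_scores, epss_scores)
print(ensemble_order)  # ['CVE-A', 'CVE-B', 'CVE-C']
```

Averaging ranks rather than raw scores sidesteps the fact that the two scores live on different scales. CVE-A ranks second for the agent and first for EPSS, so its mean rank of 1.5 puts it first overall.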

Quick Start

git clone https://github.com/rexcoleman/agent-vuln-triage
cd agent-vuln-triage
pip install -e .
bash reproduce.sh

Project Structure

FINDINGS.md # Research findings with pre-registered hypotheses and full results
EXPERIMENTAL_DESIGN.md # Pre-registered experimental design and methodology
HYPOTHESIS_REGISTRY.md # Hypothesis predictions, results, and verdicts
reproduce.sh # One-command reproduction of all experiments
governance.yaml # govML governance configuration
LICENSE # MIT License
pyproject.toml # Python project configuration
scripts/ # Experiment and analysis scripts
src/ # Source code
tests/ # Test suite
outputs/ # Experiment outputs and results
data/ # Data files and datasets
docs/ # Documentation and decision records

Methodology

See FINDINGS.md and EXPERIMENTAL_DESIGN.md for detailed methodology, pre-registered hypotheses, and full experimental results with multi-seed validation.

License

MIT License. Copyright 2026 Rex Coleman.

Governed by govML v3.3
