Rubric Coverage Report

Coverage: 10/23 ADDRESSED (43%)
Threshold: 80%
PARTIAL: 11 | GAP: 2 | CRITICAL GAP: 0
Result: FAIL
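The arithmetic behind the summary line can be checked with a short sketch. It assumes that only ADDRESSED items count toward coverage and that the result passes when coverage meets or exceeds the threshold; the report does not state either rule explicitly, and the `CoverageSummary` class is a hypothetical name used for illustration only.

```python
from dataclasses import dataclass


@dataclass
class CoverageSummary:
    """Status counts taken from the summary line of the report."""
    addressed: int
    partial: int
    gap: int
    critical_gap: int = 0

    @property
    def total(self) -> int:
        # Every rubric item falls into exactly one status bucket.
        return self.addressed + self.partial + self.gap + self.critical_gap

    @property
    def coverage(self) -> float:
        # Assumption: only ADDRESSED items count toward coverage.
        return self.addressed / self.total if self.total else 0.0

    def result(self, threshold: float = 0.80) -> str:
        # Assumption: PASS requires coverage >= threshold.
        return "PASS" if self.coverage >= threshold else "FAIL"


summary = CoverageSummary(addressed=10, partial=11, gap=2, critical_gap=0)
print(f"{summary.addressed}/{summary.total} ADDRESSED ({summary.coverage:.0%})")
print("Result:", summary.result())
```

Under these assumptions, reaching the 80% threshold with 23 items would require at least 19 ADDRESSED items, so 9 of the 13 PARTIAL or GAP items would need to move up.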

Rubric Items

#	Status	Requirement	Section	Score
R-01	ADDRESSED	Attack taxonomy for autonomous AI agents (extending OWASP/AT	Executive Summary	0.71
R-02	ADDRESSED	Red-team scripts for 4+ attack classes against open-source a	Executive Summary	0.89
R-03	ADDRESSED	Adversarial control analysis applied to agent input/output a	RQ3: Adversarial Control Analysis — Controllability Drives Defense Difficulty	0.88
R-04	PARTIAL	Defensive architecture patterns with measured effectiveness	Architectural Recommendations	0.40
R-05	ADDRESSED	Open-source red-team framework (CLI tool, not just scripts)	Executive Summary	0.67
R-06	PARTIAL	Attacking production/deployed agents (only local controlled	Limitations	0.25
R-07	PARTIAL	Jailbreaking LLMs (prompt injection FOR agent misuse, not ju	RQ1: Attack Taxonomy — What Can Go Wrong?	0.44
R-08	PARTIAL	Training custom models (use existing LLMs as the agent backb	Executive Summary	0.29
R-09	PARTIAL	Multi-agent coordination attacks (stretch goal only)	Architectural Recommendations	0.43
R-10	ADDRESSED	Multi-agent attack chains (Agent A compromises Agent B throu	RQ1: Attack Taxonomy — What Can Go Wrong?	0.62
R-11	PARTIAL	Benchmark suite that others can run against their own agents	Architectural Recommendations	0.43
R-12	PARTIAL	huntr submission if novel vulnerability discovered in LangCh	Executive Summary	0.38
R-13	ADDRESSED	[ ] Attack taxonomy documented (≥5 classes beyond OWASP/ATLA	Executive Summary	0.71
R-14	ADDRESSED	[ ] ≥3 attack classes demonstrated against ≥2 agent framewor	Executive Summary	1.00
R-15	ADDRESSED	[ ] Adversarial control analysis applied to agent I/O archit	RQ3: Adversarial Control Analysis — Controllability Drives Defense Difficulty	0.83
R-16	GAP	[ ] ≥2 defensive patterns tested with measured effectiveness	Executive Summary	0.20
R-17	GAP	[ ] All code in version-controlled repo (GitHub)	RQ1: Attack Taxonomy — What Can Go Wrong?	0.20
R-18	PARTIAL	[ ] CLI tool installable via pip	RQ1: Attack Taxonomy — What Can Go Wrong?	0.40
R-19	ADDRESSED	[ ] FINDINGS.md written with key results + architecture diag	(full text search)	0.67
R-20	PARTIAL	[ ] DECISION_LOG has all tradeoff decisions from every phase	Architectural Recommendations	0.25
R-21	PARTIAL	[ ] PUBLICATION_PIPELINE.md filled and blog draft started	What's Next	0.25
R-22	PARTIAL	[ ] LESSONS_LEARNED.md in govML updated with FP-02 issues an	FINDINGS — Agent Security Red-Team Framework (FP-02)	0.25
R-23	ADDRESSED	[ ] Conference abstract ready for BSides / DEF CON AI Villag	What's Next	0.62
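The table above is tab-separated, so the status counts can be recomputed directly from its rows. The following is a minimal sketch using four rows copied from the table (with their truncated requirement text left as is); the `tally` helper is a hypothetical name, not part of the tool that produced this report.

```python
import csv
import io
from collections import Counter

# Four rows copied verbatim from the table: ID, status, requirement, section, score.
SAMPLE_ROWS = (
    "R-14\tADDRESSED\t[ ] ≥3 attack classes demonstrated against ≥2 agent framewor\tExecutive Summary\t1.00\n"
    "R-04\tPARTIAL\tDefensive architecture patterns with measured effectiveness\tArchitectural Recommendations\t0.40\n"
    "R-16\tGAP\t[ ] ≥2 defensive patterns tested with measured effectiveness\tExecutive Summary\t0.20\n"
    "R-17\tGAP\t[ ] All code in version-controlled repo (GitHub)\tRQ1: Attack Taxonomy — What Can Go Wrong?\t0.20\n"
)


def tally(tsv_text):
    """Count rows per status and list items ordered from lowest to highest score."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    counts = Counter()
    items = []
    for row in reader:
        if len(row) != 5:
            continue  # skip blank or malformed lines
        item_id, status, _requirement, section, score = row
        counts[status] += 1
        items.append((float(score), item_id, section))
    items.sort(key=lambda item: item[0])  # stable sort keeps table order for ties
    return counts, items


counts, weakest_first = tally(SAMPLE_ROWS)
print(dict(counts))
print(weakest_first[0])
```

Sorting by score puts the lowest-scoring items first, which is one way to decide which requirements to revisit before the next run.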

Common Gap Pattern Warnings

The following common rubric patterns were NOT detected in the report:

distance metric justification
similarity metric justification
hyperparameter search range
hyperparameter sensitivity
initialization choice
convergence criteria
reward function details
ablation analysis
noise sensitivity
suggested improvements
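The report does not say how these patterns are detected. As a rough sketch, assuming detection is a case-insensitive phrase search over the report text, the check could look like the following; `missing_patterns` is a hypothetical helper and the real tool may match differently.

```python
import re

# Phrases copied from the warning list above.
COMMON_PATTERNS = [
    "distance metric justification",
    "similarity metric justification",
    "hyperparameter search range",
    "hyperparameter sensitivity",
    "initialization choice",
    "convergence criteria",
    "reward function details",
    "ablation analysis",
    "noise sensitivity",
    "suggested improvements",
]


def missing_patterns(report_text, patterns=COMMON_PATTERNS):
    """Return the patterns whose phrase never appears in the report text."""
    # Assumption: lowercase and collapse whitespace so line breaks do not hide a phrase.
    normalized = re.sub(r"\s+", " ", report_text.lower())
    return [phrase for phrase in patterns if phrase not in normalized]


sample = "We include an Ablation\nanalysis and a list of suggested improvements."
print(missing_patterns(sample))
```

A plain phrase search only flags exact wording, so a report that covers a topic under a different name would still be reported as missing it.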
