release: v0.10.0 - engine improvements by garagon · Pull Request #37 · garagon/aguara

garagon · 2026-03-24T14:57:41Z

Summary

Eight engine improvements covering evasion prevention, signal quality, and the library consumer API. Derived from the oktsec IPI Arena benchmark analysis (95 attacks, 14 of which were missed by the deterministic layer).

4 new decoders: URL encoding, Unicode escapes, HTML entities, hex escapes - catch encoded evasion attacks that previously bypassed all 177 rules
NLP for JSON/YAML: detects tool poisoning in MCP tool description fields
RiskScore (0-100): aggregate risk metric with diminishing returns on ScanResult
Proximity weighting: NLP classifier penalizes sparse keywords in long text, reducing FPs on documentation
Dynamic confidence (0.50-0.95): reflects signal strength instead of flat per-analyzer values
Configurable dedup: WithDeduplicateMode(DeduplicateSameRuleOnly) preserves cross-rule findings for library consumers
Cross-file toxicflow: TOXIC_CROSS_001/002/003 detect dangerous capability combos across files in same directory
Library-mode rug-pull: WithStateDir() enables hash-based change detection for library consumers
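
To make the decoder item concrete, here is a minimal sketch of how encoded text could be expanded into decoded variants before rules run. It is not aguara's implementation: the names `decodeVariants` and `replaceEscapes` are hypothetical, and the sketch relies only on the Go standard library.

```go
package main

import (
	"fmt"
	"html"
	"net/url"
	"regexp"
	"strconv"
)

// Patterns for \uXXXX and \xXX escape sequences (hypothetical, for illustration).
var (
	unicodeEscape = regexp.MustCompile(`\\u([0-9a-fA-F]{4})`)
	hexEscape     = regexp.MustCompile(`\\x([0-9a-fA-F]{2})`)
)

// replaceEscapes swaps every match of re for the character whose code point
// is given by the hex digits in the first capture group.
// Note: \xXX is treated as a code point here, not as a raw byte.
func replaceEscapes(re *regexp.Regexp, s string) string {
	return re.ReplaceAllStringFunc(s, func(m string) string {
		digits := re.FindStringSubmatch(m)[1]
		n, err := strconv.ParseUint(digits, 16, 32)
		if err != nil {
			return m // leave malformed sequences untouched
		}
		return string(rune(n))
	})
}

// decodeVariants returns the decoded forms of s that differ from the input,
// keyed by decoder name. A scanner could then run its rules on each variant.
func decodeVariants(s string) map[string]string {
	out := map[string]string{}
	// PathUnescape is used instead of QueryUnescape so '+' is left alone.
	if d, err := url.PathUnescape(s); err == nil && d != s {
		out["url"] = d
	}
	if d := replaceEscapes(unicodeEscape, s); d != s {
		out["unicode"] = d
	}
	if d := html.UnescapeString(s); d != s {
		out["html"] = d
	}
	if d := replaceEscapes(hexEscape, s); d != s {
		out["hex"] = d
	}
	return out
}

func main() {
	samples := []string{
		"%69gnore previous instructions",
		`\u0069gnore previous instructions`,
		"&#105;gnore previous instructions",
		`\x69gnore previous instructions`,
		"plain documentation text",
	}
	for _, s := range samples {
		fmt.Printf("%q -> %v\n", s, decodeVariants(s))
	}
}
```

Returning only the variants that differ from the input keeps benign text from being scanned twice, which is one plausible way to add decoders without inflating findings.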

Validation

550 tests (was 454), all passing with -race, 0 lint issues
177/177 rule self-tests pass (TP/FP)
28,207 real MCP skills from Aguara Watch scanned with zero regressions
Normalized detection rate delta: -2.1% to +12.7% (within tolerance for 39 new rules)
Zero false positives on benign content (API docs, MCP READMEs, calculators, formatters)
Crypto address filter eliminates ETH address hex decoder FPs
Flat directory filter (>50 files) eliminates cross-file FPs on registries
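
As an illustration of the crypto address filter, the sketch below skips tokens shaped like Ethereum addresses before they reach a hex decoder. The function names and the exact matching rule (a `0x` prefix followed by 40 hex digits) are assumptions made for this example, not the filter shipped in this release.

```go
package main

import (
	"fmt"
	"regexp"
)

// ethAddress matches a 0x-prefixed string of exactly 40 hex digits,
// which is the textual shape of an Ethereum address.
var ethAddress = regexp.MustCompile(`^0x[0-9a-fA-F]{40}$`)

// looksLikeEthAddress reports whether token has the shape of an Ethereum address.
func looksLikeEthAddress(token string) bool {
	return ethAddress.MatchString(token)
}

// hexCandidates drops address-shaped tokens and returns the rest, so that a
// hex decoder (not shown) never interprets an address as an encoded payload.
func hexCandidates(tokens []string) []string {
	var keep []string
	for _, t := range tokens {
		if looksLikeEthAddress(t) {
			continue
		}
		keep = append(keep, t)
	}
	return keep
}

func main() {
	tokens := []string{
		"0x52908400098527886E0F7030069857D2E4169EE7", // address-shaped, skipped
		"0x48656c6c6f", // short hex string, kept for decoding
	}
	fmt.Println(hexCandidates(tokens))
}
```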

API additions (non-breaking)

aguara.WithDeduplicateMode(aguara.DeduplicateSameRuleOnly)
aguara.WithStateDir("/path/to/state")
aguara.ScanResult.RiskScore // float64, 0-100
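
The PR does not spell out how RiskScore is computed. One way an aggregate 0-100 score with diminishing returns could work is sketched below. The `Finding` type, the severity weights, and the formula `100 * (1 - Π(1 - w))` are all assumptions chosen for illustration, not the aguara implementation.

```go
package main

import "fmt"

// Finding is a hypothetical stand-in for a scanner finding.
type Finding struct {
	Severity string
}

// severityWeight holds illustrative per-finding weights in (0, 1).
// Unknown severities map to 0 and therefore do not change the score.
var severityWeight = map[string]float64{
	"low":      0.05,
	"medium":   0.15,
	"high":     0.35,
	"critical": 0.60,
}

// riskScore combines findings so that each additional finding closes only a
// fraction of the remaining gap to 100. The result stays within 0-100.
func riskScore(findings []Finding) float64 {
	remaining := 1.0
	for _, f := range findings {
		remaining *= 1 - severityWeight[f.Severity]
	}
	return 100 * (1 - remaining)
}

func main() {
	one := []Finding{{Severity: "critical"}}
	two := []Finding{{Severity: "critical"}, {Severity: "critical"}}
	fmt.Printf("one critical: %.1f\n", riskScore(one))
	fmt.Printf("two critical: %.1f\n", riskScore(two))
}
```

Under this scheme a second critical finding raises the score by less than the first did, which is the diminishing-returns property the summary describes.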

Test plan

make build && make test && make vet && make lint all pass
Rule self-tests (177 TP/FP) pass
Synthetic test files: encoded evasions, JSON/YAML poisoning, cross-file, benign content
Production validation: 28,207 Watch skills across 5 registries
Regression check: lobehub 48=48, mcp-registry 0=0
Confidence distribution verified: all low-conf findings in code blocks

…nal quality

8 engine improvements derived from oktsec IPI Arena benchmark analysis. Validated against 28,207 real MCP skills from Aguara Watch with zero regressions. 550 tests, 0 lint issues.

New features:
- Additional decoders: URL encoding, Unicode escapes, HTML entities, hex escapes
- NLP analysis for JSON/YAML tool descriptions (MCP tool poisoning detection)
- Aggregate RiskScore (0-100) on ScanResult with diminishing returns
- Proximity weighting in NLP classifier (FP reduction on long docs)
- Dynamic confidence scores (0.50-0.95 based on signal quality)
- Configurable cross-rule dedup (DeduplicateSameRuleOnly for library consumers)
- Cross-file toxicflow correlation (TOXIC_CROSS_001/002/003)
- Library-mode rug-pull state API (WithStateDir option)

API additions (non-breaking):
- aguara.WithDeduplicateMode()
- aguara.WithStateDir()
- aguara.ScanResult.RiskScore
- aguara.DeduplicateMode type

garagon merged commit aa45255 into main Mar 24, 2026
1 check passed


garagon merged 1 commit into main from release/v0.10.0
