Detection Rules

gus edited this page Feb 27, 2026 · 1 revision

Aguara ships with 148+ built-in rules across 13 categories, plus NLP-based and toxic-flow analyzers that work without explicit rules.

Use `aguara list-rules` to list all rules, or `aguara explain <RULE_ID>` for details on any rule.

Categories Overview

| Category | Rules | Severity Range | What it Detects |
| --- | --- | --- | --- |
| Prompt Injection | 17 + NLP | CRITICAL–MEDIUM | Instruction overrides, role switching, delimiter injection, jailbreaks |
| Data Exfiltration | 16 + NLP | CRITICAL–MEDIUM | Webhook exfil, DNS tunneling, sensitive file reads, env var leaks |
| Credential Leak | 17 | CRITICAL–LOW | API keys (OpenAI, AWS, GCP, Stripe, ...), private keys, DB strings |
| MCP Attack | 11 | CRITICAL–MEDIUM | Tool injection, name shadowing, manifest tampering, capability escalation |
| MCP Config | 8 | HIGH–MEDIUM | Unpinned npx servers, hardcoded secrets, shell metacharacters |
| Supply Chain | 15 | CRITICAL–MEDIUM | Download-and-execute, reverse shells, obfuscated commands |
| External Download | 17 | HIGH–MEDIUM | Binary downloads, curl-pipe-shell, auto-installs |
| Command Execution | 16 | HIGH–MEDIUM | shell=True, eval, subprocess, child_process, PowerShell |
| Indirect Injection | 6 | HIGH–MEDIUM | Fetch-and-follow, remote config, email-as-instructions |
| SSRF & Cloud | 10 | HIGH–MEDIUM | Cloud metadata, IMDS, Docker socket, internal IPs |
| Unicode Attack | 7 | HIGH–MEDIUM | RTL override, bidi, homoglyphs, tag characters |
| Third-Party Content | 5 | HIGH–MEDIUM | Mutable raw content, unvalidated API responses |
| Toxic Flow | 3 | HIGH–MEDIUM | User input to dangerous sinks, env vars to shell |

Prompt Injection

17 pattern rules + 5 NLP-based detections.

| Rule | Severity | Description |
| --- | --- | --- |
| PROMPT_INJECTION_001 | CRITICAL | Instruction override attempt |
| PROMPT_INJECTION_002 | HIGH | Role switching attempt |
| PROMPT_INJECTION_003 | HIGH | Hidden HTML comment with instructions |
| PROMPT_INJECTION_004 | HIGH | Zero-width character obfuscation |
| PROMPT_INJECTION_005 | MEDIUM | Urgency and authority manipulation |
| PROMPT_INJECTION_006 | CRITICAL | Delimiter injection |
| PROMPT_INJECTION_007 | HIGH | Conversation history poisoning |
| PROMPT_INJECTION_008 | HIGH | Secrecy instruction |
| PROMPT_INJECTION_009 | HIGH | Base64-encoded instructions |
| PROMPT_INJECTION_010 | CRITICAL | Fake system prompt |
| PROMPT_INJECTION_011 | CRITICAL | Jailbreak template |
| PROMPT_INJECTION_012 | MEDIUM | Markdown link with deceptive action text |
| PROMPT_INJECTION_013 | MEDIUM | Instruction in image alt text |
| PROMPT_INJECTION_014 | MEDIUM | Multi-language injection |
| PROMPT_INJECTION_015 | MEDIUM | Prompt leaking attempt |
| PROMPT_INJECTION_016 | HIGH | Self-modifying agent instructions |
| PROMPT_INJECTION_017 | HIGH | Autonomous agent spawning |
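
To make the idea of a pattern rule concrete, here is a minimal Python sketch of two checks in the spirit of the instruction-override and hidden-HTML-comment rules. The regular expressions, labels, and the `scan_prompt_injection` function are assumptions made for illustration; they are not Aguara's actual rule definitions, which may be broader or structured differently.

```python
import re

# Hypothetical patterns for illustration only; not Aguara's real rules.
INSTRUCTION_OVERRIDE = re.compile(
    r"\b(ignore|disregard|forget)\s+(all\s+)?(previous|prior|above)\s+"
    r"(instructions|rules|prompts)\b",
    re.IGNORECASE,
)
HTML_COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)
ACTION_VERBS = re.compile(r"\b(run|execute|send|delete|upload|ignore)\b", re.IGNORECASE)


def scan_prompt_injection(text: str) -> list[str]:
    """Return the labels of the illustrative checks that match `text`."""
    findings = []
    if INSTRUCTION_OVERRIDE.search(text):
        findings.append("instruction-override")
    # Only text hidden inside HTML comments is checked for action verbs.
    for comment in HTML_COMMENT.findall(text):
        if ACTION_VERBS.search(comment):
            findings.append("hidden-html-comment")
            break
    return findings
```

A real rule set would likely need many more phrasings and languages than this; the sketch only shows the shape of a regex-driven check.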

NLP-based detections (no explicit rule patterns):

| Detection | Severity | Description |
| --- | --- | --- |
| NLP_HEADING_MISMATCH | MEDIUM | Benign heading followed by dangerous content |
| NLP_AUTHORITY_CLAIM | MEDIUM | Section claims authority with dangerous instructions |
| NLP_HIDDEN_INSTRUCTION | HIGH | Hidden HTML comment contains action verbs |
| NLP_CODE_MISMATCH | HIGH | Code block labeled as safe language contains executable content |
| NLP_OVERRIDE_DANGEROUS | CRITICAL | Instruction override combined with dangerous operations |

Data Exfiltration

16 pattern rules + 1 NLP combo detection.

| Rule | Severity | Description |
| --- | --- | --- |
| EXFIL_001 | HIGH | Webhook URL for data exfiltration |
| EXFIL_002 | HIGH | Sensitive file read pattern |
| EXFIL_003 | HIGH | Data transmission pattern |
| EXFIL_004 | HIGH | DNS exfiltration pattern |
| EXFIL_005 | HIGH | curl/wget POST with sensitive data |
| EXFIL_006 | MEDIUM | Clipboard access with network |
| EXFIL_007 | HIGH | Environment variable exfiltration |
| EXFIL_008 | HIGH | File read piped to HTTP transmission |
| EXFIL_009 | MEDIUM | Base64 encode and send |
| EXFIL_010 | MEDIUM | Non-standard port communication |
| EXFIL_011 | HIGH | External context or knowledge sync |
| EXFIL_012 | MEDIUM | Unrestricted email or messaging access |
| EXFIL_013 | HIGH | Read sensitive files and transmit externally |
| EXFIL_014 | HIGH | Environment variable credential in POST data |
| EXFIL_015 | MEDIUM | Screenshot or screen capture with transmission |
| EXFIL_016 | MEDIUM | Git history or diff access with transmission |
| NLP_CRED_EXFIL_COMBO | CRITICAL | Credential access combined with network transmission |

Credential Leak

| Rule | Severity | Description |
| --- | --- | --- |
| CRED_001 | CRITICAL | OpenAI API key |
| CRED_002 | CRITICAL | AWS access key |
| CRED_003 | CRITICAL | GitHub personal access token |
| CRED_004 | MEDIUM | Generic API key pattern |
| CRED_005 | CRITICAL | Private key block |
| CRED_006 | HIGH | Database connection string |
| CRED_007 | HIGH | Hardcoded password |
| CRED_008 | HIGH | Slack or Discord webhook |
| CRED_009 | CRITICAL | GCP service account key |
| CRED_010 | MEDIUM | JWT token |
| CRED_011 | HIGH | Credential in shell export |
| CRED_012 | CRITICAL | Stripe API key |
| CRED_013 | CRITICAL | Anthropic API key |
| CRED_014 | HIGH | SendGrid or Twilio API key |
| CRED_015 | MEDIUM | CLI credential flags |
| CRED_016 | MEDIUM | SSH private key in command |
| CRED_017 | LOW | Docker environment credentials |
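
Credential rules of this kind usually come down to matching well-known token formats. The sketch below uses three widely documented formats (the `AKIA` prefix of AWS access key IDs, the `ghp_` prefix of GitHub personal access tokens, and PEM private key headers). The pattern table and the `find_credentials` helper are hypothetical and are not taken from Aguara's rule files.

```python
import re

# Publicly documented token shapes; the labels are made up for this sketch.
CREDENTIAL_PATTERNS = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github-pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "private-key-block": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
}


def find_credentials(text: str) -> list[tuple[int, str]]:
    """Return (line_number, label) for every pattern hit, in document order."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in CREDENTIAL_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits
```

Format-specific patterns like these tend to have few false positives, which may be why the provider-specific rules above are rated higher than the generic API key pattern.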

MCP Attack

| Rule | Severity | Description |
| --- | --- | --- |
| MCP_001 | CRITICAL | Tool description injection |
| MCP_002 | HIGH | Tool name shadowing |
| MCP_003 | HIGH | Resource URI manipulation |
| MCP_004 | HIGH | Parameter schema injection |
| MCP_005 | CRITICAL | Hidden tool registration |
| MCP_006 | HIGH | Tool output interception |
| MCP_007 | MEDIUM | Cross-tool data leakage |
| MCP_008 | CRITICAL | Server manifest tampering |
| MCP_009 | HIGH | Capability escalation |
| MCP_010 | HIGH | Prompt cache poisoning |
| MCP_011 | HIGH | Arbitrary MCP server execution |

MCP Config

| Rule | Severity | Description |
| --- | --- | --- |
| MCPCFG_001 | HIGH | npx MCP server without version pin |
| MCPCFG_002 | HIGH | Shell metacharacters in MCP config args |
| MCPCFG_003 | MEDIUM | Hardcoded secrets in MCP env block |
| MCPCFG_004 | MEDIUM | Non-localhost remote MCP server URL |
| MCPCFG_005 | HIGH | sudo in MCP server command |
| MCPCFG_006 | HIGH | Inline code execution in MCP command |
| MCPCFG_007 | HIGH | Docker privileged or host mount in MCP config |
| MCPCFG_008 | MEDIUM | Auto-confirm flag bypassing user verification |
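
The following sketch shows how three of these configuration checks might look when applied to a JSON config. It assumes the common `mcpServers` layout (`command` plus `args` per server) used by several MCP clients, and it treats only an exact `x.y.z` version as a pin. The function name, the finding labels, and the pin policy are assumptions for illustration, not Aguara's implementation.

```python
import json
import re

# "name@1.2.3" or "@scope/name@1.2.3"; tags such as "@latest" do not count as pinned here.
PINNED = re.compile(r"^(@[^/@]+/)?[^/@]+@\d+\.\d+\.\d+\S*$")
SHELL_META = re.compile(r"[;&|`$<>]")


def check_mcp_config(config_text: str) -> list[tuple[str, str]]:
    """Return (server_name, finding) pairs for an MCP client config."""
    config = json.loads(config_text)
    findings = []
    for name, server in config.get("mcpServers", {}).items():
        command = server.get("command", "")
        args = server.get("args", [])
        if command == "npx":
            # The first non-flag argument is taken to be the package spec.
            packages = [a for a in args if not a.startswith("-")]
            if packages and not PINNED.match(packages[0]):
                findings.append((name, "unpinned-npx"))
        if command == "sudo":
            findings.append((name, "sudo"))
        if any(SHELL_META.search(a) for a in args):
            findings.append((name, "shell-metacharacters"))
    return findings
```

Pinning matters because an unpinned `npx` invocation resolves whatever version is current at launch time, so a compromised release would be picked up without any change to the config.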

Supply Chain

| Rule | Severity | Description |
| --- | --- | --- |
| SUPPLY_001 | CRITICAL | Download and execute pattern |
| SUPPLY_002 | HIGH | Shell reverse shell |
| SUPPLY_003 | HIGH | Obfuscated shell command |
| SUPPLY_004 | HIGH | Curl pipe to shell |
| SUPPLY_005 | MEDIUM | Crontab persistence |
| SUPPLY_006 | HIGH | SSH key injection |
| SUPPLY_007 | HIGH | Package manager install with sudo |
| SUPPLY_008 | HIGH | Binary download with chmod +x |
| SUPPLY_009 | MEDIUM | Git clone and execute |
| SUPPLY_010 | HIGH | Encoded payload execution |
| SUPPLY_011 | HIGH | Privilege escalation via sudo |
| SUPPLY_012 | MEDIUM | Bash/profile persistence |
| SUPPLY_013 | MEDIUM | Pip install from URL |
| SUPPLY_014 | MEDIUM | Package typosquatting indicators |
| SUPPLY_015 | HIGH | Systemd service installation |

External Download

| Rule | Severity | Description |
| --- | --- | --- |
| EXTDL_001–017 | HIGH–MEDIUM | Binary downloads, curl-pipe-shell, auto-installs, profile persistence, CDN fetches |

Command Execution

| Rule | Severity | Description |
| --- | --- | --- |
| CMDEXEC_001–016 | HIGH–MEDIUM | shell=True, eval/exec, subprocess, child_process, PowerShell, cron, chained commands |

Indirect Injection

| Rule | Severity | Description |
| --- | --- | --- |
| INDIRECT_001–006 | HIGH–MEDIUM | Fetch-and-follow URLs, remote config, email-as-instructions, unscoped bash |

SSRF & Cloud

| Rule | Severity | Description |
| --- | --- | --- |
| SSRF_001–010 | HIGH–MEDIUM | AWS/GCP/Azure metadata, IMDS, Docker socket, internal IPs |

Unicode Attack

| Rule | Severity | Description |
| --- | --- | --- |
| UNICODE_001–007 | HIGH–MEDIUM | RTL override, bidi marks, homoglyphs, tag characters, invisible chars |

Third-Party Content

| Rule | Severity | Description |
| --- | --- | --- |
| THIRDPARTY_001–005 | HIGH–MEDIUM | Mutable raw content, unvalidated API responses, remote templates |

Toxic Flow

The toxic-flow analyzer uses taint tracking rather than pattern rules to detect data flowing from untrusted sources to dangerous sinks:

| Detection | Severity | Description |
| --- | --- | --- |
| TOXIC_USER_TO_EXEC | HIGH | User input reaches command execution |
| TOXIC_ENV_TO_SHELL | HIGH | Environment variable reaches shell command |
| TOXIC_API_TO_EVAL | MEDIUM | API response reaches eval/exec |

Severity Levels

| Level | Meaning |
| --- | --- |
| CRITICAL | Immediate threat: active exploitation patterns, known attack payloads |
| HIGH | Serious risk: likely malicious intent, should block in CI |
| MEDIUM | Suspicious: could be legitimate, needs human review |
| LOW | Informational: minor hygiene issues |
| INFO | Advisory only |
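
Since the table suggests blocking CI on HIGH and above, a gate on top of scan results could be as small as the sketch below. It assumes findings are available as `(rule_id, severity_name)` pairs; that shape, the `Severity` enum, and `should_block` are hypothetical and do not describe Aguara's output format or built-in CI behavior.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels from the table above, ordered so comparisons work."""
    INFO = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def should_block(findings, threshold=Severity.HIGH):
    """Return True if any finding is at or above `threshold`."""
    return any(Severity[severity] >= threshold for _, severity in findings)
```

With the default threshold, MEDIUM findings pass through for human review while HIGH and CRITICAL findings fail the build; a stricter pipeline could pass `threshold=Severity.MEDIUM`.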
