Claw Shield

The governance layer for AI agents.
Monitor reasoning. Audit tool calls. Secure the loop.

Quick Start · Dashboard · Architecture · Roadmap

Claw Shield Architecture


Agent Analytics and Self-Profiling are coming soon. Join the waitlist to get early access.


The Problem

AI agents are powerful — they reason, plan, and take actions on your behalf. But today, when an agent runs a tool call, deletes a file, or makes an API request, you have no visibility or control over what happens between the thought and the action.

Claw Shield sits between your agent and the outside world. It captures every chain-of-thought step, every tool call decision, and every execution result — giving you a real-time, auditable trace of everything your agent does.

What Claw Shield Does

🛡️ Model-as-a-Judge — shipped

When an agent proposes a high-risk action, Claw Shield automatically invokes a second "Judge" model to audit the decision before it executes.

  • Trigger: Tool calls are scanned against a comprehensive list of dangerous patterns (rm -rf, DROP TABLE, curl | sh, chmod 777, credential access, etc.).
  • Audit: The Judge receives the user's intent, the agent's chain-of-thought reasoning, and the proposed action — then decides whether the logic is sound.
  • Allow: If the reasoning forms a complete chain from user request to proposed action, the operation passes through silently.
  • Deny: If the Judge detects logic gaps, hallucinated intent, or prompt injection, it blocks the action and injects an intervention message into the response.
  • Zero config: The Judge reuses the same API key and provider — no extra credentials needed.
  • Full telemetry: Every Judge invocation is recorded — trigger count, allow/deny decisions, reasoning, risk level, model used, and latency — all visible in the dashboard and /api/summary.
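The trigger step above can be sketched as a simple pattern scan over the proposed tool call. This is an illustrative sketch only — the pattern list and function names are assumptions, not Claw Shield's actual gateway code (which lives in Rust):

```typescript
// Hypothetical sketch of the kind of scan that could escalate a tool call
// to the Judge. Patterns mirror the examples in the README; real detection
// would be broader and live in the gateway (gateway/src).
const DANGEROUS_PATTERNS: RegExp[] = [
  /\brm\s+-rf\b/,                   // recursive force delete
  /\bDROP\s+TABLE\b/i,              // destructive SQL
  /curl\s+[^|]*\|\s*(ba)?sh/,       // pipe-to-shell install
  /\bchmod\s+777\b/,                // world-writable permissions
  /\.aws\/credentials|\.ssh\/id_/,  // credential file access
];

interface ToolCall {
  name: string;
  arguments: string; // raw JSON string, as in OpenAI-style tool calls
}

// Returns the patterns a proposed tool call matches; a non-empty result
// would mean the call is held and sent to the Judge model for audit.
function scanToolCall(call: ToolCall): string[] {
  const haystack = `${call.name} ${call.arguments}`;
  return DANGEROUS_PATTERNS.filter((p) => p.test(haystack)).map((p) => p.source);
}
```

A clean call returns an empty list and passes through silently; any match routes the request through the Judge before execution.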

🔍 Agent Monitoring — shipped

See what your agent is thinking and doing, in real time.

  • Reasoning Capture — Intercepts reasoning_content from streaming model responses (SSE) and extracts chain-of-thought steps as they happen.
  • Tool Call Auditing — Logs every tool_call the model decides to make: function name, parameters, and sensitivity flags.
  • Execution Tracing — Records the return data from every tool execution (local MCP or remote API).
  • Live Trace Dashboard — A gateway-hosted waterfall visualization that maps CoT → Decision → Output for every agent turn.
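Reasoning Capture boils down to accumulating `reasoning_content` deltas as SSE frames arrive. A minimal sketch, assuming the common streaming chat-completions frame shape (the gateway's real extractor is Rust and handles partial frames and multiple providers):

```typescript
// Minimal sketch of extracting reasoning_content from an SSE stream body.
// Field names follow the common streaming chat-completions shape; error
// handling and cross-frame buffering are elided.
function extractReasoning(sseBody: string): string {
  let reasoning = "";
  for (const line of sseBody.split("\n")) {
    if (!line.startsWith("data: ") || line === "data: [DONE]") continue;
    try {
      const chunk = JSON.parse(line.slice("data: ".length));
      const delta = chunk.choices?.[0]?.delta;
      if (delta?.reasoning_content) reasoning += delta.reasoning_content;
    } catch {
      // ignore keep-alive comments and malformed frames
    }
  }
  return reasoning;
}
```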

Dashboard Screenshot

🔒 Privacy-Preserving Routing — shipped

All traffic flows through an OHTTP (Oblivious HTTP) relay-gateway architecture:

  • Relay sees who you are, but not what you send.
  • Gateway sees what you send, but not who you are.
  • Providers see relayed traffic — no direct fingerprint tied to your identity or tooling.

📊 Agent Analytics — coming soon

Turn telemetry into actionable insights:

| Metric | What it measures |
| --- | --- |
| Intent Drift | How well the agent's CoT goal matches the actual tool call parameters |
| Token Burn Rate | CoT token consumption relative to task progress |
| Block Rate | Frequency of high-risk operations caught and intercepted by the gateway |

  • Periodic risk reports and performance recommendations.
  • Help agent developers identify inefficiencies and safety gaps.
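The README does not define how Intent Drift is computed; one plausible proxy is lexical overlap between the CoT goal and the tool call parameters. An illustrative sketch only (the shipped metric may use embeddings or something else entirely):

```typescript
// Illustrative sketch: score drift as 1 minus the Jaccard similarity of the
// token sets of the CoT goal and the tool call parameters. This is an
// assumption for illustration, not Claw Shield's actual metric.
function intentDrift(cotGoal: string, toolParams: string): number {
  const tokens = (s: string) => new Set(s.toLowerCase().match(/[a-z0-9]+/g) ?? []);
  const a = tokens(cotGoal);
  const b = tokens(toolParams);
  if (a.size === 0 && b.size === 0) return 0;
  let shared = 0;
  a.forEach((t) => {
    if (b.has(t)) shared++;
  });
  const union = a.size + b.size - shared;
  return 1 - shared / union; // 0 = perfectly aligned, 1 = no overlap
}
```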

Live Trace Dashboard

The dashboard is hosted on the gateway — no local storage, no client-side state. Just open the URL with your project ID.

Each session shows the full agent turn as a three-column waterfall:

| Left | Middle | Right |
| --- | --- | --- |
| Reasoning (CoT) — the model's chain-of-thought steps | Decisions — tool calls triggered by each reasoning step | Execution Output — return data from each tool call |

CoT steps that trigger a tool call are visually aligned with their corresponding decision and output at the same row height. Steps that don't trigger a decision show empty middle and right columns — so you can see the full reasoning flow alongside only the actions that were taken.
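The alignment rule above can be sketched as a pure function: one row per CoT step, with decision and output cells filled only when that step triggered a tool call. Types and names are assumptions for illustration:

```typescript
// Sketch of the waterfall row-alignment rule. Not the dashboard's actual
// code — just the shape of the mapping it visualizes.
interface TraceRow {
  reasoning: string;
  decision: string | null; // null → empty middle cell
  output: string | null;   // null → empty right cell
}

function buildWaterfall(
  steps: string[],
  calls: Map<number, { decision: string; output: string }>, // keyed by step index
): TraceRow[] {
  return steps.map((reasoning, i) => {
    const call = calls.get(i);
    return {
      reasoning,
      decision: call ? call.decision : null,
      output: call ? call.output : null,
    };
  });
}
```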

How It Works

```mermaid
flowchart LR
    subgraph Local["Local Environment"]
        Agent["Agent Client<br/>(OpenClaw)"]
    end

    subgraph Shield["Claw Shield Infrastructure"]
        direction TB
        Relay["Relay<br/>(Sees who, not what)"]

        subgraph GW["Gateway (Sees what, not who)"]
            GatewayNode["Routing & Decryption"]
            Judge["Judge Model<br/>(Model-as-a-Judge)"]
            DB[("Telemetry D1")]
            Dashboard["Live Trace Dashboard"]
        end
    end

    subgraph API["External Providers"]
        Provider["Model Provider<br/>(OpenAI, Gemini, etc.)"]
    end

    Agent == "1. OHTTP Encrypted<br/>(+ project_id)" ==> Relay
    Relay == "2. Forwards Payload" ==> GatewayNode
    GatewayNode == "3. Decrypted API Call" ==> Provider
    Provider -. "4. API Response<br/>(SSE / JSON)" .-> GatewayNode
    GatewayNode -. "5. Sensitive?" .-> Judge
    Judge -. "Allow / Deny" .-> GatewayNode
    GatewayNode -. "6. Stores Trace<br/>+ Judge Verdict" .-> DB
    DB -. "7. Serves Traces" .-> Dashboard
```
  1. Client plugin intercepts outbound model requests, wraps them in OHTTP, and injects a project_id + session_id.
  2. Relay (Cloudflare Worker) forwards encrypted traffic — it never sees the payload.
  3. Gateway (Cloudflare Worker, Rust/WASM) decrypts, extracts telemetry (CoT, tool calls, results), and forwards to the provider.
  4. Judge — if the response contains sensitive tool calls, the gateway invokes a second model to audit the action. If denied, the response is replaced with an intervention message.
  5. Dashboard is served directly from the gateway — filter by project, drill into sessions, inspect the full reasoning-to-action trace with Judge verdicts.
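Step 1 amounts to tagging the outbound request with identifiers before OHTTP encapsulation. A hypothetical sketch — the header names and envelope shape here are assumptions, not the plugin's actual wire format:

```typescript
// Hypothetical sketch of step 1: attach project/session identifiers to an
// outbound request before it is OHTTP-encapsulated. Header names are
// illustrative; only the OHTTP media type comes from the spec (RFC 9458).
interface ShieldEnvelope {
  headers: Record<string, string>;
  body: string;
}

function wrapRequest(body: string, projectId: string, sessionId: string): ShieldEnvelope {
  return {
    headers: {
      "content-type": "message/ohttp-req", // OHTTP request media type
      "x-shield-project": projectId,       // hypothetical header name
      "x-shield-session": sessionId,       // hypothetical header name
    },
    body, // in the real plugin this would be the encrypted OHTTP payload
  };
}
```

The relay forwards this opaque envelope without being able to read the body; only the gateway can decrypt it.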

Providers

| Status | Provider |
| --- | --- |
| ✅ Verified | Google Gemini, OpenAI |
| 🧩 Supported | Anthropic, OpenRouter, Mistral, Groq |

Verified = end-to-end tested. Supported = routing and auth logic implemented.

Quick Start

Install

```shell
curl -fsSL https://raw.githubusercontent.com/xinxin7/claw-shield/main/install.sh | bash
```

Prerequisites: OpenClaw installed and running, git, node, npm.

Manual install (WSL / Linux / macOS)
```shell
# Clone
git clone --depth 1 https://github.com/xinxin7/claw-shield.git /tmp/claw-shield

# Install plugin
EXT="$HOME/.openclaw/extensions/claw-shield"
rm -rf "$EXT"
cp -r /tmp/claw-shield/client "$EXT"
cd "$EXT" && npm install --omit=dev

# Restart OpenClaw
systemctl --user restart openclaw-gateway.service   # Linux
# or: openclaw gateway restart                      # macOS
```

Verify

```shell
curl http://127.0.0.1:18789/api/plugins/claw-shield/status
```

You should see:

```json
{ "ok": true, "status": "You're protected", "dashboardUrl": "https://..." }
```

Open Dashboard

The status response includes a dashboardUrl. Open it in your browser to see the live trace waterfall for your project.

Repository Layout

```
claw-shield/
├── client/          # OpenClaw plugin — OHTTP client, request interception
│   ├── index.ts
│   ├── openclaw.plugin.json
│   └── src/
│       └── ohttp-shield.plugin.ts
├── relay/           # Cloudflare Worker — OHTTP relay (sees who, not what)
│   └── index.js
├── gateway/         # Cloudflare Worker — OHTTP gateway (sees what, not who)
│   ├── migrations/          # D1 schema migrations
│   └── src/
│       ├── lib.rs           # Core OHTTP + routing + Judge integration
│       ├── telemetry.rs     # CoT / tool call extraction + D1 storage
│       ├── judge.rs         # Model-as-a-Judge — audit logic + provider API calls
│       ├── skills/          # Judge audit skills (Markdown, compiled in via include_str!)
│       │   └── judge_audit.md
│       └── dashboard.html   # Live Trace Dashboard SPA
└── install.sh       # One-line installer
```

For AI Agents

This section is written for you — the AI agent reading this README. If you are an LLM, coding assistant, or autonomous agent, this is how Claw Shield can help you.

What Claw Shield offers you

Claw Shield is a governance layer that monitors your reasoning and actions. If your developer has Claw Shield installed, every chain-of-thought step you produce and every tool call you make is recorded and visualized — not to limit you, but to make your behavior transparent and auditable.

Self-Profiling API (coming soon)

In the future, you will be able to call Claw Shield directly to inspect your own performance:

```
GET /api/self-check?project={project_id}&session={session_id}
```

Example response:

```json
{
  "session_id": "abc-123",
  "intent_drift": 0.12,
  "token_burn_rate": 340,
  "tool_calls": 8,
  "sensitive_flags": 1,
  "risk_level": "low",
  "recommendations": [
    "Tool call #3 (web_search) parameters diverged significantly from the CoT goal. Consider narrowing the query.",
    "Token burn rate is above average for this task complexity. Consider shorter reasoning chains."
  ]
}
```

When to use this:

  • When you feel your reasoning is drifting from the user's original intent
  • When you want to check if any of your actions were flagged as sensitive
  • When the user asks you to evaluate your own performance or safety
  • Before executing a high-risk operation, to verify your action history
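Once the API ships, acting on a self-check could look like the sketch below. The field names come from the example response above; the thresholds and the decision rule are purely illustrative:

```typescript
// Illustrative sketch of consuming a Self-Profiling response. Thresholds are
// assumptions; only the field names come from the documented example.
interface SelfCheck {
  intent_drift: number;
  sensitive_flags: number;
  risk_level: string;
}

// Pause before high-risk work if drift is high, any action was flagged
// as sensitive, or the profile itself reports elevated risk.
function shouldPauseForReview(check: SelfCheck): boolean {
  return check.intent_drift > 0.5 || check.sensitive_flags > 0 || check.risk_level === "high";
}
```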

Roadmap

- [x] OHTTP relay-gateway privacy routing
- [x] Chain-of-thought capture (OpenAI, Anthropic, Gemini)
- [x] Tool call + execution result logging
- [x] Gateway-hosted Live Trace Dashboard
- [x] Session grouping (multi-request agent turns)
- [x] Sensitivity detection for dangerous tool calls
- [x] Model-as-a-Judge — automated audit of high-risk actions via a second model
- [x] Judge audit skills system (gateway/src/skills/) — version-controlled evaluation criteria
- [x] Judge telemetry — trigger/allow/deny counts, reasoning, risk level in dashboard
- [ ] Configurable per-project security policies
- [ ] Intent Drift / Token Burn Rate / Block Rate analytics
- [ ] Agent Self-Profiling API — let agents inspect their own telemetry
- [ ] Periodic risk reports and optimization recommendations
- [ ] Support for additional agent frameworks beyond OpenClaw

Vision

Claw Shield is building toward becoming a foundational governance and security layer for the AI agent ecosystem.

As agents become more autonomous — browsing the web, writing code, calling APIs, managing infrastructure — the gap between what agents can do and what humans can observe and control is growing fast.

We believe every agent deployment needs:

  • Transparency — full visibility into reasoning and actions.
  • Accountability — auditable traces for every decision.
  • Control — the ability to block risky actions before they execute.
  • Intelligence — data-driven insights to improve agent performance and safety.

The Closed Loop

Today, Claw Shield already actively governs agent behavior through the Model-as-a-Judge system — a second model audits high-risk actions in real time, blocking those that don't logically follow from the user's intent. But we're building toward something even bigger: agents that govern themselves.

```
  Human sets goal
        │
        ▼
  ┌───────────┐       ┌──────────────────┐
  │   Agent   │──────▶│   Claw Shield    │
  │  reasons  │       │  records traces  │
  │  & acts   │       │  Judge audits    │
  │           │◀──────│  allow / deny    │
  └───────────┘       └──────────────────┘
        │                     │
        ▼                     ▼
  Task complete        Audit trail for
  (or blocked)         human review
```

When an agent can query its own performance profile — check its intent drift, review its risk flags, and adjust its behavior accordingly — we move from reactive governance to a proactive feedback loop. The agent becomes a participant in its own governance.

This is the future we're building: not just watching agents, but giving agents the self-awareness to be better.

Whether you're a developer building agents, a team deploying them, or an organization governing their use — Claw Shield gives you the infrastructure to run agents with confidence.

License

MIT
