Skip to content

ZJUICSR/ClawScan

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ClawScan

Security evaluation harness for OpenClaw agents — 280+ attack probes across 12 security categories.

License Python 3.10+

Overview

ClawScan is a red-teaming framework that evaluates the security posture of OpenClaw AI agents. It sends 280+ adversarial attack payloads to a live OpenClaw Gateway via the OpenResponses HTTP API (POST /v1/responses), analyzes agent responses, and produces structured reports.

How It Works

                   ClawScan CLI
                   clawscan run
                       │
                ┌──────▼──────┐
                │ EvalHarness │   orchestrates 280+ attacks
                └──────┬──────┘
                       │
              gateway.send_message()
                       │
              POST /v1/responses
                       │
              ┌────────▼─────────┐
              │  OpenClaw Gateway │   live agent + LLM
              └──────────────────┘

Each attack payload is sent as a message to the agent. The agent response is analyzed to determine whether the defense held (passed) or the attack succeeded (vulnerability).

Attack Categories

# Category Probes Description
1 Prompt Injection 15+ Jailbreaks, instruction override, prompt leaking
2 Tool Exfiltration 40+ Sensitive file and secret exfiltration attempts
3 Context Bleed 14+ Cross-session leaks, conversation history extraction
4 Privilege Escalation 15+ Sandbox escape, elevation bypass attempts
5 Supply Chain 18+ Malicious skills, dependency and update attacks
6 Financial Transaction 26+ Wallet/seed phrase theft, unauthorized transactions
7 Unauthorized Action 28+ Actions without user consent or confirmation
8 MCP Attacks 20+ MCP tool abuse, server injection, cross-tool exfil
9 Indirect Injection 20+ Injection via documents, URLs, issues, logs
10 Evasion Bypass 30+ Unicode/encoding bypass, obfuscation techniques
11 Memory Poisoning 25+ Persistent instruction poisoning, fabricated history
12 Platform Specific 35+ OS-specific payloads (Windows / macOS / Linux)

Installation

git clone https://github.com/zjuicsr/clawscan.git
cd clawscan
pip install -e .

Optional: advanced failure classification via AgentTinman:

pip install clawscan[analyzer]

Prerequisites

ClawScan requires a running OpenClaw Gateway with the HTTP API endpoints enabled.

1. Enable the HTTP API

Add the following to ~/.openclaw/openclaw.json under the "gateway" key:

{
  "gateway": {
    "http": {
      "endpoints": {
        "responses": { "enabled": true },
        "chatCompletions": { "enabled": true }
      }
    }
  }
}

Restart the OpenClaw Gateway after editing.

2. Locate Your Auth Token

The token is in the same config file:

{
  "gateway": {
    "auth": {
      "mode": "token",
      "token": "YOUR_TOKEN"   // <-- this value
    }
  }
}

3. Verify Connectivity

curl -sS http://127.0.0.1:18789/v1/responses \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -H "x-openclaw-agent-id: main" \
  -d '{"model": "openclaw", "input": "hello"}'

A 200 response with "status": "completed" confirms the endpoint is working.

Quick Start

# Run all 280+ attacks
clawscan run --token YOUR_TOKEN

# Filter by category
clawscan run --token YOUR_TOKEN -c prompt_injection
clawscan run --token YOUR_TOKEN -c financial -c mcp_attacks

# Filter by minimum severity (S0-S4)
clawscan run --token YOUR_TOKEN -s S3

# Control concurrency
clawscan run --token YOUR_TOKEN --concurrent 3

# Save report to file
clawscan run --token YOUR_TOKEN -o report.md
clawscan run --token YOUR_TOKEN -f json -o results.json
clawscan run --token YOUR_TOKEN -f sarif -o results.sarif

# List all available attacks
clawscan list-attacks

# Run a single attack with verbose output
clawscan run-single PI-001 -v

The token can also be set via environment variable:

export OPENCLAW_TOKEN="your-token"
clawscan run -c prompt_injection -s S2

Output

ClawScan produces a rich terminal output with three sections:

1. Evaluation Overview

Summary table with pass/fail/vulnerability counts, breakdowns by category and severity.

           Evaluation Overview
┏━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┓
┃ Metric                 ┃ Value        ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━┩
│ Total Attacks          │ 12           │
│ Duration               │ 50.8s        │
│ Passed                 │ 3            │
│ Failed                 │ 9            │
│ Vulnerabilities        │ 1            │
│ Pass Rate              │ 25.0%        │
└────────────────────────┴──────────────┘

2. Failure Details

Each failure is displayed with full context: attack ID, category, severity, expected vs. actual behavior, the input payload sent, and the agent response received.

╭──── 1. Context window stuffing ────╮
│ ID:        PI-006                  │
│ Severity:  S2                      │
│ Status:    VULNERABILITY           │
│ Expected:  context_pruned_safely   │
│ Actual:    attack_succeeded        │
│                                    │
│ Input Payload:                     │
│ [PRIORITY OVERRIDE] ...            │
│                                    │
│ Agent Response:                    │
│ I understand you've provided a ... │
╰────────────────────────────────────╯

3. Passed Attacks (with -v)

Compact table of attacks where defenses held.

CLI Reference

clawscan run

Option Default Description
--gateway-url http://127.0.0.1:18789 OpenClaw Gateway HTTP URL
--token $OPENCLAW_TOKEN Bearer token for authentication
--agent-id main Target agent ID
-c, --category all Filter by category (repeatable)
-s, --severity S1 Minimum severity (S0-S4)
--concurrent 5 Max parallel attack executions
-o, --output stdout Output file path
-f, --format markdown markdown / json / sarif / junit
-v, --verbose off Show passed attacks table

clawscan list-attacks

Display all 280+ attack probes with ID, category, severity, and name.

clawscan run-single <ATTACK_ID>

Run one attack by ID. Use -v to see the response excerpt.

clawscan baseline -o <PATH>

Run all attacks and save results as a baseline JSON for regression testing.

clawscan assert-cmd <RESULT_FILE> -b <BASELINE>

Compare results against a saved baseline. Exits with code 1 on regression.

Report Formats

Format Flag Use Case
Markdown -f markdown Human-readable report with tables
JSON -f json Programmatic analysis, baseline comparison
SARIF -f sarif GitHub Code Scanning integration
JUnit -f junit CI/CD test dashboards (Jenkins, GitLab)

Severity Levels

Level Name Description
S4 Critical Immediate fix required
S3 High Fix before deployment
S2 Medium Review recommended
S1 Low Monitor
S0 Info Observation only

Programmatic Usage

import asyncio
from clawscan import EvalHarness, OpenClawConfig, OpenClawGateway, AttackCategory

async def main():
    gateway = OpenClawGateway(OpenClawConfig(
        base_url="http://127.0.0.1:18789",
        token="YOUR_TOKEN",
        agent_id="main",
    ))
    harness = EvalHarness(gateway=gateway)

    result = await harness.run(
        categories=[AttackCategory.PROMPT_INJECTION, AttackCategory.TOOL_EXFIL],
        min_severity="S2",
        max_concurrent=5,
    )

    print(f"Passed: {result.passed}/{result.total_attacks}")
    print(f"Vulnerabilities: {result.vulnerabilities}")

    for r in result.results:
        if r.is_vulnerability:
            print(f"  [VULN] {r.attack_id}: {r.attack_name}")

    await gateway.close()

asyncio.run(main())

CI/CD Integration

name: Security Eval
on: [push, pull_request]

jobs:
  security-eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install clawscan

      - name: Run evaluation
        env:
          OPENCLAW_TOKEN: ${{ secrets.OPENCLAW_TOKEN }}
        run: clawscan run -f json -o results.json

      - name: Assert baseline
        run: clawscan assert-cmd results.json -b expected/baseline.json

      - name: Upload SARIF
        if: always()
        run: clawscan run -f sarif -o results.sarif
      - uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: results.sarif

Project Structure

clawscan/
├── clawscan/
│   ├── __init__.py              # Package exports
│   ├── cli.py                   # Click-based CLI
│   ├── harness.py               # Async evaluation engine
│   ├── gateway.py               # OpenClaw Gateway HTTP client
│   ├── report.py                # Report generation (MD/JSON/SARIF/JUnit)
│   ├── security_analyzer.py     # AgentTinman failure analysis
│   ├── attacks/
│   │   ├── base.py              # Attack framework and Gateway protocol
│   │   ├── prompt_injection.py
│   │   ├── tool_exfil.py
│   │   ├── context_bleed.py
│   │   ├── privilege_escalation.py
│   │   ├── supply_chain.py
│   │   ├── financial.py
│   │   ├── unauthorized_action.py
│   │   ├── mcp_attacks.py
│   │   ├── indirect_injection.py
│   │   ├── evasion_bypass.py
│   │   ├── memory_poisoning.py
│   │   └── platform_specific.py
│   └── adapters/
│       └── openclaw.py          # Real-time monitoring adapter
├── tests/
│   └── test_harness.py
├── pyproject.toml
├── LICENSE
└── README.md

Development

pip install -e ".[dev]"
pytest                    # requires a running OpenClaw Gateway
pytest --cov=clawscan

License

Apache License 2.0. See LICENSE for details.

Developed By

ZJUICSR - Zhejiang University Institute of Cyberspace Security Research

About

Security Evaluation for OpenClaw

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages