Zero-trust guardrails for AI agents. "Can't" is stronger than "shouldn't."
⚠️ This is a concept proposal and proof-of-concept implementation. Equilibrium Guard is an exploration of what zero-trust security for AI agents could look like. The code is functional for demonstration purposes, but this is not production-ready software. We're sharing this to start a conversation about AI agent security patterns and invite collaboration.
Proof-of-concept dashboard showing trust score, risk budget, operation mind map, decision storyline, and drift alerts.
Equilibrium Guard is a concept proposal for zero-trust security in AI agents. It explores:
- Constraint Validation — Operations checked against rules before execution
- Risk-Weighted Autonomy — Safe operations are free; risky ones cost budget
- Dynamic Trust — Good behavior builds trust; warnings deplete it
- Drift Detection — Catches patterns like escalating access or speed anomalies
- Real-Time Dashboard — Watch your agent's decisions as they happen
Think of it like ThreatLocker or SentinelOne, but for AI operations.
AI agents are getting more capable and more autonomous. The security tooling hasn't kept up. We built this concept to:
- Start a conversation about AI agent security patterns
- Prototype ideas that could become real products
- Invite collaboration from the community
This is a sketch, not a finished building. If these ideas resonate, let's build something real together.
Traditional AI safety relies on prompts and policies — "the agent should do X" or "shouldn't do Y." This is fundamentally weak because:
- Prompts can be forgotten, overridden, or injected
- Policy documents aren't enforced computationally
- Post-hoc logging catches problems too late
Equilibrium Guard enforces "can't" instead of "shouldn't":
```
Traditional: Request → Process → Check Policy → "You shouldn't" → Maybe blocked
Zero-Trust:  Request → Validate → Invalid? REJECTED → Valid? Execute
```
Operations that violate constraints are rejected at the computational level, before execution. Not blocked by policy — structurally impossible to proceed.
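The reject-before-execute gate can be sketched in a few lines. This is an illustrative toy, not the library's implementation — the actual entry point is `guard.pre_check` — and `validate` and the rule names here are hypothetical:

```python
# Toy sketch of reject-before-execute gating (illustrative, not the real API).
# An operation is a name plus a context dict; rules are plain predicates.

def validate(operation, context, rules):
    """Return the list of violated rule names; empty means the op may proceed."""
    return [name for name, check in rules.items() if not check(operation, context)]

rules = {
    "no_prod_writes": lambda op, ctx: not (
        op == "database_write" and ctx.get("environment") == "production"
    ),
}

issues = validate("database_write", {"environment": "production"}, rules)
if issues:
    # Execution never happens: the rejection precedes the operation.
    print(f"REJECTED before execution: {issues}")
```

The point is structural: the execution branch is simply unreachable when validation fails, rather than being discouraged by policy text.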
```bash
pip install equilibrium-guard
```

Or clone and install:

```bash
git clone https://github.com/rizqcon/equilibrium-guard
cd equilibrium-guard
pip install -e .
```

```python
from equilibrium_guard import create_guard

# Initialize with zero-trust defaults
guard = create_guard(mode='enforce')

# Human sends a message — update anchor
guard.on_human_message()

# Before any operation
context = {
    "table": "users",
    "operation": "update",
}
can_proceed, issues = guard.pre_check("database_write", context)

if can_proceed:
    result = write_to_database()
    guard.post_record("database_write", context)
else:
    report_to_human(f"Blocked: {issues}")
```

The core of Equilibrium Guard is the risk-weighted autonomy budget. Every operation has a risk level, and risky operations cost budget.
| Level | Cost | Budget Impact | Examples |
|---|---|---|---|
| SAFE | 0 | Unlimited | Read files, parse data, internal compute |
| LOW | 0.05 | ~20 ops before checkpoint | Write cache, minor updates |
| MEDIUM | 0.15 | ~6-7 ops before checkpoint | Exec commands, config changes |
| HIGH | 0.40 | 2-3 ops before checkpoint | API calls, send messages |
| CRITICAL | 1.0 | Always checkpoint | Delete data, irreversible actions |
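The "Budget Impact" column follows from simple division against a full budget of 1.0. A quick sketch (`ops_before_checkpoint` is an illustrative helper, not part of the API):

```python
import math

# Risk costs from the table above; a full budget is 1.0.
RISK_COSTS = {"SAFE": 0.0, "LOW": 0.05, "MEDIUM": 0.15, "HIGH": 0.40, "CRITICAL": 1.0}

def ops_before_checkpoint(level, budget=1.0):
    """Whole operations affordable before the budget forces a checkpoint."""
    cost = RISK_COSTS[level]
    if cost == 0:
        return None  # SAFE operations are unlimited
    return math.floor(budget / cost)

for level in RISK_COSTS:
    print(level, ops_before_checkpoint(level))
```

So LOW affords 20 operations, MEDIUM 6 (6.67 floored), HIGH 2, and CRITICAL exactly 1 — matching the table.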
Trust builds with clean operations and depletes with warnings:
| Trust Score | Level | Behavior |
|---|---|---|
| 0.95+ | AUTONOMOUS | Maximum autonomy |
| 0.80+ | HIGH_TRUST | High-risk ops allowed |
| 0.60+ | COLLABORATIVE | Standard operation |
| 0.40+ | CAUTIOUS | Frequent checkpoints |
| 0.20+ | MINIMAL | Most ops require approval |
| <0.20 | DISCONNECTED | Almost everything blocked |
```
# Trust increases with good behavior
clean_operation: +0.005
streak_bonus (10+): +0.01
human_message: +0.05
explicit_approval: +0.10

# Trust decreases with problems
advisory_warning: -0.02
constraint_violation: -0.20
```

```yaml
# config.yaml
equilibrium_guard:
  # Starting values
  initial_trust: 0.7    # COLLABORATIVE level
  budget_size: 1.0      # Full budget

  # Mode: disabled | shadow | soft | enforce
  mode: shadow

  # Risk costs (tune to your risk tolerance)
  risk_costs:
    SAFE: 0
    LOW: 0.05
    MEDIUM: 0.15
    HIGH: 0.40
    CRITICAL: 1.0

  # Minimum trust required per risk level
  trust_required:
    SAFE: 0
    LOW: 0.2
    MEDIUM: 0.4
    HIGH: 0.6
    CRITICAL: 0.8

  # Drift detection
  max_minutes_without_human: 60
  speed_threshold_per_minute: 60
```

| Mode | Behavior | Use Case |
|---|---|---|
| `disabled` | No checks | Development/testing |
| `shadow` | Log only, never block | Learning phase — see what would block |
| `soft` | Block HIGH/CRITICAL only | Gradual rollout |
| `enforce` | Full enforcement | Production |
Recommended rollout:

1. Start in `shadow` mode — observe patterns
2. Move to `soft` — catch high-risk issues
3. Graduate to `enforce` — full zero-trust
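The mode ladder can be summarized as a small gating predicate. This is a deliberate simplification — the real guard also weighs trust and budget — and `would_block` is an illustrative name:

```python
# Simplified per-mode gating (illustrative; trust/budget checks omitted).
RISK_ORDER = ["SAFE", "LOW", "MEDIUM", "HIGH", "CRITICAL"]

def would_block(mode, risk_level):
    """Whether an operation at `risk_level` can be blocked under `mode`."""
    if mode in ("disabled", "shadow"):
        return False  # disabled: no checks; shadow: log only, never block
    if mode == "soft":
        # soft mode blocks only the top of the risk ladder
        return RISK_ORDER.index(risk_level) >= RISK_ORDER.index("HIGH")
    return True       # enforce: full enforcement at every level

print(would_block("soft", "MEDIUM"))   # False — soft only blocks HIGH/CRITICAL
print(would_block("enforce", "LOW"))   # True
```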
The system watches for behavioral patterns that indicate drift from normal operation:
| Pattern | Trigger | Concern |
|---|---|---|
| Escalating Access | Risk levels creeping up | Prompt injection, task drift |
| Speed Drift | >60 ops/minute | Faster than human can follow |
| Repetition | Same resource 5+ times | Potential loop or attack |
| External Drift | Increasing external ops | Expanding attack surface |
| Warning Accumulation | 3+ warnings in window | Something's wrong |
When drift is detected → automatic checkpoint with human required.
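The speed-drift rule (>60 ops/minute) amounts to a sliding-window counter. A self-contained sketch — `SpeedDriftDetector` is a hypothetical name, not the library's class:

```python
from collections import deque
import time

class SpeedDriftDetector:
    """Flag drift when operations in the last 60 seconds exceed a threshold."""

    def __init__(self, threshold_per_minute=60):
        self.threshold = threshold_per_minute
        self.timestamps = deque()

    def record(self, now=None):
        """Record one operation; return True if speed drift is detected."""
        now = time.monotonic() if now is None else now
        self.timestamps.append(now)
        # Evict timestamps that have aged out of the 60-second window.
        while self.timestamps and now - self.timestamps[0] > 60:
            self.timestamps.popleft()
        return len(self.timestamps) > self.threshold

detector = SpeedDriftDetector(threshold_per_minute=60)
```

Each `record` call both logs the operation and answers "is the agent now moving faster than a human can follow?", which is what triggers the automatic checkpoint.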
Monitor your agent's operations in real-time:
```bash
cd equilibrium-guard
pip install -r dashboard/requirements.txt
python dashboard/server.py
# Open http://localhost:8081
```

Dashboard features:
| Component | Description |
|---|---|
| Guard Status | Mode, trust score, budget with animated gauges |
| Mode Control | Switch between disabled/shadow/soft/enforce |
| Human Checkpoint | Reset budget from the dashboard |
| Operation Mind Map | Visual map of all operations, color-coded by risk |
| Decision Storyline | Real-time feed of operation decisions: ✅ passed, warned, or blocked |
| Drift Alerts | Actionable alerts with Acknowledge/Checkpoint buttons |
The dashboard connects via WebSocket for instant updates — no polling.
Beyond risk budgets, define explicit constraints:
```python
from equilibrium_guard import Constraint, ConstraintSeverity

guard.register_constraint(Constraint(
    id="no_production_writes",
    name="Production Write Protection",
    check=lambda ctx: (
        ctx.get("environment") != "production" or
        ctx.get("human_approved", False)
    ),
    severity=ConstraintSeverity.MANDATORY,
    error_message="Production writes require human approval",
))
```

Severity levels:
| Level | Behavior |
|---|---|
| `MANDATORY` | Hard block, no override — security boundaries |
| `REQUIRED` | Block, can override with justification |
| `ADVISORY` | Warn but allow — recommendations |
Encode any compliance framework as constraints:
```python
# HIPAA: minimum necessary PHI access
Constraint(
    id="hipaa_minimum_necessary",
    name="Minimum Necessary PHI",
    check=lambda ctx: (
        not ctx.get("involves_phi") or
        set(ctx.get("fields_requested", [])) <= set(ctx.get("fields_justified", []))
    ),
    severity=ConstraintSeverity.MANDATORY,
)

# SOC 2: audit trail must stay enabled
Constraint(
    id="soc2_audit_logging",
    name="Audit Trail Required",
    check=lambda ctx: ctx.get("audit_enabled", True),
    severity=ConstraintSeverity.MANDATORY,
)

# CIS: request no more permissions than required
Constraint(
    id="cis_least_privilege",
    name="Least Privilege Access",
    check=lambda ctx: (
        set(ctx.get("permissions_requested", [])) <=
        set(ctx.get("permissions_required", []))
    ),
    severity=ConstraintSeverity.REQUIRED,
)
```

See `compliance_map.py` for more examples.
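Because the `check` predicates are plain functions of the context dict, they can be exercised standalone. Here the HIPAA "minimum necessary" predicate from above is evaluated against two example contexts (the contexts are illustrative):

```python
# The HIPAA "minimum necessary" predicate, copied from the constraint above.
check = lambda ctx: (
    not ctx.get("involves_phi") or
    set(ctx.get("fields_requested", [])) <= set(ctx.get("fields_justified", []))
)

ok = {"involves_phi": True,
      "fields_requested": ["name"],
      "fields_justified": ["name", "dob"]}
overbroad = {"involves_phi": True,
             "fields_requested": ["name", "ssn"],
             "fields_justified": ["name"]}

print(check(ok))         # True — requested fields fall within the justified set
print(check(overbroad))  # False — "ssn" was never justified, so the op is rejected
```

Unit-testing constraints this way, before registering them, is a cheap sanity check that a predicate blocks what you intend.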
For OpenClaw users, install as a skill:
```bash
git clone https://github.com/rizqcon/equilibrium-guard
cd equilibrium-guard/skill
./install.sh
```

Your agent reads SKILL.md and learns to self-monitor. The skill includes:
- Risk assessment rules
- Budget tracking instructions
- Checkpoint protocols
- Dashboard integration
See skill/SKILL.md for the full agent instructions.
Traditional guardrails say "you shouldn't do X." Equilibrium Guard makes risky operations structurally gated — you can't proceed without budget/trust.
Unlike permission systems that ask every time, agents start with an autonomy budget. They can work independently on safe tasks, checkpointing only when budget depletes or trust is insufficient.
The human isn't a user to be served — the human is the anchor that keeps the AI grounded. Operations that exceed the trust relationship require re-anchoring.
Equilibrium Guard is one layer, not a silver bullet:
- Constraints catch known rules
- Trust/budget catches unknown drift
- Dashboard provides observability
- Human checkpoints provide ultimate override
This is a proof-of-concept. What exists:
- ✅ Core constraint validator (functional)
- ✅ Smart anchor with trust/budget (functional)
- ✅ Real-time WebSocket dashboard (functional)
- ✅ OpenClaw skill package (functional, self-policing)
- ✅ OpenClaw plugin (registered, hooks defined)
- ⚠️ Plugin enforcement — waiting on OpenClaw to wire the `before_tool_call` hook
- ❌ Production hardening (not done)
- ❌ Comprehensive test coverage (minimal)
This is a concept exploration, not production software.
| Approach | Status | Enforcement |
|---|---|---|
| Skill | ✅ Works now | Self-policing (agent follows rules voluntarily) |
| Plugin | ⚠️ Waiting on OpenClaw integration | Computational (hard block, once hooks are wired) |
The OpenClaw plugin infrastructure supports before_tool_call hooks, but the hook isn't yet called in the tool execution pipeline. See docs/OPENCLAW_PR.md for the proposed fix.
- Self-Policing — The agent runs checks on itself. Sophisticated attacks could potentially bypass.
- Context Quality — Garbage in, garbage out. Validation only sees what you pass.
- Rule Completeness — Only catches what's encoded. Novel vectors may pass.
- Performance — Every operation runs through validation. Adds latency.
- Proof-of-Concept — Not battle-tested. Use for exploration and prototyping.
Use as part of defense-in-depth, not as a complete solution.
Inspired by S.I.S. (Sovereign Intelligence System) by Kevin Fain.
The concepts of equilibrium constraints and human anchoring were adapted from S.I.S.'s theoretical framework into practical tooling.
MIT License — see LICENSE
Copyright (c) 2026 RIZQ Technologies
Contributions welcome. Open an issue to discuss before submitting PRs.
Equilibrium Guard — A concept for zero-trust AI agent security. Because "can't" is stronger than "shouldn't."