Agent Sentinel is a high-performance reverse proxy that governs LLM agents. It sits in front of Gemini, OpenAI, or Anthropic, enforcing multi-tenant spend limits, tracking cost with streaming-aware accounting, and detecting semantic loops via a dedicated embedding sidecar. OpenTelemetry tracing/metrics are wired in for both the proxy and the sidecar.
- Decoupled Intelligence: Unlike monolithic proxies, Agent Sentinel offloads heavy vector inference to a gRPC sidecar communicating over Unix Domain Sockets (UDS). This prevents ONNX runtime overhead from competing with the proxy's networking stack.
- Semantic Guardrails: Traditional rate limiters fail when agents repeat logic with different words. We use semantic similarity (KNN) search via Redis VSS to detect behavioral loops and inject system hints to break the cycle.
- Streaming-Aware Accounting: Built for the modern LLM stack, handling TTFT (Time to First Token) metrics and automated refunds/adjustments for partial stream failures.
- Resiliency (Fail-Open): Implements a 350ms P99 latency budget for guardrails. If the sidecar or vector store times out, the system degrades gracefully to preserve agent availability while maintaining financial rails. Sketches of the TTFT and fail-open paths follow this list.
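
To make the streaming accounting concrete, here is a minimal sketch of capturing TTFT by wrapping the provider's streamed response body. The package, type, and constructor names are illustrative assumptions, not the project's actual API:

```go
// Hypothetical sketch: recording TTFT (time to first token) by wrapping a
// streamed provider response body. The first byte observed approximates TTFT.
package streaming

import (
	"io"
	"time"
)

type ttftReader struct {
	inner io.Reader
	start time.Time
	ttft  *time.Duration
	seen  bool
}

// NewTTFTReader wraps body; *ttft is written once, when the first byte arrives.
func NewTTFTReader(body io.Reader, ttft *time.Duration) io.Reader {
	return &ttftReader{inner: body, start: time.Now(), ttft: ttft}
}

func (r *ttftReader) Read(p []byte) (int, error) {
	n, err := r.inner.Read(p)
	if n > 0 && !r.seen {
		r.seen = true
		*r.ttft = time.Since(r.start)
	}
	return n, err
}
```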
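
And a minimal sketch of the fail-open budget, assuming a hypothetical `Checker` interface standing in for the loop-detection client; the real client and error handling live in the proxy:

```go
// Hypothetical sketch: enforcing the 350ms guardrail budget and failing open.
package guardrail

import (
	"context"
	"time"
)

// Checker stands in for the loop-detection client (illustrative interface).
type Checker interface {
	LoopCheck(ctx context.Context, tenantID, prompt string) (blocked bool, err error)
}

const budget = 350 * time.Millisecond

// CheckOrFailOpen runs the guardrail under a hard deadline. Any error,
// including context.DeadlineExceeded from the sidecar or vector store,
// degrades to "allow" so agent traffic keeps flowing.
func CheckOrFailOpen(ctx context.Context, c Checker, tenantID, prompt string) bool {
	ctx, cancel := context.WithTimeout(ctx, budget)
	defer cancel()

	blocked, err := c.LoopCheck(ctx, tenantID, prompt)
	if err != nil {
		return false // fail open; spend rails are enforced separately
	}
	return blocked
}
```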
- Proxy: rate limiting (Redis), cost tracking/refunds, TTFT/stream duration metrics, goroutine/runtime gauges, provider HTTP tracing/metrics.
- Loop detection: gRPC over UDS to an embedding sidecar (ONNX MiniLM), Redis VSS store with HNSW, mean-pooled embeddings, configurable thresholds. Sketches of the spend counter, UDS dial, and KNN query follow this list.
- Telemetry: OTLP tracing/metrics ready for the provided collector; default dashboard notes in docs/METRICS_NOTES.md.
- Docker: compose stack for proxy, Redis, Redis Stack (VSS), embedding sidecar, and OTel collector.
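
A minimal sketch of the per-tenant spend rail, assuming a Redis float counter; the key layout and limit source are illustrative, not the project's actual schema:

```go
// Hypothetical sketch: a per-tenant spend counter in Redis. Key layout and
// limit source are assumptions; refunds reuse the same increment, negated.
package spend

import (
	"context"

	"github.com/redis/go-redis/v9"
)

// Charge adds a request's estimated cost to the tenant's running total and
// reports whether the tenant has now crossed its limit.
func Charge(ctx context.Context, rdb *redis.Client, tenant string, usd, limitUSD float64) (bool, error) {
	total, err := rdb.IncrByFloat(ctx, "spend:"+tenant, usd).Result()
	if err != nil {
		return false, err
	}
	return total > limitUSD, nil
}

// Refund credits back cost for a partial stream failure.
func Refund(ctx context.Context, rdb *redis.Client, tenant string, usd float64) error {
	return rdb.IncrByFloat(ctx, "spend:"+tenant, -usd).Err()
}
```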
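
A minimal sketch of the sidecar connection, assuming a recent grpc-go (v1.63+ for `grpc.NewClient`); the helper and package names are illustrative:

```go
// Hypothetical sketch: dialing the embedding sidecar over a Unix Domain
// Socket. grpc-go resolves "unix://" targets natively, so sidecar calls
// never touch the TCP stack.
package sidecar

import (
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

// Dial expects an absolute socket path, e.g. the LOOP_EMBEDDING_SIDECAR_UDS
// value used by the full-stack integration test below.
func Dial(socketPath string) (*grpc.ClientConn, error) {
	// Plaintext is acceptable: the socket is local and filesystem-permissioned.
	return grpc.NewClient(
		"unix://"+socketPath,
		grpc.WithTransportCredentials(insecure.NewCredentials()),
	)
}
```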
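
And a sketch of the KNN lookup against Redis VSS using a raw FT.SEARCH through go-redis; the index name, field names, and K are assumptions, not the project's schema:

```go
// Hypothetical sketch: KNN similarity search against a Redis VSS (HNSW)
// index of prompt embeddings, via a raw FT.SEARCH.
package loopstore

import (
	"context"
	"encoding/binary"
	"math"

	"github.com/redis/go-redis/v9"
)

// float32LE serializes an embedding as little-endian float32s, the binary
// format Redis VSS expects for vector query parameters.
func float32LE(vec []float32) []byte {
	buf := make([]byte, 4*len(vec))
	for i, v := range vec {
		binary.LittleEndian.PutUint32(buf[i*4:], math.Float32bits(v))
	}
	return buf
}

// Nearest returns the raw reply for the 5 most similar stored prompts.
func Nearest(ctx context.Context, rdb *redis.Client, embedding []float32) (interface{}, error) {
	return rdb.Do(ctx,
		"FT.SEARCH", "idx:prompts",
		"*=>[KNN 5 @embedding $vec AS score]",
		"PARAMS", "2", "vec", float32LE(embedding),
		"SORTBY", "score",
		"DIALECT", "2",
	).Result()
}
```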
- Environment (.env):

```
GEMINI_API_KEY=...
OPENAI_API_KEY=...
ANTHROPIC_API_KEY=...
TARGET_API=gemini # or "openai" or "anthropic"
MODEL_URL=https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/resolve/main/onnx/model.onnx
MODEL_SHA256=6fd5d72fe4589f189f8ebc006442dbb529bb7ce38f8082112682524616046452
```
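
MODEL_SHA256 pins the ONNX model downloaded from MODEL_URL. A minimal sketch of the integrity check, assuming the sidecar (or a fetch script) verifies before loading; the function and package names are illustrative:

```go
// Hypothetical sketch: verifying the downloaded ONNX model against
// MODEL_SHA256 before it is loaded.
package model

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
)

// Verify streams the file through SHA-256 and compares the pinned hex digest.
func Verify(path, wantHex string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return err
	}
	if got := hex.EncodeToString(h.Sum(nil)); got != wantHex {
		return fmt.Errorf("model checksum mismatch: got %s, want %s", got, wantHex)
	}
	return nil
}
```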
- Bring up the stack:

```sh
docker compose up -d --build
```
- Call the proxy (examples):

Gemini:

```sh
curl -X POST http://localhost:8080/v1beta/models/gemini-2.0-flash:generateContent \
  -H "Content-Type: application/json" \
  -H "X-Tenant-ID: demo-tenant" \
  -d '{ "contents": [{ "parts": [{ "text": "Say hello in 3 languages" }] }] }'
```

OpenAI:

```sh
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-Tenant-ID: demo-tenant" \
  -d '{ "model": "gpt-4o-mini", "messages": [{ "role": "user", "content": "Say hello in 3 languages" }] }'
```

Anthropic:

```sh
curl -X POST http://localhost:8080/v1/messages \
  -H "Content-Type: application/json" \
  -H "X-Tenant-ID: demo-tenant" \
  -d '{ "model": "claude-3-5-haiku-latest", "max_tokens": 1024, "messages": [{ "role": "user", "content": "Say hello in 3 languages" }] }'
```
- Unit and integration tests:

```sh
go test ./...
```
- Proxy integration (stubbed sidecar/provider, requires Redis):

```sh
REDIS_URL_INTEGRATION=redis://localhost:6379 go test ./internal/integration -count=1
```
- Full-stack integration (real sidecar over UDS, stub provider):

```sh
RUN_FULLSTACK_SIDECAR=1 \
LOOP_EMBEDDING_SIDECAR_UDS=/Users/gaurangpatel/Documents/dev/agent-sentinel/.sockets/embedding-sidecar.sock \
REDIS_URL_INTEGRATION=redis://localhost:6379 \
go test ./internal/integration -count=1
```
- Proxy usage, curl flows, and testing: docs/PROXY_USAGE.md
- Metrics reference: docs/METRICS_NOTES.md
- Embedding sidecar details: embedding-sidecar/models/README.md