sancho

Data-structure-powered tooling for LLM agents.

sancho is a Rust workspace that augments coding agents with classical advanced data structures. The primary product is sancho-mcp, an MCP server that gives Copilot (and any MCP-compatible agent) deduplication, context trimming, claim verification, pattern lookup, and session metrics — all backed by battle-tested data structures from Brass's Advanced Data Structures.

Why sancho?

Problem	sancho solution	Data structure
Agent re-explores files it already saw	`check_seen` / `cache_response` dedup	Cuckoo filter + Count-Min Sketch
Context window fills up with stale text	`trim_context` removes low-value tokens	Count-Min Sketch (frequency)
Agent claims "code does X" without checking	`register_claim` / `verify_claim` pipeline	Compressed Trie + Persistent RB-tree
Repeated pattern searches across files	`find_pattern` for O(m) lookup	Dynamic Suffix Tree (Ukkonen)
No checkpoint/rollback across agent turns	`checkpoint` / session versioning	Persistent Red-Black Tree
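To make the frequency-estimation row concrete, here is a minimal Count-Min Sketch in std-only Rust. It is an illustrative sketch, not the `sancho-core` implementation: the type name, the `keep_frequent` helper, the default sizes, and the use of `DefaultHasher` are all assumptions made for the example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Minimal Count-Min Sketch: `depth` rows of `width` counters.
/// Hypothetical type for illustration; not the sancho-core API.
pub struct CountMinSketch {
    width: usize,
    rows: Vec<Vec<u32>>,
}

impl CountMinSketch {
    pub fn new(depth: usize, width: usize) -> Self {
        assert!(depth > 0 && width > 0);
        Self { width, rows: vec![vec![0; width]; depth] }
    }

    /// Column for `item` in row `row`; the row index seeds the hash.
    fn column(&self, row: usize, item: &str) -> usize {
        let mut hasher = DefaultHasher::new();
        row.hash(&mut hasher);
        item.hash(&mut hasher);
        (hasher.finish() as usize) % self.width
    }

    /// Record one occurrence of `item` in every row.
    pub fn add(&mut self, item: &str) {
        for row in 0..self.rows.len() {
            let col = self.column(row, item);
            self.rows[row][col] = self.rows[row][col].saturating_add(1);
        }
    }

    /// Estimated count: never below the true count, but hash
    /// collisions can make it higher.
    pub fn estimate(&self, item: &str) -> u32 {
        (0..self.rows.len())
            .map(|row| self.rows[row][self.column(row, item)])
            .min()
            .unwrap_or(0)
    }
}

/// Hypothetical trimming policy: keep only tokens whose estimated
/// frequency is at least `min_count`.
pub fn keep_frequent<'a>(tokens: &[&'a str], min_count: u32) -> Vec<&'a str> {
    let mut sketch = CountMinSketch::new(4, 256);
    for token in tokens {
        sketch.add(token);
    }
    tokens
        .iter()
        .copied()
        .filter(|t| sketch.estimate(t) >= min_count)
        .collect()
}
```

Because the estimate can only overcount, a policy like `keep_frequent` may keep a rare token by mistake but will not drop a token that really met the threshold.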

Architecture

┌──────────────────────────────────────────────────┐
│  Agent (Copilot / any MCP client)                │
│  ↕ stdio JSON-RPC 2.0                           │
├──────────────────────────────────────────────────┤
│  sancho-mcp                                      │
│  ┌────────────┐ ┌──────────┐ ┌────────────────┐ │
│  │ Dedup tools│ │ Trim/Find│ │ Claim/Contract │ │
│  │  (Bloom +  │ │ (CountMin│ │ Verification   │ │
│  │  CMS)      │ │ + Suffix)│ │ (Trie + RBTree)│ │
│  └────────────┘ └──────────┘ └────────────────┘ │
├──────────────────────────────────────────────────┤
│  sancho-core  (zero I/O, no async, pure DS)      │
└──────────────────────────────────────────────────┘
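As a rough illustration of what crosses the stdio boundary in the diagram, the sketch below builds one JSON-RPC 2.0 request for the `check_seen` tool. The `tools/call` method name and the newline-delimited framing are taken from the MCP specification rather than from this README, the `input_hash` argument mirrors the Python client example, and the function names are hypothetical. Real code would more likely use a JSON library such as serde_json.

```rust
/// Escape a string so it can sit inside a JSON string literal.
fn json_escape(raw: &str) -> String {
    let mut out = String::with_capacity(raw.len());
    for ch in raw.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build one newline-terminated `tools/call` request.
/// Hypothetical helper; the argument schema is assumed, not confirmed.
pub fn tools_call_frame(id: u64, tool: &str, input_hash: &str) -> String {
    let body = format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"tools/call","params":{{"name":"{}","arguments":{{"input_hash":"{}"}}}}}}"#,
        id,
        json_escape(tool),
        json_escape(input_hash)
    );
    // One message per line: the frame must not contain raw newlines,
    // which json_escape guarantees for the interpolated strings.
    body + "\n"
}
```

A client would write the returned string to the server's stdin and read one line of response from its stdout.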

Quickstart

Prerequisites

Rust stable (MSRV: 1.80)

Build and test

cargo test --workspace

Run the MCP server

cargo run -p sancho-mcp

Wire into VS Code / Copilot

This repo ships .vscode/mcp.json — open the workspace and the MCP server is auto-discovered.

Or add to your editor's MCP config:

{
  "servers": {
    "sancho": {
      "type": "stdio",
      "command": "cargo",
      "args": ["run", "-p", "sancho-mcp"]
    }
  }
}

Workspace crates

Crate	Purpose	Publish
`sancho-core`	Pure data structures (suffix tree, sketches, filters, persistent trees)	✅ crates.io
`sancho-mcp`	MCP server — 14 tools over stdio JSON-RPC 2.0	✅ crates.io
`sancho-proxy`	Ollama-compatible inference proxy (reference architecture)	internal
`sancho-cli`	Proxy binary entrypoint	internal
`sancho-candle`	Experimental Candle inference runner (research)	internal

Core data structures

All implementations cite the relevant chapter from Brass, Advanced Data Structures (Cambridge University Press):

Cuckoo / Counting Bloom filter — [Brass Ch 11] — probabilistic membership
Count-Min Sketch — [Brass Ch 11] — frequency estimation
Compressed Trie — [Brass Ch 8.1] — Patricia / compressed prefix tree
Persistent Red-Black Tree — [Brass Ch 7.2] — fully persistent ordered map
Dynamic Suffix Tree — [Brass Ch 8.4] — Ukkonen's online construction
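The idea behind the persistent ordered map can be sketched with path copying: an insert rebuilds only the nodes on the search path and shares every other subtree with the previous version. The example below is a deliberately simplified, unbalanced binary search tree, so it omits the red-black rebalancing that the real structure performs. All names are hypothetical and not taken from `sancho-core`.

```rust
use std::cmp::Ordering;
use std::rc::Rc;

/// Immutable node; subtrees are shared between versions via `Rc`.
struct Node {
    key: String,
    value: String,
    left: Option<Rc<Node>>,
    right: Option<Rc<Node>>,
}

/// One version of the map. Cloning copies only the root pointer.
#[derive(Clone, Default)]
pub struct PersistentMap {
    root: Option<Rc<Node>>,
}

impl PersistentMap {
    pub fn new() -> Self {
        Self::default()
    }

    /// Return a new version with `key` set; `self` is left unchanged.
    pub fn insert(&self, key: &str, value: &str) -> Self {
        Self { root: Some(insert_node(&self.root, key, value)) }
    }

    pub fn get(&self, key: &str) -> Option<&str> {
        let mut current = self.root.as_deref();
        while let Some(node) = current {
            match key.cmp(node.key.as_str()) {
                Ordering::Less => current = node.left.as_deref(),
                Ordering::Greater => current = node.right.as_deref(),
                Ordering::Equal => return Some(node.value.as_str()),
            }
        }
        None
    }
}

/// Copy the nodes along the search path; reuse everything else.
fn insert_node(node: &Option<Rc<Node>>, key: &str, value: &str) -> Rc<Node> {
    let Some(n) = node else {
        return Rc::new(Node {
            key: key.to_string(),
            value: value.to_string(),
            left: None,
            right: None,
        });
    };
    let (mut left, mut right) = (n.left.clone(), n.right.clone());
    let mut new_value = n.value.clone();
    match key.cmp(n.key.as_str()) {
        Ordering::Less => left = Some(insert_node(&n.left, key, value)),
        Ordering::Greater => right = Some(insert_node(&n.right, key, value)),
        Ordering::Equal => new_value = value.to_string(),
    }
    Rc::new(Node { key: n.key.clone(), value: new_value, left, right })
}
```

Holding on to an older `PersistentMap` value is what makes rollback cheap: after `let v1 = PersistentMap::new().insert("turn", "1");` and `let v2 = v1.insert("turn", "2");`, `v1.get("turn")` still returns `Some("1")`.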

Every data structure has property-based tests written with proptest. Where applicable, implementations also use NEON SIMD acceleration on Apple Silicon.

MCP tools (14 total)

Tool	What it does
`check_seen`	Dedup check — has the agent seen this input before?
`cache_response`	Store a response for future dedup hits
`trim_context`	Remove low-frequency tokens to fit context window
`find_pattern`	Suffix-tree pattern search in O(m) time, where m is the pattern length
`classify_task`	Route task to appropriate handler via trie
`checkpoint`	Save/restore session state (persistent RB-tree)
`register_claim`	Declare what a piece of code or a tool does
`register_contract`	Define constraints a claim must satisfy
`record_observed_effects`	Log runtime side effects as evidence
`ingest_trace_summary`	Import execution trace as evidence
`verify_claim`	Compare claim against contract + evidence
`explain_mismatch`	Human-readable explanation of verification failures
`set_rollout_mode`	Control tool activation policy
`session_stats`	Session-level metrics and hit rates
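One way to picture the claim, contract, and evidence tools working together is as set comparisons over effect strings. The sketch below is only a guess at the shape of that check: the `Mismatch` variants, the function names, and the use of plain strings for effects are assumptions, and the actual `verify_claim` and `explain_mismatch` logic may differ.

```rust
use std::collections::BTreeSet;

/// Hypothetical mismatch kinds; not sancho-mcp's actual schema.
#[derive(Debug, PartialEq, Eq)]
pub enum Mismatch {
    /// The claim declares an effect the contract does not allow.
    ForbiddenByContract(String),
    /// The claim declares an effect that no evidence supports.
    NotObserved(String),
    /// Evidence shows an effect the claim never declared.
    Undeclared(String),
}

/// Convenience constructor for an effect set.
pub fn effects(items: &[&str]) -> BTreeSet<String> {
    items.iter().map(|s| s.to_string()).collect()
}

/// Compare a claim against a contract and the observed evidence.
/// An empty result means the claim is consistent with both.
pub fn verify(
    claimed: &BTreeSet<String>,
    allowed: &BTreeSet<String>,
    observed: &BTreeSet<String>,
) -> Vec<Mismatch> {
    let mut out = Vec::new();
    for effect in claimed.difference(allowed) {
        out.push(Mismatch::ForbiddenByContract(effect.clone()));
    }
    for effect in claimed.difference(observed) {
        out.push(Mismatch::NotObserved(effect.clone()));
    }
    for effect in observed.difference(claimed) {
        out.push(Mismatch::Undeclared(effect.clone()));
    }
    out
}

/// Turn a mismatch into a short human-readable sentence.
pub fn explain(mismatch: &Mismatch) -> String {
    match mismatch {
        Mismatch::ForbiddenByContract(e) => format!("claimed effect `{e}` is not allowed by the contract"),
        Mismatch::NotObserved(e) => format!("claimed effect `{e}` was never observed"),
        Mismatch::Undeclared(e) => format!("observed effect `{e}` was not declared in the claim"),
    }
}
```

Keeping the three comparisons separate means each failure can be reported with its own explanation instead of a single pass or fail flag.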

Observer backends (runtime verification)

The verification pipeline supports multiple evidence backends:

inproc (recommended): unprivileged, adapter-supplied side effects
dtrace (optional): macOS-only, privileged local diagnostics
dry-run: synthetic evidence for CI and smoke testing

# In-process observation (default)
python3 scripts/mcp_observer_pipeline.py \
  --observer-backend inproc \
  --claim-id claim-1 \
  --contract-id contract-1 \
  --inproc-effect file.open:/tmp/out.txt

# Capture effects from a running command
python3 scripts/inproc_observe_command.py \
  --run-pipeline --emit-spawn-effect \
  -- python3 your_script.py

Language adapter helpers for Python, TypeScript, and Node.js live in scripts/inproc_adapters/.
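The `--inproc-effect file.open:/tmp/out.txt` value shown above looks like an effect kind and a target joined by a colon. Assuming that reading is right (the README does not spell out the format), an adapter could split such strings as in this sketch. The `Effect` type and `parse_effect` function are hypothetical.

```rust
/// Hypothetical parsed form of an effect string such as
/// `file.open:/tmp/out.txt`.
#[derive(Debug, PartialEq, Eq)]
pub struct Effect {
    pub kind: String,
    pub target: String,
}

/// Split at the first colon so targets that contain colons stay intact.
/// Returns `None` when either side is missing.
pub fn parse_effect(raw: &str) -> Option<Effect> {
    let (kind, target) = raw.split_once(':')?;
    if kind.is_empty() || target.is_empty() {
        return None;
    }
    Some(Effect {
        kind: kind.to_string(),
        target: target.to_string(),
    })
}
```

Splitting only at the first colon is a design choice: it keeps targets such as URLs or Windows paths in one piece.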

Python client

import asyncio
from sancho_py import SanchoClient

async def main() -> None:
    async with SanchoClient() as client:
        result = await client.call_tool("check_seen", {"input_hash": "abc123"})

asyncio.run(main())

See docs/python-adapter.md for full documentation.

Stability and release policy

Versioning follows SemVer.
Breaking changes only in major versions.
Release notes in CHANGELOG.md.

Security

Please report security issues via GitHub Security Advisories. See SECURITY.md for details.

Contributing

See CONTRIBUTING.md and CODE_OF_CONDUCT.md.

License

MIT — see LICENSE-MIT.
