Python realization. Clean API, in-process Python sandbox, and the most readable code medium examples.
This is the Python realization of the cantrip spec. It was generated from SPEC.md after the TypeScript reference implementation stabilized, then refined interactively as the spec evolved through v0.2 and v0.3. It implements the full domain model — cantrip, entity, circle, gates, wards, mediums, loom — in idiomatic Python with minimal dependencies.
For the full vocabulary and behavioral rules, see SPEC.md at the repo root.
```sh
cd py
pip install -e .        # or: uv pip install -e .
cp .env.example .env    # add your API key
```

Run the simplest meaningful example:

```sh
python examples/patterns/04_cantrip.py
```

Run all examples in scripted mode (no API key needed):

```sh
uv run pytest tests/test_grimoire_examples.py -q
```

```python
from cantrip import Cantrip, Circle, Identity, OpenAICompatLLM

# LLM — any OpenAI-compatible endpoint
llm = OpenAICompatLLM(model="gpt-4.1-mini", api_key="sk-...")

# Circle — gates + wards
circle = Circle(gates=["done"], wards=[{"max_turns": 10}])

# Identity — system prompt
identity = Identity(
    system_prompt="You are a financial analyst. Call done(answer) with your summary."
)

# Cantrip — llm + identity + circle
spell = Cantrip(llm=llm, identity=identity, circle=circle)

# Cast it on an intent
result = spell.cast("Revenue up 14% QoQ, churn down 2 points. What does this mean?")
print(result)
```

No medium specified — the circle defaults to conversation mode, where gates appear as JSON tool calls. Set `medium="code"` to upgrade the entity's action space to a Python sandbox.
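In conversation mode, each gate is surfaced to the LLM as a separate tool definition. A rough sketch of what that projection might look like for the `done` gate (illustrative shape only; the library's exact schema may differ):

```python
import json

def project_gate_as_tool(name: str, description: str, params: dict) -> dict:
    """Illustrative: shape a gate as an OpenAI-style tool definition."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": params,
                "required": list(params),
            },
        },
    }

tool = project_gate_as_tool(
    "done",
    "Finish the episode with a final answer.",
    {"answer": {"type": "string"}},
)
print(json.dumps(tool, indent=2))
```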
The central object. Binds an LLM, an identity, and a circle into a reusable script.
```python
spell = Cantrip(
    llm=llm,
    identity=Identity(system_prompt="..."),
    circle=Circle(
        gates=["done", "call_entity"],
        wards=[{"max_turns": 10}, {"max_depth": 2}],
        medium="code",
    ),
)

# One-shot
result = spell.cast("Analyze this data")

# With thread metadata
result, thread = spell.cast_with_thread("Analyze this data")
print(thread.turns, thread.terminated, thread.truncated)

# Streaming
for event in spell.cast_stream("Analyze this data"):
    print(event)
```

A summoned entity survives its first intent. State accumulates across sends.

```python
entity = spell.summon()
first = entity.send("Set up the analysis framework")
second = entity.send("Now analyze Q3 revenue")  # remembers the first send
```

The capability envelope: medium + gates + wards.
```python
# Conversation (default) — gates as JSON tool calls
Circle(gates=["done", "echo"], wards=[{"max_turns": 5}])

# Code medium — entity writes Python in a sandbox
Circle(gates=["done", "repo_read"], wards=[{"max_turns": 10}], medium="code")

# Gates with dependencies
Circle(
    gates=["done", {"name": "repo_read", "depends": {"root": "/data"}}],
    wards=[{"max_turns": 10}],
)
```

Built-in gates: `done`, `echo`, `read`, `repo_files`, `repo_read`, `call_entity`, `call_entity_batch`, `fetch`.
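Wards compose subtractively (pattern 05: min for numeric limits, OR for boolean flags). A standalone sketch of that rule, not the library's actual ward implementation:

```python
def compose_wards(wards: list[dict]) -> dict:
    """Sketch of subtractive ward composition: the tightest limit wins."""
    merged: dict = {}
    for ward in wards:
        for key, value in ward.items():
            if key not in merged:
                merged[key] = value
            elif isinstance(value, bool):
                merged[key] = merged[key] or value     # OR: any ward can forbid
            else:
                merged[key] = min(merged[key], value)  # min: strictest numeric cap
    return merged

print(compose_wards([{"max_turns": 10}, {"max_turns": 5}, {"no_network": True}]))
# {'max_turns': 5, 'no_network': True}
```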
Immutable configuration: system prompt + hyperparameters.
```python
Identity(
    system_prompt="You analyze code for bugs.",
    require_done_tool=True,  # entity must call done() explicitly
    temperature=0.7,
)
```

Append-only turn storage. Every turn is recorded before the next begins.
```python
from cantrip import Loom, InMemoryLoomStore, SQLiteLoomStore

# In-memory (ephemeral)
loom = Loom(store=InMemoryLoomStore())

# Persistent to disk
loom = Loom(store=SQLiteLoomStore("loom.db"))

# Attach to a cantrip
spell = Cantrip(llm=llm, identity=identity, circle=circle, loom=loom)
```

No medium specified. Gates appear as JSON tool calls — the LLM sees each gate as a separate tool definition. This is how most agent frameworks work.
```python
Circle(gates=["done", "echo"], wards=[{"max_turns": 5}])
```

The entity writes Python code that executes in-process via `exec()`. Gates are projected as host functions — `done()`, `call_gate()`, `call_entity()` — callable directly in the sandbox. Variables persist across turns.
```python
Circle(
    gates=["done", "repo_read"],
    wards=[{"max_turns": 10}],
    medium="code",
)
```

In the sandbox, the entity writes:

```python
# Turn 1
data = call_gate("repo_read", {"path": "metrics.txt"})

# Turn 2 — data persists from turn 1
lines = data.split("\n")
done(f"Found {len(lines)} metrics")
```

The code medium uses `InProcessPythonExecutor` by default — Python's `exec()` with warded builtins and injected host functions. This gives the entity access to Python's full standard library within the sandbox, but isolation is best-effort (CPython threads can't be force-killed). For stronger isolation, use `SubprocessPythonExecutor`.
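The executor's core mechanic, `exec()` over a persistent namespace with injected host functions, can be sketched in a few lines (a toy simplification, not the actual `InProcessPythonExecutor`):

```python
class ToySandbox:
    """Toy sketch: exec() with a persistent namespace and host functions."""

    def __init__(self, gates: dict):
        self.result = None
        # One namespace shared across turns, so variables persist.
        self.ns = {
            "call_gate": lambda name, args: gates[name](args),
            "done": self._done,
        }

    def _done(self, answer: str):
        self.result = answer

    def run_turn(self, code: str):
        exec(code, self.ns)  # best-effort isolation only, as in the real executor

gates = {"repo_read": lambda args: "rev 14%\nchurn -2pts"}
sb = ToySandbox(gates)
sb.run_turn('data = call_gate("repo_read", {"path": "metrics.txt"})')
sb.run_turn('done(f"Found {len(data.splitlines())} metrics")')
print(sb.result)  # Found 2 metrics
```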
Adds browser automation via Playwright. Requires the `browser` optional dependency.
The entity delegates via `call_entity` in the code medium. Delegation is synchronous — the parent blocks while the child runs.
```python
spell = Cantrip(
    llm=parent_llm,
    child_llm=child_llm,  # optional: different LLM for children
    circle=Circle(
        medium="code",
        gates=["done", "call_entity"],
        wards=[{"max_turns": 6}, {"max_depth": 2}],
    ),
    identity=Identity(
        system_prompt="Delegate tasks to children via call_entity.",
        require_done_tool=True,
    ),
)
```

Inside the code medium, the entity writes:

```python
# call_entity is synchronous — blocks and returns the child's answer as a string
trends = call_entity({"intent": "Identify top 3 trends in this data..."})
risks = call_entity({"intent": "What are the biggest risks..."})
done(f"Trends: {trends}\nRisks: {risks}")
```

Children get a generic system prompt and independent context (COMP-4). Delegation gates are stripped from children to prevent recursive delegation. Child `max_turns` is capped at 3.
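The child rules above (delegation gates stripped, `max_turns` capped at 3) can be sketched as a derivation step; `derive_child_circle` is a hypothetical helper, not the library's API:

```python
DELEGATION_GATES = {"call_entity", "call_entity_batch"}

def derive_child_circle(gates: list, wards: list[dict]) -> tuple[list, list[dict]]:
    """Sketch: strip delegation gates and cap max_turns at 3 for children."""
    def gate_name(g):
        return g["name"] if isinstance(g, dict) else g

    # No delegation gates for children: prevents recursive delegation.
    child_gates = [g for g in gates if gate_name(g) not in DELEGATION_GATES]
    # Numeric cap: children never get more than 3 turns.
    child_wards = [
        {**w, "max_turns": min(w["max_turns"], 3)} if "max_turns" in w else w
        for w in wards
    ]
    return child_gates, child_wards

gates, wards = derive_child_circle(
    ["done", "call_entity"], [{"max_turns": 6}, {"max_depth": 2}]
)
print(gates, wards)  # ['done'] [{'max_turns': 3}, {'max_depth': 2}]
```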
Twelve examples in examples/patterns/, one for each grimoire pattern. Each example works in two modes: scripted (deterministic, no API key) and real (live LLM calls).
| # | Pattern | What it teaches |
|---|---------|-----------------|
| 01 | `llm_query` | LLM as stateless query |
| 02 | `gate` | Direct gate execution |
| 03 | `circle` | Construction invariants (done gate, wards) |
| 04 | `cantrip` | LLM + identity + circle = reusable script |
| 05 | `wards` | Subtractive composition (min for numeric, OR for boolean) |
| 06 | `medium` | Tool medium vs code medium — same gates, different action space |
| 07 | `full_agent` | Code medium + filesystem gates + error steering |
| 08 | `folding` | Context compression for long runs |
| 09 | `composition` | `call_entity` + `call_entity_batch` |
| 10 | `loom` | Inspect the append-only artifact |
| 11 | `persistent_entity` | `summon`/`send` across episodes |
| 12 | `familiar` | Persistent coordinator delegating through code |
Run any example:

```sh
python examples/patterns/04_cantrip.py
```

Strengths:
- Readable code medium examples. The Python examples are the clearest demonstration of the conversation-vs-code medium distinction. Example 06 shows the same gates producing different action spaces. Example 07 shows error steering in a code sandbox.
- In-process Python sandbox. The entity writes Python that runs via `exec()` with injected host functions. This is the most natural code medium if you're building in Python — the entity writes the same language as the host.
- Clean API surface. `Cantrip`, `Identity`, `Circle` — three classes, frozen dataclasses, no framework magic. The public API is 18 symbols.
- SQLite loom storage. The only implementation with SQLite as a loom backend (vs JSONL or in-memory). Good for persistent entities that need durable turn history.
- Protocol adapters. ACP (stdio), HTTP, and CLI adapters are all included and tested.
Limitations:
- One LLM provider. `OpenAICompatLLM` only — works with OpenAI, OpenRouter, and any OpenAI-compatible endpoint, but no native Anthropic or Google adapters. (The TS implementation has five providers.)
- Two mediums. Conversation and code (plus browser with optional Playwright). No bash, VM, or other substrate mediums.
- In-process isolation only. The default `InProcessPythonExecutor` uses `exec()` — no security boundary. `SubprocessPythonExecutor` is available but can't share gate functions across the process boundary. Neither is as isolated as QuickJS or node:vm.
- `MiniCodeExecutor` is vestigial. A minimal JS interpreter in Python, exported and tested but unlikely to be useful outside of cross-language test compatibility.
Tests: 227 pass, 2 skip (`uv run pytest tests/ -q`)
The test suite includes:
- Core lifecycle tests (entity, cantrip, circle, loom)
- Medium behavior tests (tool and code)
- End-to-end delegation tests
- Grimoire example tests (all 12 patterns)
- Spec MUST-rule coverage test (regex scan of SPEC.md rules vs implementation)
- Protocol adapter tests (ACP, HTTP, CLI)
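The MUST-rule coverage test's core idea (regex-scan the spec for normative rules, then check each against the implementation) can be sketched with a toy spec; the rule IDs below are invented for illustration:

```python
import re

# Toy spec text; real rules live in SPEC.md and use IDs like COMP-4.
spec = """\
GATE-1: The circle MUST include a done gate.
WARD-2: Numeric wards MUST compose by min.
NOTE-3: Implementations MAY add extra gates.
"""

# Collect rule IDs whose line contains the normative keyword MUST.
must_rules = re.findall(r"^([A-Z]+-\d+):.*\bMUST\b", spec, flags=re.M)
print(must_rules)  # ['GATE-1', 'WARD-2']
```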
Requires Python 3.11+.
```sh
pip install -e .        # or: uv pip install -e .
cp .env.example .env
```

Dependencies: requests, PyYAML, agent-client-protocol. No heavy ML frameworks.
Set your API key:
```sh
OPENAI_API_KEY="sk-..."
OPENAI_MODEL="gpt-4.1-mini"

# Optional:
OPENAI_BASE_URL="https://api.openai.com/v1"
```

Run tests:

```sh
uv run pytest tests/ -q
```