Your RAM is the new context window.
Aleph is an MCP server that gives any LLM access to gigabytes of local data without consuming context. Load massive files into a Python process—the model explores them via search, slicing, and sandboxed code execution. Only results enter the context window, never the raw content.
Based on the Recursive Language Model (RLM) architecture.
| Scenario | What Aleph Does |
|---|---|
| Large log analysis | Load 500MB of logs, search for patterns, correlate across time ranges |
| Codebase navigation | Load entire repos, find definitions, trace call chains, extract architecture |
| Data exploration | JSON exports, CSV files, API responses—explore interactively with Python |
| Mixed document ingestion | Load PDFs, Word docs, HTML, and logs like plain text |
| Semantic search | Find relevant sections by meaning, then zoom in with peek |
| Research sessions | Save/resume sessions, track evidence with citations, spawn sub-queries |
- Python 3.10+
- For MCP mode: An MCP-compatible client (Claude Code, Cursor, VS Code, Windsurf, Codex CLI, or Claude Desktop)
- For CLI mode:
  - `claude`, `codex`, or `gemini` CLI installed
```bash
pip install "aleph-rlm[mcp]"
```

This installs three commands:
| Command | Purpose |
|---|---|
| `aleph` | MCP server — connect from any MCP client |
| `alef` | Standalone CLI — run RLM loops directly from your terminal |
| `aleph-rlm` | Setup utility — auto-configure MCP clients |
Option A: MCP Mode (recommended for AI assistants)
Configure your MCP client to use the `aleph` server, then interact via tool calls.
Option B: CLI Mode (standalone terminal use)
Run `alef` directly from the command line — no MCP setup required.
Automatic (recommended):
```bash
aleph-rlm install
```

This auto-detects your installed clients and configures them.
Manual (any MCP client):
```json
{
  "mcpServers": {
    "aleph": {
      "command": "aleph",
      "args": ["--enable-actions", "--workspace-mode", "any"]
    }
  }
}
```

Config file locations:
| Client | macOS/Linux | Windows |
|---|---|---|
| Claude Code | `~/.claude/settings.json` | `%USERPROFILE%\.claude\settings.json` |
| Claude Desktop | `~/Library/Application Support/Claude/claude_desktop_config.json` | `%APPDATA%\Claude\claude_desktop_config.json` |
| Cursor | `~/.cursor/mcp.json` | `%USERPROFILE%\.cursor\mcp.json` |
| VS Code | `~/.vscode/mcp.json` | `%USERPROFILE%\.vscode\mcp.json` |
| Codex CLI | `~/.codex/config.toml` | `%USERPROFILE%\.codex\config.toml` |
See MCP_SETUP.md for detailed instructions.
In your assistant, run:
```
get_status()
```

If using Claude Code, tools are prefixed: `mcp__aleph__get_status`.
The `alef` command runs the full RLM reasoning loop directly from your terminal. It uses local CLI tools (`claude`, `codex`, or `gemini`) as the LLM backend — no separate Aleph API keys needed, just the CLI tool's own authentication.
Prerequisites: Have `claude`, `codex`, or `gemini` CLI installed and authenticated.
```bash
# Simple query
alef run "What is 2+2?" --provider cli --model claude

# With context from a file
alef run "Summarize this log" --provider cli --model claude --context-file app.log

# JSON context
alef run "Extract all names" --provider cli --model claude --context '{"users": [{"name": "Alice"}, {"name": "Bob"}]}'

# Full JSON output with trajectory
alef run "Analyze this data" --provider cli --model claude --context-file data.json --json --include-trajectory
```

Enable recursive sub-queries where the LLM spawns additional Claude calls:
```bash
# Enable Claude CLI for sub-queries
export ALEPH_SUB_QUERY_BACKEND=claude

# Run a complex analysis that uses sub_query()
alef run "For each item in the context, use sub_query to summarize it, then combine results" \
  --provider cli --model claude \
  --context '{"items": [{"name": "Alice", "score": 95}, {"name": "Bob", "score": 87}]}' \
  --max-iterations 10
```

The RLM loop will:
- Execute Python code blocks to explore the context
- Call `sub_query()`, which spawns additional Claude CLI processes
- Iterate until `FINAL(answer)` is reached
| Flag | Description |
|---|---|
| `--provider cli` | Use local CLI tools instead of API |
| `--model claude\|codex\|gemini` | Which CLI backend to use |
| `--context "..."` | Inline context string |
| `--context-file path` | Load context from file |
| `--context-stdin` | Read context from stdin |
| `--json` | Output JSON response |
| `--include-trajectory` | Include full reasoning trace in JSON |
| `--max-iterations N` | Limit RLM loop iterations |
| Variable | Description |
|---|---|
| `ALEPH_SUB_QUERY_BACKEND` | Backend for `sub_query()`: `claude`, `codex`, `gemini`, or `api` |
| `ALEPH_SUB_QUERY_SHARE_SESSION` | Share MCP session with sub-agents (set to `1`) |
| `ALEPH_CLI_TIMEOUT` | Timeout for CLI calls (default: 120s) |
Aleph enables multi-agent coordination through shared contexts. Multiple agents can read and write to the same context IDs, creating a distributed memory layer for swarm architectures.
```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Agent A   │     │   Agent B   │     │   Agent C   │
│ (Explorer)  │     │  (Analyst)  │     │  (Writer)   │
└──────┬──────┘     └──────┬──────┘     └──────┬──────┘
       │                   │                   │
       └───────────────────┼───────────────────┘
                           │
                    ┌──────▼──────┐
                    │    Aleph    │
                    │  Contexts   │
                    │ (Shared RAM)│
                    └─────────────┘
```
Agents coordinate by reading/writing to shared context IDs. No message passing needed for data—agents simply load, search, and write to the same contexts.
| Pattern | Purpose | Example |
|---|---|---|
| `swarm-{name}-kb` | Shared knowledge base | `swarm-docs-kb` |
| `task-{id}-spec` | Task requirements | `task-42-spec` |
| `task-{id}-findings` | Shared discoveries | `task-42-findings` |
| `{agent}-workspace` | Private agent workspace | `explorer-workspace` |
1. Leader creates shared context:
load_context(content="Project: Analyze auth system", context_id="swarm-auth-kb")2. Spawn agents with Aleph access:
# Each agent connects to the same Aleph MCP server
# They can all access "swarm-auth-kb"3. Agents write findings to shared context:
# Agent A finds something
exec_python(code="""
finding = "Auth uses JWT with RS256"
ctx_append(finding) # Appends to current context
""", context_id="task-42-findings")4. Agents read each other's work:
search_context(pattern="JWT|token", context_id="task-42-findings")5. Diff and merge contexts:
diff_contexts(a="agent-a-workspace", b="agent-b-workspace")Swarms can accumulate learnings across sessions:
```
# After completing a task, log what worked
exec_python(code="""
learning = '''
## Pattern: Parallel Code Search
- Split codebase by directory
- Each agent searches one area
- Merge findings to shared context
- 3x faster than sequential
'''
ctx_append(learning)
""", context_id="swarm-kb")

# Save for next session
save_session(context_id="swarm-kb", path="swarm_learnings.json")
```

| Variable | Description |
|---|---|
| `ALEPH_SUB_QUERY_SHARE_SESSION` | Set to `1` to let sub-agents access parent's MCP session |
| `ALEPH_SUB_QUERY_BACKEND` | Backend for `sub_query()`: `claude`, `codex`, `gemini`, or `api` |
Parallel Exploration:
```
# Spawn multiple agents, each with a different context_id
# Agent 1: context_id="explore-frontend"
# Agent 2: context_id="explore-backend"
# All write findings to: context_id="task-findings"
```

Consensus Building:

```
# Each agent writes proposal to task-proposals
# Use diff_contexts to compare
# Synthesize with sub_aleph
```

Knowledge Propagation:
Discovery → Private Workspace → Validate → Shared Context → Knowledge Base
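A minimal sketch of that pipeline using the tools shown above (the context IDs and the specific validation check are illustrative, not required names):

```
# 1. Discovery: note a raw finding in the agent's private workspace
exec_python(code='ctx_append("Candidate: retry storms precede auth timeouts")',
            context_id="explorer-workspace")

# 2. Validate: look for corroborating evidence before promoting
search_context(pattern="retry|timeout", context_id="logs")

# 3. Promote: write the validated finding to the shared task context
exec_python(code='ctx_append("Validated: retry storms precede auth timeouts")',
            context_id="task-42-findings")

# 4. Distill: record the durable lesson in the knowledge base
exec_python(code='ctx_append("## Pattern: retry storms -> auth timeouts")',
            context_id="swarm-kb")
```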
See the /swarm skill for detailed prompts and examples.
Paste this into any AI coding assistant or agentic coder to add Aleph (MCP server + /aleph skill):
```
You are an AI coding assistant. Please set up Aleph (Model Context Protocol / MCP).

1) Add the Aleph MCP server config:
{
  "mcpServers": {
    "aleph": {
      "command": "aleph",
      "args": ["--enable-actions", "--workspace-mode", "any"]
    }
  }
}

2) Install the /aleph skill prompt:
   - Claude Code: copy docs/prompts/aleph.md -> ~/.claude/commands/aleph.md
   - Codex CLI: copy docs/prompts/aleph.md -> ~/.codex/skills/aleph/SKILL.md
   - Gemini CLI: copy docs/prompts/aleph.md -> ~/.gemini/skills/aleph/SKILL.md
     Ensure ~/.gemini/settings.json has "experimental": { "skills": true } and restart.
   If this client uses a different skill/command folder, ask me where to place it.

3) Verify: run get_status() or list_contexts().
   If tools are namespaced, use mcp__aleph__get_status or mcp__aleph__list_contexts.

4) (Optional) Enable sub_query (recursive sub-agent):
   - Quick: just say "use claude backend" — the LLM will run set_backend("claude")
   - Env var: set ALEPH_SUB_QUERY_BACKEND=claude|codex|gemini|api
   - API backend: set ALEPH_SUB_QUERY_API_KEY + ALEPH_SUB_QUERY_MODEL
   Runtime switching: the LLM can call set_backend() or configure() anytime—no restart needed.

5) Use the skill: /aleph (Claude Code) or $aleph (Codex CLI).
   Gemini CLI: /skills list (use /skills enable aleph if disabled).
```
The /aleph skill is a prompt that teaches your LLM how to use Aleph effectively. It provides workflow patterns, tool guidance, and troubleshooting tips.
Note: Aleph works best when the skill prompt and the MCP server are used together.
- Loads files into searchable in-memory contexts
- Tracks evidence with citations as you reason
- Supports semantic search and fast rg-based codebase search
- Enables recursive sub-queries for deep analysis
- Persists sessions for later resumption (memory packs)
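For instance, evidence tracking might pair the `cite()` helper with `get_evidence` (a sketch; both are documented further below, but the exact `cite()` arguments shown here are assumptions):

```
# Record a citation while reasoning (argument names are assumed)
exec_python(code='cite(line=4, note="DB connection timeout during login window")',
            context_id="logs")

# Review accumulated evidence
get_evidence()
```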
Just point at a file:
```
/aleph path/to/huge_log.txt
```
The LLM will load it into Aleph's external memory and immediately start analyzing using RLM patterns—no extra setup needed.
| Client | Command |
|---|---|
| Claude Code | /aleph |
| Codex CLI | $aleph |
For other clients, copy docs/prompts/aleph.md and paste it at session start.
Option 1: Direct download (simplest)
Download `docs/prompts/aleph.md` and save it to:

- Claude Code: `~/.claude/commands/aleph.md` (macOS/Linux) or `%USERPROFILE%\.claude\commands\aleph.md` (Windows)
- Codex CLI: `~/.codex/skills/aleph/SKILL.md` (macOS/Linux) or `%USERPROFILE%\.codex\skills\aleph\SKILL.md` (Windows)
Option 2: From installed package
macOS/Linux
```bash
# Claude Code
mkdir -p ~/.claude/commands
cp "$(python -c "import aleph; print(aleph.__path__[0])")/../docs/prompts/aleph.md" ~/.claude/commands/aleph.md

# Codex CLI
mkdir -p ~/.codex/skills/aleph
cp "$(python -c "import aleph; print(aleph.__path__[0])")/../docs/prompts/aleph.md" ~/.codex/skills/aleph/SKILL.md
```

Windows (PowerShell)
```powershell
# Claude Code
New-Item -ItemType Directory -Force -Path "$env:USERPROFILE\.claude\commands"
$alephPath = python -c "import aleph; print(aleph.__path__[0])"
Copy-Item "$alephPath\..\docs\prompts\aleph.md" "$env:USERPROFILE\.claude\commands\aleph.md"

# Codex CLI
New-Item -ItemType Directory -Force -Path "$env:USERPROFILE\.codex\skills\aleph"
Copy-Item "$alephPath\..\docs\prompts\aleph.md" "$env:USERPROFILE\.codex\skills\aleph\SKILL.md"
```

```
┌───────────────┐    tool calls     ┌────────────────────────┐
│  LLM client   │ ────────────────► │  Aleph (Python, RAM)   │
│ (limited ctx) │ ◄──────────────── │  search/peek/exec      │
└───────────────┘   small results   └────────────────────────┘
```
- **Load** — `load_context` (paste text) or `load_file` (from disk)
- **Explore** — `search_context`, `semantic_search`, `peek_context`
- **Compute** — `exec_python` with 100+ built-in helpers
- **Reason** — `think`, `evaluate_progress`, `get_evidence`
- **Persist** — `save_session` to resume later
```
# Load log data
load_context(content=logs, context_id="logs")
# → "Context loaded 'logs': 445 chars, 7 lines, ~111 tokens"

# Search for errors
search_context(pattern="ERROR", context_id="logs")
# → Found 2 match(es):
#   Line 1: 2026-01-15 10:23:45 ERROR [auth] Failed login...
#   Line 4: 2026-01-15 10:24:15 ERROR [db] Connection timeout...

# Extract structured data
exec_python(code="emails = extract_emails(); print(emails)", context_id="logs")
# → [{'value': 'user@example.com', 'line_num': 0, 'start': 50, 'end': 66}, ...]
```

Multi-Context Workflow (code + docs + diffs)
Load multiple sources, then compare or reconcile them:
```
# Load a design doc and a repo snapshot (or any two sources)
load_context(content=design_doc_text, context_id="spec")
rg_search(pattern="AuthService|JWT|token", paths=["."], load_context_id="repo_hits", confirm=true)

# Compare or reconcile
diff_contexts(a="spec", b="repo_hits")
search_context(pattern="missing|TODO|mismatch", context_id="repo_hits")
```

Advanced Querying with exec_python
Treat `exec_python` as a reasoning tool, not just code execution:

```
# Example: extract class names or key sections programmatically
exec_python(code="print(extract_classes())", context_id="repo_hits")
```

Core (always available):
- `load_context`, `list_contexts`, `diff_contexts` — manage in-memory data
- `search_context`, `semantic_search`, `peek_context`, `chunk_context` — explore data; use `semantic_search` for concepts/fuzzy queries, `search_context` for precise regex
- `exec_python`, `get_variable` — compute in sandbox (100+ built-in helpers)
- `think`, `evaluate_progress`, `summarize_so_far`, `get_evidence`, `finalize` — structured reasoning
- `tasks` — lightweight task tracking per context
- `get_status` — session state
- `sub_query` — spawn recursive sub-agents (CLI or API backend)
- `sub_aleph` — nested Aleph recursion (RLM -> RLM)
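For example, a meaning-first pass followed by a precise zoom (a sketch; `semantic_search` and `peek_context` are the documented tools, but the `peek_context` line-range parameters shown here are assumptions):

```
# Find sections by concept rather than exact string
semantic_search(query="token refresh and expiry handling", context_id="repo_hits")

# Zoom into a promising region (parameter names assumed)
peek_context(context_id="repo_hits", start_line=120, end_line=160)
```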
exec_python helpers
The sandbox includes 100+ helpers that operate on the loaded context:
| Category | Examples |
|---|---|
| Extractors (25) | `extract_emails()`, `extract_urls()`, `extract_dates()`, `extract_ips()`, `extract_functions()` |
| Statistics (8) | `word_count()`, `line_count()`, `word_frequency()`, `ngrams()` |
| Line operations (12) | `head()`, `tail()`, `grep()`, `sort_lines()`, `columns()` |
| Text manipulation (15) | `replace_all()`, `between()`, `truncate()`, `slugify()` |
| Validation (7) | `is_email()`, `is_url()`, `is_json()`, `is_numeric()` |
| Core | `peek()`, `lines()`, `search()`, `chunk()`, `cite()`, `sub_query()`, `sub_aleph()`, `sub_query_map()`, `sub_query_batch()`, `sub_query_strict()`, `ctx_append()`, `ctx_set()` |
Extractors return `list[dict]` with keys: `value`, `line_num`, `start`, `end`.
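A quick sketch of consuming that shape in the sandbox (`extract_ips` is one of the documented extractors; the `logs` context ID is illustrative):

```
exec_python(code="""
for hit in extract_ips():
    # Each hit is a dict: value, line_num, start, end
    print(hit['line_num'], hit['value'])
""", context_id="logs")
```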
Action tools (require `--enable-actions`):
- `load_file`, `read_file`, `write_file` — filesystem (PDFs, Word, HTML, .gz supported)
- `run_command`, `run_tests`, `rg_search` — shell + fast repo search
- `save_session`, `load_session` — persist state (memory packs)
- `add_remote_server`, `list_remote_tools`, `call_remote_tool` — MCP orchestration
Workspace controls:
- `--workspace-root <path>` — root for relative paths (default: git root from invocation cwd)
- `--workspace-mode <fixed|git|any>` — path restrictions
- `--require-confirmation` — require `confirm=true` on action calls
- `ALEPH_WORKSPACE_ROOT` — override workspace root via environment
Limits:
- `--max-file-size` — max file read (default: 1GB)
- `--max-write-bytes` — max file write (default: 100MB)
- `--timeout` — sandbox/command timeout (default: 60s)
- `--max-output` — max command output (default: 50,000 chars)
Recursion budgets (depth/time/detail):
- `ALEPH_MAX_DEPTH` (default: 2) — max `sub_aleph` nesting depth
- `ALEPH_MAX_ITERATIONS` (default: 100) — total RLM loop steps (root + recursion)
- `ALEPH_MAX_WALL_TIME` (default: 300s) — wall-time cap per Aleph run
- `ALEPH_MAX_SUB_QUERIES` (default: 100) — total `sub_query` calls allowed
- `ALEPH_MAX_TOKENS` (default: unset) — optional per-call output cap
Override these via the env vars above or per-call args on `sub_aleph`. CLI backends run `sub_aleph` as a single-shot call; use the API backend for full multi-iteration recursion.
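For instance, a tighter budget for one nested run might look like this (a sketch only: per-call overrides exist per the note above, but these exact argument names are assumptions mirroring the env vars):

```
sub_aleph(
    query="Summarize every ERROR cluster in the logs context",
    max_depth=1,        # assumed per-call mirror of ALEPH_MAX_DEPTH
    max_iterations=20,  # assumed per-call mirror of ALEPH_MAX_ITERATIONS
)
```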
See docs/CONFIGURATION.md for all options.
- MCP_SETUP.md — client configuration
- docs/CONFIGURATION.md — CLI flags and environment variables
- docs/prompts/aleph.md — skill prompt and tool reference
- CHANGELOG.md — release history
- DEVELOPMENT.md — contributing guide
```bash
git clone https://github.com/Hmbown/aleph.git
cd aleph
pip install -e ".[dev,mcp]"
pytest
```

Recursive Language Models
Zhang, A. L., Kraska, T., & Khattab, O. (2025)
arXiv:2512.24601
MIT