Model Context Protocol server for intelligent code exploration with AST-based navigation and persistent session memory.
Cerberus is in early development and highly experimental. While functional, it has rough edges:
- Large Project Performance: Indexing time scales linearly with codebase size (~8 minutes for 6200 files with semantic search enabled, ~45 seconds without embeddings)
- Memory Usage: Semantic search model requires ~400MB RAM (bundled all-MiniLM-L6-v2)
- Breaking Changes: API and tool signatures may change as we refine features
- Limited Testing: Not all edge cases are covered; bugs expected in complex scenarios
- Resource Intensive: FAISS + embedding generation can heat up laptops on large codebases
Use at your own risk. Not recommended for production workflows yet.
| Feature | Description | Real Performance |
|---|---|---|
| 🔍 AST-Based Search | Navigate code via Abstract Syntax Trees + semantic search (enabled by default with bundled model). | Semantic: 0.3-0.5 similarity scores. Keyword fallback if embeddings unavailable. |
| 🧠 Persistent Memory | Dual-layer system: Global Preferences (everywhere) and Project Decisions (context-specific). | SQLite-backed, FTS5 search, ~50-200 tokens per memory context load. |
| 📉 Token Efficiency | Skeletonization strips implementations, context assembly fetches only needed code. | Measured: 60-75% savings vs. reading full files (not the 90% originally claimed). |
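The skeletonization idea in the table above can be sketched with Python's stdlib `ast` module. Cerberus's real parser is tree-sitter and multi-language; this Python-only sketch just illustrates the technique of stripping implementations while keeping signatures and docstrings:

```python
import ast
import textwrap

def skeletonize(source: str) -> str:
    """Sketch: replace function bodies with `...` so an agent sees
    signatures and docstrings without paying tokens for implementations."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Keep a leading docstring if present, drop the rest of the body.
            body = []
            if ast.get_docstring(node) is not None:
                body.append(node.body[0])  # the docstring expression
            body.append(ast.Expr(ast.Constant(value=...)))
            node.body = body
    return ast.unparse(tree)

sample = textwrap.dedent('''
    def add(a, b):
        """Add two numbers."""
        result = a + b
        return result
''')
print(skeletonize(sample))  # signature + docstring + "..."
```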
Tested on Cerberus itself (6,206 files, 93,643 symbols):
| Operation | Time | Tokens | Notes |
|---|---|---|---|
| Index Build (with embeddings) | ~8.3 minutes | N/A | FAISS + all-MiniLM-L6-v2 generation |
| Index Build (without embeddings) | ~45 seconds | N/A | SQLite only, keyword search fallback |
| Semantic Search (5 results) | <1s | ~450 tokens | Includes similarity scores |
| Get Symbol (code only) | <0.5s | ~1,300 tokens | 75% vs full file read (~5,355 tokens) |
| Context (with deps) | <1s | ~2,000-4,000 tokens | Depends on symbol complexity |
Token Efficiency Reality Check:
- Claim: "90% savings"
- Reality: 60-75% savings in typical workflows (still significant!)
- Why less? A full-file read is rarely the realistic alternative, so the comparison isn't apples-to-apples
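The measured figures above can be checked directly from the table's numbers (~1,300 tokens for `get_symbol` vs. ~5,355 for the full file):

```python
def savings(tokens_used: int, full_file_tokens: int) -> float:
    """Percentage of tokens saved versus reading the full file."""
    return 100 * (1 - tokens_used / full_file_tokens)

# Numbers from the performance table above.
print(f"{savings(1300, 5355):.1f}% saved")  # ≈ 75.7%, matching the 60-75% range
```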
Prerequisites: Python 3.10+, ~1GB disk space for index, 500MB RAM for semantic search
```shell
pip install git+https://github.com/proxikal/cerberus-mcp.git
```

Add to your MCP configuration (e.g., `claude_desktop_config.json`):

```json
{
  "mcpServers": {
    "cerberus": {
      "command": "cerberus-mcp"
    }
  }
}
```

Optionally, deploy the bundled agent skill:

```shell
# Deploys specialized prompts that help agents use Cerberus effectively
mkdir -p ~/.claude/skills/Cerberus
cp -r skill/Cerberus/* ~/.claude/skills/Cerberus/
```

In your AI agent (e.g., Claude Code), the agent will automatically call `index_build` on first use. Expect 1-10 minutes depending on codebase size.
Cerberus supports zero-token file summarization using local LLMs via llm-toolchain.
```shell
# Install llm-toolchain (optional)
pip install llm-toolchain

# Requires Ollama running locally
# Download from: https://ollama.ai
```

Add to your `cerberus.toml`:

```toml
[summarization]
enabled = true
model = "qwen3:8b"  # Best for tool calling (or any Ollama model)
ollama_url = "http://localhost:11434"
```

How it works:
- If `llm-toolchain` is installed → uses zero-token file operations
- If not installed → falls back to direct file reads (standard behavior)
- If Ollama is unavailable → summarization is gracefully disabled
We maintain detailed documentation in our Wiki.
| Section | Content |
|---|---|
| 🚀 Getting Started | Installation, Configuration, and First Steps. |
| 🛠️ MCP Tools | Reference for all 56 tools (Search, Analysis, Memory). |
| 🧠 Memory System | How the dual-layer preference and decision engine works. |
| ⚡ Efficiency | Real-world token savings breakdowns. |
Cerberus supports multiple programming languages with varying feature coverage:
These tools work across Python, TypeScript, JavaScript, Go, and other languages:
- Core Navigation: `search`, `get_symbol`, `context`, `blueprint`, `read_range`, `file_info`, `skeletonize`
- Dependency Analysis: `deps`, `call_graph`, `analyze_impact`, `test_coverage`
- Project Tools: `project_summary`, `smart_update`, `index_build`
- Memory System: all memory tools (`memory_learn`, `memory_search`, `memory_context`, etc.)
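The memory tools listed above are backed by SQLite with FTS5 search, per the feature table. A minimal illustration of that storage pattern using only the stdlib — the table name and schema here are assumptions for the sketch, not Cerberus's actual schema:

```python
import sqlite3

# Illustrative FTS5-backed memory store; schema is an assumption.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE memories USING fts5(scope, content)")
db.executemany(
    "INSERT INTO memories VALUES (?, ?)",
    [
        ("global", "Prefer type hints in all new Python code"),
        ("project", "Auth module uses JWT tokens, not sessions"),
    ],
)
# memory_search-style lookup: full-text match, best hits first.
rows = db.execute(
    "SELECT scope, content FROM memories WHERE memories MATCH ? ORDER BY rank",
    ("jwt",),
).fetchall()
print(rows)  # the project-scoped JWT memory matches
```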
- `find_circular_deps`: Python, TypeScript, JavaScript (detects import cycles)
- `check_pattern`: Python + TypeScript/JavaScript patterns (Python support more mature)
- `style_check`: Python-optimized (basic TS/JS support)
- `related_changes`: full TypeScript/JavaScript type support
- Parser: Uses tree-sitter for multi-language AST parsing
- Semantic Search: Language-agnostic (works with all indexed code)
- Import Resolution: Supports Python imports, ES6 modules, CommonJS
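The first step of import resolution — collecting the modules a file depends on — can be sketched for Python with the stdlib `ast` module. Cerberus itself uses tree-sitter to cover ES6 modules and CommonJS as well; this stdlib version is Python-only and purely illustrative:

```python
import ast

def python_imports(source: str) -> list[str]:
    """Collect the module names a Python file imports — the raw input
    to an import-resolution step."""
    mods = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            mods.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            mods.append(node.module)
    return mods

src = "import os\nfrom pathlib import Path\nimport json as j\n"
print(python_imports(src))  # → ['os', 'pathlib', 'json']
```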
Good fits:
- Large codebases (>1000 files) where manual navigation is slow
- Projects with complex dependency graphs
- Multi-session work where context needs to persist
- AI agent workflows with strict token budgets
Not ideal for:
- Small projects (<100 files) - setup overhead not worth it
- First-time exploration - initial indexing takes time
- Read-only analysis of unfamiliar code - index may not help much
- Projects with rapidly changing file structure - re-indexing frequent
Contributions welcome! See issues for current priorities.
Known Issues:
- Semantic search indexing is slow on large codebases (working on it)
- Memory system needs better conflict resolution
- Some tree-sitter parsers are incomplete for newer language features
Development setup:
```shell
git clone https://github.com/proxikal/cerberus-mcp.git
cd cerberus-mcp
pip install -e ".[dev]"
pytest tests/ -v
```

This project is licensed under the MIT License.
Built for AI agents. Optimized for efficiency. Experimental by nature.