Agentic RAG SDK

Accuracy-first retrieval infrastructure for grounding AI coding agents.

A production-ready RAG pipeline using Voyage AI embeddings, Qdrant vector database, and hybrid retrieval with cross-encoder reranking. Includes an MCP server for seamless integration with Claude Code, Amp, and other MCP-compatible AI assistants.

Currently indexes 13 corpora (~18,000 vectors) across major agentic AI SDKs: Google ADK, OpenAI Agents, LangChain/LangGraph, Anthropic Claude SDK, and CrewAI.

Secondary feature: 44 IDE-agnostic workflows for building agents with Google ADK.

MCP Server Integration

Expose the RAG pipeline as tools for AI coding agents via the Model Context Protocol.

Prerequisites

uv package manager
Voyage AI API Key
Qdrant Cloud cluster (free tier works)

Installation (BYOK - Bring Your Own Keys)

1. Clone and install:

git clone https://github.com/MattMagg/agentic-rag-sdk.git
cd agentic-rag-sdk
uv pip install -e .

2. Configure environment:

cp .env.example .env
# Edit .env with your API keys

3. Initialize the pipeline:

# Verify API connections
python -m grounding.scripts.00_smoke_test_connections

# Create Qdrant collection with schema
python -m grounding.scripts.02_ensure_collection_schema

# Ingest all corpora (or specific ones)
python -m grounding.scripts.03_ingest_corpus
python -m grounding.scripts.03_ingest_corpus --corpus adk_docs  # Single corpus

4. Add to your MCP client config:

For Claude Code / Amp, add to .mcp.json (or ~/.claude/mcp.json):

{
  "mcpServers": {
    "rag": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/agentic-rag-sdk", "rag-mcp-server"],
      "env": {
        "QDRANT_URL": "https://your-cluster.qdrant.io",
        "QDRANT_API_KEY": "your-qdrant-key",
        "VOYAGE_API_KEY": "your-voyage-key"
      }
    }
  }
}

Available MCP Tools

Tool	Description
`rag_search`	Full RAG search with reranking and context expansion
`rag_search_quick`	Fast retrieval without reranking
`rag_corpus_list`	List available corpora and SDK groups
`rag_corpus_info`	Get details about a specific corpus
`rag_diagnose`	Check system health (Qdrant, Voyage connections)
`rag_config_show`	Display current configuration

Usage Examples

Once configured, your AI assistant can use queries like:

"Search for how to implement multi-agent orchestration in ADK"
"Find documentation about LangGraph checkpoints"
"Look up Claude Agent SDK streaming examples"

Quick Start (CLI)

Prerequisites

Python 3.11+
Voyage AI API Key
Qdrant Cloud cluster (free tier works)

1. Clone and Install

git clone https://github.com/MattMagg/agentic-rag-sdk.git
cd agentic-rag-sdk
uv pip install -e .

2. Configure Environment

cp .env.example .env
# Edit .env with your API keys

Required variables:

VOYAGE_API_KEY="your-voyage-api-key"
QDRANT_URL="https://your-cluster.region.cloud.qdrant.io:6333"
QDRANT_API_KEY="your-qdrant-api-key"
QDRANT_COLLECTION="agentic_grounding_v1"

3. Initialize Pipeline

# Verify API connections
python -m grounding.scripts.00_smoke_test_connections

# Create Qdrant collection with schema
python -m grounding.scripts.02_ensure_collection_schema

# Ingest a corpus (e.g., ADK docs)
python -m grounding.scripts.03_ingest_corpus --corpus adk_docs

4. Query

# Query Google ADK
python -m grounding.query.query "How to implement multi-agent orchestration?" --sdk adk

# Query OpenAI Agents SDK
python -m grounding.query.query "How to create handoffs?" --sdk openai

# Query LangChain ecosystem
python -m grounding.query.query "How to use LangGraph checkpoints?" --sdk langchain

# Query Anthropic Claude Agent SDK
python -m grounding.query.query "How to create a Claude agent?" --sdk anthropic

# Query CrewAI Framework
python -m grounding.query.query "How to define a CrewAI crew?" --sdk crewai

# With verbose output and multi-query expansion
python -m grounding.query.query "your query" --verbose --multi-query

# With context expansion (fetch adjacent chunks for deeper understanding)
python -m grounding.query.query "your query" --expand-context --expand-top-k 5

Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                           YOUR CORPUS                                   │
│   repos, docs, markdown, code, PDFs, text files, configs...            │
└────────────────────────────────┬────────────────────────────────────────┘
                                 │
                    ┌────────────▼────────────┐
                    │    DISCOVERY & CHUNK    │
                    │  • Smart file walking   │
                    │  • AST-based code split │
                    │  • Heading-aware docs   │
                    └────────────┬────────────┘
                                 │
          ┌──────────────────────┼──────────────────────┐
          ▼                      ▼                      ▼
   ┌─────────────┐       ┌─────────────┐       ┌─────────────┐
   │  Dense Vec  │       │  Dense Vec  │       │ Sparse Vec  │
   │voyage-ctx-3 │       │voyage-code-3│       │   SPLADE++  │
   │   (docs)    │       │   (code)    │       │  (lexical)  │
   └──────┬──────┘       └──────┬──────┘       └──────┬──────┘
          │                     │                     │
          └─────────────────────┼─────────────────────┘
                                ▼
                    ┌────────────────────────┐
                    │    QDRANT CLOUD        │
                    │  Named vector spaces:  │
                    │  • dense_docs (2048d)  │
                    │  • dense_code (2048d)  │
                    │  • sparse_lexical      │
                    │  + Rich payload index  │
                    └────────────┬───────────┘
                                 │
              ┌──────────────────┼──────────────────┐
              ▼                  ▼                  ▼
       ┌───────────┐      ┌───────────┐      ┌───────────┐
       │  Prefetch │      │  Prefetch │      │  Prefetch │
       │dense_docs │      │dense_code │      │  sparse   │
       └─────┬─────┘      └─────┬─────┘      └─────┬─────┘
             └──────────────────┼──────────────────┘
                                ▼
                    ┌────────────────────────┐
                    │   RRF / DBSF FUSION    │
                    │   (server-side)        │
                    └────────────┬───────────┘
                                 ▼
                    ┌────────────────────────┐
                    │  VOYAGE RERANK-2.5     │
                    │  instruction-following │
                    └────────────┬───────────┘
                                 ▼
                    ┌────────────────────────┐
                    │  CONTEXT EXPANSION     │
                    │  fetch adjacent chunks │
                    │  (±N around top-K)     │
                    └────────────┬───────────┘
                                 ▼
                    ┌────────────────────────┐
                    │   COVERAGE GATES       │
                    │  ensure docs/code mix  │
                    └────────────┬───────────┘
                                 ▼
                    ┌────────────────────────┐
                    │    EVIDENCE PACK       │
                    │  ranked, cited, ready  │
                    └────────────────────────┘

SDK Groups & Corpora

SDK Flag	Corpora	Description
`--sdk adk`	`adk_docs`, `adk_python`	Google Agent Development Kit
`--sdk openai`	`openai_agents_docs`, `openai_agents_python`	OpenAI Agents SDK
`--sdk langchain`	`langgraph_python`, `langchain_python`, `deepagents_python`, `deepagents_docs`	LangChain ecosystem
`--sdk langgraph`	`langgraph_python`, `deepagents_python`, `deepagents_docs`	LangGraph + DeepAgents
`--sdk anthropic`	`claude_sdk_docs`, `claude_sdk_python`	Anthropic Claude Agent SDK
`--sdk crewai`	`crewai_docs`, `crewai_python`	CrewAI multi-agent framework
`--sdk general`	`agent_dev_docs`	General agent development

Current stats: 13 corpora, ~18,000 vectors, 7 SDK filter groups

Adding Your Own Corpora

Step 1: Clone the Repository

cd corpora/
git clone https://github.com/your-org/your-repo.git

Step 2: Add Configuration

Edit config/settings.yaml and add an entry under ingestion.corpora:

your_corpus_name:
  root: "corpora/your-repo"
  corpus: "your_corpus_name"
  repo: "your-org/your-repo"
  kind: "doc"  # or "code"
  ref: "main"
  include_globs:
    - "docs/**/*.md"
    - "src/**/*.py"
  exclude_globs:
    - "**/.git/**"
    - "**/tests/**"
  allowed_exts: [".md", ".py"]
  max_file_bytes: 500000

Kind determines embedding model:

doc → voyage-context-3 (optimized for documentation)
code → voyage-code-3 (optimized for source code)

Step 3: Update Type Definition

Edit src/grounding/contracts/chunk.py and add to SourceCorpus:

SourceCorpus = Literal[
    "adk_docs",
    "adk_python",
    # ... existing corpora ...
    "your_corpus_name",  # Add here
]

Step 4: Update Query Module

Edit src/grounding/query/query.py:

# Add to ALL_CORPORA list
ALL_CORPORA = [
    # ... existing corpora ...
    "your_corpus_name",
]

# Optionally add to CORPUS_GROUPS for --sdk filtering
CORPUS_GROUPS = {
    # ... existing groups ...
    "your_sdk": ["your_corpus_name"],
}

Step 5: Ingest

python -m grounding.scripts.03_ingest_corpus --corpus your_corpus_name

Switching Embedding Models

Why This Repo Uses Voyage 3 Family

This pipeline uses voyage-context-3 (docs) and voyage-code-3 (code) because they are purpose-built for their respective content types. The specialized models provide better retrieval quality for code and documentation compared to general-purpose models.

Available Voyage Models

Voyage 3 family (used in this repo):

voyage-context-3 - Optimized for long-context documents (2048d)
voyage-code-3 - Optimized for source code (2048d)

Voyage 4 family (general-purpose, if you prefer):

voyage-4-large - Highest quality general model (1024d)
voyage-4 - Balanced quality/speed (1024d)
voyage-4-lite - Faster, smaller (512d)
voyage-4-nano - Fastest, smallest (512d)

How to Switch to Voyage 4

Update config/settings.yaml:

voyage:
  docs_model: "voyage-4-large"  # or voyage-4, voyage-4-lite
  code_model: "voyage-4-large"  # Voyage 4 doesn't have specialized code model
  output_dimension: 1024        # Voyage 4 uses 1024d (not 2048d)

Re-create Qdrant collection (dimensions must match):

# Delete existing collection first (via Qdrant console or API)
python -m grounding.scripts.02_ensure_collection_schema

Re-ingest all corpora:

python -m grounding.scripts.03_ingest_corpus

Configuration Reference

`config/settings.yaml`

qdrant:
  url: ${QDRANT_URL}
  api_key: ${QDRANT_API_KEY}
  collection: ${QDRANT_COLLECTION}

voyage:
  api_key: ${VOYAGE_API_KEY}
  docs_model: "voyage-context-3"
  code_model: "voyage-code-3"
  output_dimension: 2048
  rerank_model: "rerank-2.5"

retrieval_defaults:
  fusion_method: "dbsf"            # "dbsf" (Distribution-Based) or "rrf" (Reciprocal Rank)
  score_threshold: 0.0             # Filter results below this score (0 = disabled)
  top_k: 12                        # Final number of results
  first_stage_k: 80                # Candidates per prefetch lane
  rerank_candidates: 60            # Candidates sent to reranker
  group_by: "path"                 # Deduplicate by file path (one best chunk per file)
  group_size: 1                    # Max results per group

  # Context expansion (enabled by default for deeper understanding)
  context_expansion:
    enabled: true                  # Fetch adjacent chunks around top results
    expand_top_k: 5                # Number of top results to expand
    window_size: 1                 # ±1 = fetch N-1 and N+1 chunks
    score_decay_factor: 0.85       # Score multiplier for adjacent chunks
    max_expanded_chunks: 20        # Safety cap on total expanded chunks

Environment variable substitution (${VAR}) is supported throughout.

Hybrid Retrieval Strategy

Every query executes 3 parallel searches:

Search	Vector Space	Model	Purpose
Dense Docs	`dense_docs`	`voyage-context-3`	Semantic match for documentation
Dense Code	`dense_code`	`voyage-code-3`	Semantic match for code
Sparse	`sparse_lexical`	SPLADE++	Exact keyword/identifier match

Results are fused server-side using Distribution-Based Score Fusion (DBSF) by default, or Reciprocal Rank Fusion (RRF), then reranked with rerank-2.5. Results are deduplicated by file path to ensure one best chunk per source file.

Coverage Balancing

Before reranking, the pipeline ensures a balanced mix of documentation and code results. This prevents the reranker from seeing only one content type and ensures grounded evidence from both sources.

Context-Aware Expansion

Enabled by default for improved comprehension and continuity.

After reranking identifies the most relevant chunks, context expansion fetches adjacent chunks (±N) around the top-K results. This provides:

Contextual continuity: See surrounding code/documentation for better understanding
Reduced fragmentation: Adjacent chunks often contain setup, imports, or related explanations
Score inheritance: Adjacent chunks receive decayed scores (parent_score * decay_factor^distance)

Example: If chunk N=5 ranks #1 with score 0.90, context expansion fetches:

Chunk 4 (N-1): score = 0.90 × 0.85¹ = 0.765
Chunk 6 (N+1): score = 0.90 × 0.85¹ = 0.765

Performance: ~50-70ms overhead (~3-4% increase), single batch fetch for efficiency.

Configuration:

# Disable if needed (for latency-critical workloads)
python -m grounding.query.query "query" --expand-context False

# Adjust parameters
python -m grounding.query.query "query" \
    --expand-context \
    --expand-top-k 3 \      # Expand top 3 results (default: 5)
    --expand-window 2       # Fetch ±2 chunks (default: ±1)

Coding Agent Workflows

This repository includes 44 ADK development workflows in .agent/workflows/ for building agentic systems with Google Agent Development Kit.

Quick Usage

With Antigravity IDE, workflows are auto-detected:

/adk-master          # Master orchestrator
/adk-init            # Initialize new project
/adk-agents-create   # Create LlmAgent
/adk-tools-function  # Add FunctionTool

For other IDEs, copy .agent/workflows/ to your project and reference in your agent's system prompt.

Workflow Categories

Prefix	Purpose
`adk-init-*`	Project scaffolding
`adk-agents-*`	Agent creation (LlmAgent, BaseAgent)
`adk-tools-*`	Tool integration (FunctionTool, MCP, OpenAPI)
`adk-behavior-*`	Callbacks, state, events
`adk-multi-agent-*`	Delegation, orchestration, A2A
`adk-memory-*`	Memory services, grounding
`adk-streaming-*`	SSE, bidirectional, multimodal
`adk-deploy-*`	Cloud Run, GKE, Agent Engine
`adk-security-*`	Auth, guardrails
`adk-quality-*`	Logging, tracing, evals
`adk-advanced-*`	ThinkingConfig, visual builder

See .agent/workflows/adk-workflow/ for detailed workflow creation specs.

Project Structure

agentic-rag-sdk/
├── .agent/
│   ├── workflows/           # 44 ADK development workflows
│   │   ├── _schema.yaml     # Frontmatter schema
│   │   ├── _manifest.json   # Workflow index + dependencies
│   │   └── adk-*.md         # Individual workflows
│   └── scripts/             # Workflow tooling
├── config/
│   ├── settings.yaml        # Main configuration
│   └── logging.yaml         # Logging configuration
├── corpora/                 # Git-cloned source repositories
│   ├── adk-docs/            # Google ADK documentation
│   ├── adk-python/          # Google ADK Python SDK
│   ├── openai-agents-python/# OpenAI Agents SDK
│   ├── langgraph/           # LangGraph source
│   ├── langchain/           # LangChain source
│   ├── deepagents/          # DeepAgents source
│   ├── claude-agent-sdk-python/ # Anthropic Claude Agent SDK
│   ├── crewAI/              # CrewAI framework
│   └── agent-dev-docs/      # General agent docs
├── src/grounding/
│   ├── clients/             # Qdrant + Voyage client wrappers
│   ├── contracts/           # Pydantic models (Chunk, Document)
│   ├── chunkers/            # AST-based code + heading-aware docs
│   ├── query/               # Hybrid query + rerank pipeline
│   └── scripts/             # CLI commands (00-03)
├── docs/
│   ├── voyage-qdrant-rag-spec/  # 6 detailed spec files
│   └── rag-query.md             # Query tool documentation
└── tests/                   # pytest tests

Documentation

Document	Topic
Foundation & Environment	Setup, credentials, client wrappers
Qdrant Schema	Collection schema, HNSW config
Ingestion Pipeline	Chunking, embedding, upsert
Hybrid Query	Prefetch, fusion, tool interface
Rerank Retrieval	Voyage rerank, evidence packs
Corpus Targets	Corpus configuration

Development

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/

# Run single test
pytest tests/test_config_loads.py -v

Key Dependencies

Package	Purpose
`qdrant-client`	Vector DB operations
`voyageai`	Embeddings + reranking
`fastembed`	SPLADE sparse embeddings
`pydantic`	Data contracts

License

MIT License. See LICENSE for details.

Contributing

Contributions welcome! Please read the specs in docs/voyage-qdrant-rag-spec/ before modifying core retrieval logic.

For workflow contributions, follow existing patterns in .agent/workflows/ and ensure examples are grounded in official ADK documentation.

Name		Name	Last commit message	Last commit date
Latest commit History 109 Commits
.agent/workflows		.agent/workflows
.github/workflows		.github/workflows
config		config
corpora		corpora
docs		docs
manifests		manifests
src		src
tests		tests
.env.example		.env.example
.gitignore		.gitignore
.gitmodules		.gitmodules
AGENTS.md		AGENTS.md
CLAUDE.md		CLAUDE.md
GEMINI.md		GEMINI.md
README.md		README.md
pyproject.toml		pyproject.toml
uv.lock		uv.lock

MattMagg/agentic-rag-sdk

Folders and files

Latest commit

History

Repository files navigation

Agentic RAG SDK

MCP Server Integration

Prerequisites

Installation (BYOK - Bring Your Own Keys)

Available MCP Tools

Usage Examples

Quick Start (CLI)

Prerequisites

1. Clone and Install

2. Configure Environment

3. Initialize Pipeline

4. Query

Architecture

SDK Groups & Corpora

Adding Your Own Corpora

Step 1: Clone the Repository

Step 2: Add Configuration

Step 3: Update Type Definition

Step 4: Update Query Module

Step 5: Ingest

Switching Embedding Models

Why This Repo Uses Voyage 3 Family

Available Voyage Models

How to Switch to Voyage 4

Configuration Reference

config/settings.yaml

Hybrid Retrieval Strategy

Coverage Balancing

Context-Aware Expansion

Coding Agent Workflows

Quick Usage

Workflow Categories

Project Structure

Documentation

Development

Key Dependencies

License

Contributing

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 4

Uh oh!

Languages

`config/settings.yaml`

Packages