
Context Priming

Proactive context synthesis for coding agents. Build the right context before the first token.




Coding agents start every task with the wrong context. They load static memory files, bloat the window with irrelevant code, or start cold with nothing at all. Then they hit the context limit and auto-compact throws away half of what they gathered.

Context Priming fixes this. It analyzes the task, scans all available sources, scores relevance, and synthesizes a compact starting context before the agent writes a single line of code. Constructive, not reductive. Task-specific, not one-size-fits-all.

Why This Exists | Before vs After | Install | Quick Start | How It Works | Features | Whitepaper | Contributing | License


Why This Exists

LLM coding agents manage context reactively. They wait until the window fills, then compress. This is backwards.

  • Auto-compaction fires at ~95% capacity. Quality has already degraded by then.
  • Memory files load wholesale regardless of what the task actually needs.
  • Static context (CLAUDE.md, AGENTS.md) is the same for every task.
  • RAG requires the agent to know what to ask for. It often doesn't.

The result: cold starts, context bloat, or context optimized for a different task entirely. Agents spend their best tokens figuring out what they should already know.

Before vs After

Without Priming

Task arrives
  → Agent starts cold (no relevant context)
  → Agent explores codebase (burns tokens on discovery)
  → Context window fills with exploration artifacts
  → Auto-compact fires (loses important details)
  → Agent works with degraded context

With Priming

Task arrives
  → Context Priming analyzes the task
  → Scans memories, codebase, git history, configs
  → Scores and filters for this specific task
  → Synthesizes compact, goal-aware starting context
  → Agent starts with exactly what it needs

The difference: agents spend tokens on the work, not on figuring out what the work is.

Install

pip install -e ".[anthropic]"
export ANTHROPIC_API_KEY=sk-...

Supports Python 3.11+. Optional extras: openai, claude-sdk, or all.

Quick Start

CLI

# Full priming pipeline
context-prime prime --task "Fix the auth middleware bug" --project ./myapp --verbose

# Gather sources only (see what's available)
context-prime gather --project ./myapp

# Prime and output as JSON
context-prime prime --task "Add pagination" --project . --format json

As a Library

from context_prime import gather_all, score_relevance, filter_relevant
from context_prime import infer_hierarchy, synthesize_context

# Bring your own LLM call
def my_llm(prompt: str) -> str:
    return my_api.complete(prompt)

# 1. Gather all sources
sources = gather_all("./myapp")

# 2. Score and filter for this task
scored = score_relevance("Fix the auth bug", sources, my_llm)
relevant = filter_relevant(scored, threshold=0.5)

# 3. Infer outcome hierarchy
# project_context is supplied by the caller, e.g. a short project summary
hierarchy = infer_hierarchy("Fix the auth bug", project_context, my_llm)

# 4. Synthesize primed context
primed = synthesize_context("Fix the auth bug", hierarchy, relevant, my_llm)
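Assuming synthesize_context returns the primed context as a plain string, one way to hand it to an agent is to prepend it to the opening message. The helper below is illustrative glue, not part of the context_prime API:

```python
# Hypothetical glue, not part of context_prime: wrap the primed context and
# the task into the agent's first message. Assumes `primed` is a string.
def build_first_message(primed: str, task: str) -> str:
    return (
        "<primed_context>\n"
        f"{primed}\n"
        "</primed_context>\n\n"
        f"Task: {task}"
    )

first = build_first_message("Summary of relevant auth findings", "Fix the auth bug")
```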

Claude Agent SDK

from context_prime.adapters.claude_sdk import run_primed_agent

await run_primed_agent(
    task="Fix the auth middleware bug",
    project_dir="./myapp",
    agent_model="claude-opus-4-6",
    priming_model="claude-sonnet-4-6",
)

Claude Code Hook

Drop this into .claude/settings.json to prime every session automatically:

{
  "hooks": {
    "SessionStart": [{
      "matcher": "*",
      "hooks": [{
        "type": "command",
        "command": "python -m context_prime.cli prime --project $CLAUDE_PROJECT_DIR --mode session --format hook",
        "timeout": 30
      }]
    }]
  }
}

Standalone Prototype

cd prototype
pip install -r requirements.txt

# Prime and execute
python prime_agent.py "Fix the auth middleware bug" --project /path/to/project --verbose

# Prime only (inspect the synthesized context)
python prime_agent.py "Add pagination" --project . --prime-only

How It Works

Context Priming runs a four-stage pipeline before the agent begins any task:

┌─────────────────────────────────────────────────────┐
│                  CONTEXT PRIMING                     │
│                                                      │
│  1. GATHER    Scan memories, codebase, git history,  │
│               configs, flagged priorities             │
│                                                      │
│  2. SCORE     LLM-based relevance scoring per task   │
│               Filter below threshold                  │
│                                                      │
│  3. FRAME     Infer outcome hierarchy                │
│               Final → Mid-term → Immediate            │
│                                                      │
│  4. SYNTHESIZE  Merge into compact primed context    │
│                 Goal-aware, task-specific              │
└─────────────────────────────────────────────────────┘
                         │
                         ▼
              Agent starts with optimal context

The core engine is model-agnostic: every LLM call goes through a caller-supplied callable(prompt) -> str. Adapters handle platform-specific context injection for Claude Code, the Claude Agent SDK, and any Chat Completions API.
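A minimal sketch of that contract, with a stub standing in for a real chat client (the wrapper and client names here are illustrative, not library API):

```python
# Any backend works as long as it is exposed as callable(prompt) -> str.
from typing import Callable

LLMCall = Callable[[str], str]

def make_llm(client) -> LLMCall:
    """Wrap a chat-style client into the callable(prompt) -> str contract."""
    def call(prompt: str) -> str:
        return client.complete(prompt)
    return call

class StubClient:
    """Stand-in for a real Chat Completions client."""
    def complete(self, prompt: str) -> str:
        return f"scored: {prompt}"

llm: LLMCall = make_llm(StubClient())
print(llm("Fix the auth bug"))  # every core function accepts this callable
```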

Outcome Hierarchy

Agents don't just see the task. They understand what it serves:

Final Outcome:     Ship the v2 platform by Q2
                        │
Mid-term Goal:     Complete the database migration
                        │
Immediate Task:    Fix the failing migration test

This prevents agents from making locally correct but globally wrong decisions.
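The hierarchy can be pictured as a plain data structure. This dataclass is a hypothetical sketch; the representation infer_hierarchy actually returns may differ:

```python
# Illustrative only: a three-level outcome hierarchy as a dataclass.
from dataclasses import dataclass

@dataclass
class OutcomeHierarchy:
    final_outcome: str   # what the project ultimately ships
    mid_term_goal: str   # the milestone the task advances
    immediate_task: str  # the concrete work item

h = OutcomeHierarchy(
    final_outcome="Ship the v2 platform by Q2",
    mid_term_goal="Complete the database migration",
    immediate_task="Fix the failing migration test",
)
```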

Soft Compaction

We call this soft compaction -- constructing what should be in the window, rather than compressing what's already there:

|        | Hard Compaction         | Soft Compaction             |
|--------|-------------------------|-----------------------------|
| When   | Context window full     | Before task starts          |
| How    | Summarize/truncate      | Synthesize from all sources |
| Risk   | Loses important details | None (additive)             |
| Result | Degraded context        | Optimal context             |
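The contrast can be made concrete with a toy model, under the simplifying assumption that context is a list of snippet strings with relevance scores:

```python
# Toy contrast: hard compaction reacts to a full window by truncating;
# soft compaction proactively selects the most relevant sources up front.

def hard_compact(window: list[str], budget: int) -> list[str]:
    # Reactive: keep only the most recent entries, losing older detail.
    return window[-budget:]

def soft_compact(sources: list[tuple[str, float]], budget: int) -> list[str]:
    # Proactive: rank every source by relevance, keep the best within budget.
    ranked = sorted(sources, key=lambda s: s[1], reverse=True)
    return [text for text, _ in ranked[:budget]]
```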

Features

| Property           | Auto-compact | RAG     | MEMORY.md | Context Priming |
|--------------------|--------------|---------|-----------|-----------------|
| Proactive          | No           | Partial | No        | Yes             |
| Task-specific      | No           | Partial | No        | Yes             |
| Multi-source       | No           | No      | No        | Yes             |
| Goal-aware         | No           | No      | No        | Yes             |
| Cold-start capable | No           | Partial | No        | Yes             |

Architecture

context-prime/
├── context_prime/          # pip-installable library
│   ├── core/               # Model-agnostic priming engine
│   │   ├── gather.py       # Source gathering (memories, code, git, config)
│   │   ├── score.py        # LLM relevance scoring per task
│   │   ├── hierarchy.py    # Outcome hierarchy inference
│   │   └── synthesize.py   # Context synthesis
│   ├── adapters/           # Platform integrations
│   │   ├── claude_sdk.py   # Claude Agent SDK
│   │   ├── claude_hook.sh  # Claude Code hooks
│   │   └── raw_api.py      # Any Chat Completions API
│   └── cli.py              # CLI entry point
├── prototype/              # Standalone demo
├── whitepaper/             # Research paper
└── pyproject.toml

Whitepaper

The full research paper covers literature survey, architecture proposal, platform analysis, and prototype results: whitepaper/context-priming-whitepaper.md

Contributing

Contributions are welcome. See CONTRIBUTING.md for guidelines.

License

MIT