Skip to content

amanning3390/flowstate-qmd

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

388 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

FlowState-QMD

The fastest, most trustworthy memory layer for coding agents.

FlowState-QMD turns local markdown knowledge into shared project memory for Codex-, Claude-, Cursor-, and MCP-style agents. It combines a durable markdown memory store with a FlowState anticipatory cache so agents can pull relevant context before they fall into a reactive search loop.

License: MIT Base: QMD Hackathon: Hermes 2026

FlowState-QMD architecture

Why This Wins

  • Anticipatory memory, not just stored memory. FlowState prefetches context into ~/.cache/qmd/intuition.json so agents can start from the right documents instead of deciding to search after the fact.
  • Built for coding agents. It works best on docs, ADRs, RFCs, notes, runbooks, changelogs, migration logs, and benchmark writeups.
  • Local-first and inspectable. Everything runs on your machine with SQLite, sqlite-vec, and local GGUF models via node-llama-cpp.
  • MCP-native. The default experience is a clean MCP server plus a packaged agent skill.
  • Trust over magic. Results keep file paths, doc IDs, snippets, and explain traces so the agent can show its work.

Memory Model

  • Durable knowledge: indexed markdown files in named collections
  • Working memory: the FlowState anticipatory cache at ~/.cache/qmd/intuition.json
  • Context overlays: human-authored collection/path summaries added with qmd context add

90-Second Quickstart

git clone https://github.com/amanning3390/flowstate-qmd.git
cd flowstate-qmd
bun install

# Verify the host, recommend a profile, and emit wrapper configs
qmd init --target all

# Inspect readiness any time
qmd doctor

# Index your repo memory
qmd collection add ./docs --name docs
qmd collection add ./notes --name notes
qmd embed

# Start the coding-agent memory server
qmd mcp

If you want FlowState anticipatory memory too:

# Watch the current agent session log and keep the anticipatory cache warm
qmd flow ~/.codex/sessions/current.log --lite

# Compatibility alias also supported:
qmd flow start --watch ~/.codex/sessions/current.log --lite

Demo Path

The best live demo for judges or teammates:

  1. Index a repo's docs/, notes/, CHANGELOG.md, and ADRs.
  2. Start qmd flow on an active coding-agent session log.
  3. Ask a question like:
    • "Why did we roll back the auth migration?"
    • "What changed after the database incident?"
    • "What did we decide about the API contract in the ADR?"
  4. Have the agent call fetch_anticipatory_context before a regular search.
  5. Show that the returned memories already include the ADR, changelog, or meeting note the agent needed.

See docs/DEMO.md for an exact script.

What Makes FlowState Different

Traditional RAG for agents looks like this:

user asks → agent decides to search → tool call → wait → result processing → answer

FlowState-QMD changes the loop:

user asks → anticipatory context already exists → agent answers or deepens with query/get

That difference matters most in coding workflows where the missing context is often already in:

  • design docs
  • changelogs
  • migration notes
  • incident writeups
  • RFCs / ADRs
  • benchmark reports

Core Workflow

1. Index project memory

qmd collection add . --name repo-memory --mask '**/*.md'

# Compatibility alias
qmd index . --name repo-memory --mask '**/*.md'

2. Add context overlays

qmd context add qmd://repo-memory/docs "Project documentation and architecture notes"
qmd context add qmd://repo-memory/adr "Architecture decision records and tradeoff history"
qmd context add / "Answer as a coding agent using the repo's documented decisions"

3. Generate embeddings

qmd embed

4. Query project memory

# Auto-expand + rerank
qmd query "why was the rollout reverted"

# Structured search for better control
qmd query $'lex: "auth rollback" migration\nvec: why did we revert the auth migration?'

# Inspect why memories were chosen
qmd query --json --explain "performance regression in auth service"

5. Fetch exact evidence

qmd get "#abc123"
qmd multi-get "docs/**/*.md,CHANGELOG.md"

6. Run as MCP memory for agents

qmd mcp

The flagship tool order for coding agents is:

  1. fetch_anticipatory_context
  2. query
  3. get / multi_get
  4. status

MCP Setup

Claude Code / Codex-style local agents

{
  "mcpServers": {
    "qmd": { "command": "qmd", "args": ["mcp"] }
  }
}

One-command wrapper bootstrap

qmd init --target hermes
qmd init --target gemini,kiro,vscode
qmd doctor --json

qmd init now:

  • profiles the host and recommends standard or lite
  • writes a bootstrap report to ~/.cache/qmd/bootstrap-report.json
  • emits or installs config for Hermes, Claude Code, Codex, Gemini CLI, Kiro, VS Code, OpenClaw, and pi
  • keeps the canonical qmd MCP namespace and tool order across every client

Wrapper targets

  • hermes: installs ~/.hermes/config.yaml
  • claude-code: installs ~/.claude.json
  • codex: installs ~/.codex/config.toml
  • gemini: installs ~/.gemini/settings.json
  • kiro: installs .kiro/settings/mcp.json
  • vscode: installs .vscode/mcp.json
  • openclaw, pi: emits ready-to-apply artifacts under ~/.cache/qmd/targets/

Install the packaged skill

qmd skill install --global --yes

The packaged skill is optimized for coding-agent memory and teaches the agent to:

  • use fetch_anticipatory_context first
  • use query for deeper retrieval
  • use get / multi_get for exact evidence

Anticipatory Memory Tool

fetch_anticipatory_context prefers the FlowState cache and falls back to a live project-memory query when needed.

Example payload:

{
  "recent_conversation": "What changed in the auth rollback plan and why did we revert the migration?",
  "refresh": false,
  "lite_mode": false
}

Typical uses:

  • current coding question with recent local context
  • migration/debugging follow-up questions
  • ADR/changelog lookups where the agent should feel preloaded

CLI Highlights

qmd status
qmd collection list
qmd collection update-cmd docs 'git pull --rebase --ff-only'
qmd collection exclude archive
qmd ls repo-memory/docs
qmd cleanup

Hardware Profiles

Standard

  • Qwen3 embedding + reranker models
  • best quality
  • recommended for Apple Silicon with 16GB+ RAM or comparable Linux hardware

Lite

  • Models: Qwen3-Embedding-0.6B + Qwen3-Reranker-0.6B (~1.2GB VRAM total)
  • Use case: Laptops, CI environments, or demo setups with <8GB RAM
  • Behavior: hybridQuery() caps results at 5, limits candidates to 15, and skips LLM reranking
  • Auto-engaged: Systems with <8GB RAM automatically use Lite mode

Use:

qmd flow ~/.codex/sessions/current.log --lite

The --lite flag can also be passed to qmd query and qmd embed for consistent low-memory operation.

Known Limits

  • The best results come from markdown knowledge, not arbitrary binary project assets.
  • Anticipatory memory depends on having an active session log to watch.
  • First model-backed search can be slower while local models warm up.
  • qmd collection add, qmd embed, and qmd update are intentionally manual operations; they are not auto-run by this repo's agents.

Development

Use Bun in this repo.

bun install
npx vitest run --reporter=verbose test/
bun test --preload ./src/test-preload.ts test/

Important commands:

bun src/cli/qmd.ts <command>
qmd skill show
qmd mcp --http --daemon
qmd mcp stop

Multi-Agent Idempotency

When multiple agents index the same project, FlowState-QMD prevents duplicate memories:

# Agent A indexes a doc about auth rollback
# Agent B tries to index a semantically identical doc
# → QMD detects 0.94 cosine similarity (threshold: 0.90)
# → Annotates existing memory instead of duplicating

Dedup stats are tracked via getDedupStats() and visible in qmd status.

Cache Telemetry

FlowState tracks cache hit/miss rates and refresh latency in ~/.cache/qmd/telemetry.json, surfaced via qmd status and the MCP status tool.

Benchmarks

Reproducible benchmarks in bench/:

bun bench/latency.ts       # FTS vs hybrid vs hybrid-lite search latency
bun bench/cache-hit.ts     # Intuition cache read/parse latency (1000 rounds)
bun bench/tool-calls.ts    # MCP tool round-trip simulation

All output JSON for programmatic consumption.

Architecture

┌─────────────────────────────────────────────────────────────┐
│                     Agent (Hermes, Claude Code, Codex, ...) │
│                              │                              │
│                    MCP tool calls                           │
│                              ▼                              │
│  ┌─────────────────────────────────────────────────────┐    │
│  │              MCP Server (stdio / HTTP)               │    │
│  │  fetch_anticipatory_context │ query │ get │ status   │    │
│  └──────────────┬──────────────┴───────┴─────┴──────┘    │
│                 │                        │               │
│     ┌───────────▼──────────┐   ┌────────▼────────┐      │
│     │   FlowState Engine   │   │   Store (SQLite) │      │
│     │  fs.watch + debounce │   │  FTS5 + vec      │      │
│     │  intuition.json      │   │  BM25 + cosine   │      │
│     │  telemetry.json      │   │  RRF fusion      │      │
│     └──────────────────────┘   │  LLM reranking   │      │
│                                │  Idempotency     │      │
│                                └──────────────────┘      │
│                                         │                │
│                              ┌──────────▼──────────┐     │
│                              │   node-llama-cpp    │     │
│                              │  Qwen3 Embed + Rank │     │
│                              └─────────────────────┘     │
└─────────────────────────────────────────────────────────────┘
  • Store: SQLite + FTS5 + sqlite-vec
  • Retrieval: BM25 + vector search + reciprocal rank fusion + reranking
  • FlowState: event-driven watcher that keeps intuition.json fresh
  • Idempotency: cosine similarity dedup at 0.90 threshold on document ingest
  • Telemetry: cache hit/miss tracking persisted to telemetry.json
  • Interfaces: CLI, MCP server, SDK, packaged agent skill

Submission Assets

Why Ask When Your Agent Can Already Know?

FlowState-QMD is not trying to be every kind of memory system. It is trying to be the best coding-agent memory layer: fast, local, inspectable, shared across agents, and strong enough to make project context feel native.

About

FlowState QMD — Anticipatory Memory for AI Agents. Hermes 2026 Hackathon Entry.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors