session-seek

Search your AI coding agent session history with semantic + keyword search, served as an MCP server.

Supports Claude Code and OpenAI Codex CLI sessions.

Why

AI coding agents store every conversation in JSONL files, but provide no way to search across past sessions. session-seek fills that gap with two complementary search modes:

  • search_sessions — Semantic vector search. Find discussions by meaning, not exact words. Great for "how did we solve that auth problem?" or "what was the architectural decision about caching?"
  • grep_sessions — Exact keyword/regex search powered by ripgrep. Find specific words, error messages, or variable names that vector search misses.

Plus utilities:

  • get_session_context — Expand context around a search result
  • index_sessions — Build/update the vector index (incremental, line-level)
  • session_stats — Database statistics
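To make the difference between the two search modes concrete, here is a toy sketch. It is not session-seek's code: the vectors are hand-written stand-ins for real embeddings, and `cosineSimilarity` and `keywordMatch` are hypothetical helper names. Semantic search compares vectors by direction, while keyword search checks for the literal text.

```typescript
// Toy illustration only: real embeddings come from an embedding API,
// not from hand-written vectors like the ones in the usage note below.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as unrelated to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Literal substring match, case-insensitive unless asked otherwise.
function keywordMatch(text: string, pattern: string, caseSensitive = false): boolean {
  return caseSensitive
    ? text.includes(pattern)
    : text.toLowerCase().includes(pattern.toLowerCase());
}
```

A query about "login failures" can score high against a message about "auth errors" if their embeddings point the same way, even with no shared words. `keywordMatch("Set REDIS_URL in .env", "redis_url")` succeeds only because the literal text is present.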

Features

  • Incremental line-level indexing — only processes new lines since last run
  • Auto-indexing via Claude Code Stop hook — index triggers every time Claude finishes responding
  • Content type classification (discussion / code / error / decision / tool)
  • Session grouping for broad queries
  • Reranking for higher quality semantic results
  • Time filtering, project filtering, role filtering
  • File lock to prevent concurrent indexing
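Incremental line-level indexing can be pictured as remembering how many lines of each session file were already processed. The sketch below assumes an in-memory `IndexState` map and a `takeNewLines` helper, both hypothetical names. The real tool presumably persists this state somewhere, and its bookkeeping may differ.

```typescript
// Hypothetical state: number of non-empty lines already indexed per session file.
type IndexState = Map<string, number>;

// Return only the lines that were appended since the last run, then update the state.
function takeNewLines(state: IndexState, filePath: string, fileContent: string): string[] {
  // JSONL files hold one record per line; skip blank lines such as the trailing one.
  const lines = fileContent.split("\n").filter((line) => line.trim() !== "");
  const alreadyIndexed = state.get(filePath) ?? 0;
  const fresh = lines.slice(alreadyIndexed);
  state.set(filePath, lines.length);
  return fresh;
}
```

Calling `takeNewLines` twice on an unchanged file returns an empty array the second time, so an unchanged session costs no embedding calls.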

Prerequisites

  • Bun runtime
  • ripgrep (brew install ripgrep / apt install ripgrep)
  • An embedding API key (optional — only needed for semantic search, not for grep_sessions). Works with any OpenAI-compatible embedding API.

Installation

git clone https://github.com/AndySze/session-seek.git
cd session-seek
bun install

Configuration

1. Add MCP server to Claude Code

Quick setup (one command):

# Without API key (grep_sessions only):
claude mcp add -s user session-seek -- bun run /path/to/session-seek/index.ts

# With API key (full semantic search):
claude mcp add -s user -e OPENAI_API_KEY=sk-... session-seek -- bun run /path/to/session-seek/index.ts

Defaults to OpenAI text-embedding-3-small. Any OpenAI-compatible embedding API works — see Environment variables to customize.

Or manually edit ~/.claude.json:
{
  "mcpServers": {
    "session-seek": {
      "command": "bun",
      "args": ["run", "/path/to/session-seek/index.ts"],
      "env": {
        "SESSION_SEEK_API_KEY": "your-api-key-here"
      }
    }
  }
}

2. Add MCP server to Codex CLI

# Without API key (grep_sessions only):
codex mcp add session-seek -- bun run /path/to/session-seek/index.ts

# With API key (full semantic search):
codex mcp add session-seek --env SESSION_SEEK_API_KEY=your-key -- bun run /path/to/session-seek/index.ts

3. (Optional) Auto-index on Stop hook

Add to ~/.claude/settings.json under hooks.Stop:

{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "bun run /path/to/session-seek/index-hook.ts",
            "timeout": 120,
            "async": true
          }
        ]
      }
    ]
  }
}

This indexes the current project's sessions every time Claude finishes responding.
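Because the hook fires after every response, two indexing runs could overlap, which is what the file lock listed under Features guards against. The following is a minimal sketch of one common way to build such a lock, using exclusive file creation. The `tryAcquireLock` helper and the lock path are assumptions for illustration and may not match what index-hook.ts does.

```typescript
import { closeSync, openSync, unlinkSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Try to take the lock. Returns a release function, or null if another run holds it.
function tryAcquireLock(lockPath: string): (() => void) | null {
  try {
    // The "wx" flag creates the file and fails with EEXIST if it already exists.
    const fd = openSync(lockPath, "wx");
    closeSync(fd);
    return () => unlinkSync(lockPath);
  } catch (err) {
    if ((err as { code?: string }).code === "EEXIST") return null;
    throw err;
  }
}

// Hypothetical lock location; the real tool may keep its lock elsewhere.
const lockPath = join(tmpdir(), `session-seek-example-${process.pid}.lock`);
const release = tryAcquireLock(lockPath);
if (release) {
  try {
    // ... run incremental indexing here ...
  } finally {
    release();
  }
}
```

A lock like this is left behind if the process crashes before `release()` runs, so a production version would likely also need a staleness check.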

4. Environment variables

Embedding API (any OpenAI-compatible endpoint):

| Variable | Default | Description |
| --- | --- | --- |
| SESSION_SEEK_API_KEY | (none) | API key for embeddings/reranking. Falls back to OPENAI_API_KEY. Only needed for search_sessions. |
| SESSION_SEEK_EMBED_API_URL | https://api.openai.com/v1/embeddings | Embedding API endpoint (any OpenAI-compatible API) |
| SESSION_SEEK_EMBED_MODEL | text-embedding-3-small | Embedding model name |
| SESSION_SEEK_EMBED_DIM | 1024 | Embedding dimensions (must match model output) |
| SESSION_SEEK_RERANK_API_URL | (none) | Reranking API endpoint (optional, falls back to vector distance) |
| SESSION_SEEK_RERANK_MODEL | (none) | Reranking model name |

Paths:

| Variable | Default | Description |
| --- | --- | --- |
| SESSION_SEEK_DB_PATH | ~/.claude/session-seek-db | LanceDB storage directory |
| SESSION_SEEK_CLAUDE_ROOT | ~/.claude/projects | Claude Code sessions directory |
| SESSION_SEEK_CODEX_ROOT | ~/.codex | Codex CLI home directory |
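The defaults and the OPENAI_API_KEY fallback for the embedding variables can be summarized in code. This is a sketch under assumptions: `EmbeddingConfig` and `loadEmbeddingConfig` are hypothetical names, and the actual resolution logic in session-seek may be organized differently.

```typescript
interface EmbeddingConfig {
  apiKey: string | undefined;
  embedApiUrl: string;
  embedModel: string;
  embedDim: number;
  rerankApiUrl: string | undefined;
  rerankModel: string | undefined;
}

// Resolve settings from an environment-like object, applying the documented defaults.
function loadEmbeddingConfig(env: Record<string, string | undefined>): EmbeddingConfig {
  return {
    // SESSION_SEEK_API_KEY wins; OPENAI_API_KEY is the documented fallback.
    apiKey: env["SESSION_SEEK_API_KEY"] ?? env["OPENAI_API_KEY"],
    embedApiUrl: env["SESSION_SEEK_EMBED_API_URL"] ?? "https://api.openai.com/v1/embeddings",
    embedModel: env["SESSION_SEEK_EMBED_MODEL"] ?? "text-embedding-3-small",
    embedDim: Number(env["SESSION_SEEK_EMBED_DIM"] ?? "1024"),
    // Reranking stays undefined unless configured.
    rerankApiUrl: env["SESSION_SEEK_RERANK_API_URL"],
    rerankModel: env["SESSION_SEEK_RERANK_MODEL"],
  };
}
```

In a Bun or Node process this would be called as `loadEmbeddingConfig(process.env)`. Passing the environment in as a parameter keeps the function easy to test.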

When using a custom embedding provider or multiple env vars, edit ~/.claude.json directly:

{
  "mcpServers": {
    "session-seek": {
      "command": "bun",
      "args": ["run", "/path/to/session-seek/index.ts"],
      "env": {
        "SESSION_SEEK_API_KEY": "your-key",
        "SESSION_SEEK_EMBED_API_URL": "https://your-provider.com/v1/embeddings",
        "SESSION_SEEK_EMBED_MODEL": "your-model-name",
        "SESSION_SEEK_RERANK_API_URL": "https://your-provider.com/v1/rerank",
        "SESSION_SEEK_RERANK_MODEL": "your-rerank-model"
      }
    }
  }
}

Any provider with an OpenAI-compatible /v1/embeddings endpoint works. Reranking is optional — if not configured, search falls back to vector distance ranking.
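The fallback from reranking to vector distance might look like the sketch below. The `Hit` shape and `rankHits` function are hypothetical; the sketch assumes that a higher rerank score means more relevant and a smaller vector distance means closer.

```typescript
interface Hit {
  id: string;
  distance: number;      // vector distance: smaller is closer
  rerankScore?: number;  // present only when a reranker was configured and called
}

// Order hits by rerank score when every hit has one, otherwise by vector distance.
function rankHits(hits: Hit[]): Hit[] {
  const allReranked = hits.length > 0 && hits.every((h) => h.rerankScore !== undefined);
  const sorted = [...hits]; // copy so the caller's array is left untouched
  if (allReranked) {
    sorted.sort((a, b) => (b.rerankScore ?? 0) - (a.rerankScore ?? 0));
  } else {
    sorted.sort((a, b) => a.distance - b.distance);
  }
  return sorted;
}
```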

Usage

Once configured, the tools are available in any Claude Code session. Claude will use them automatically when you reference past conversations:

  • "How did we solve that auth issue?" → search_sessions
  • "Find where we mentioned REDIS_URL" → grep_sessions
  • "What was the build error last week?" → search_sessions with content_type: "error"
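Under the hood, an MCP client invokes a tool by sending a JSON-RPC `tools/call` request. The sketch below builds such a request for the third example above. `buildToolCall` is a hypothetical helper, and the argument values are illustrative.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Build the JSON-RPC message an MCP client sends to invoke a tool.
function buildToolCall(id: number, name: string, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Roughly what "What was the build error last week?" could translate to.
const request = buildToolCall(1, "search_sessions", {
  query: "build error",
  content_type: "error",
});
console.log(JSON.stringify(request, null, 2));
```

In normal use Claude Code or Codex CLI constructs these messages itself; the sketch only shows what travels over the wire.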

search_sessions

Semantic search with vector similarity + reranking.

Parameters:

  • query (required) — search text
  • project — filter by project name (default: current project, * for all)
  • limit — max results (default: 10)
  • role — filter by user or assistant
  • content_type — filter by discussion, code, error, decision, tool
  • time_after / time_before — ISO date filtering
  • group_by_session — group results by session
  • context_chunks — include surrounding messages (0-5)
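The parameter list above maps naturally onto a TypeScript type. The sketch below also applies the documented defaults. `normalizeSearchParams` is a hypothetical helper, and the default of 0 for context_chunks is an assumption, since the list only gives its range.

```typescript
type Role = "user" | "assistant";
type ContentType = "discussion" | "code" | "error" | "decision" | "tool";

interface SearchSessionsParams {
  query: string;
  project?: string;
  limit?: number;
  role?: Role;
  content_type?: ContentType;
  time_after?: string;
  time_before?: string;
  group_by_session?: boolean;
  context_chunks?: number;
}

// Validate the input and fill in defaults for anything left out.
function normalizeSearchParams(input: SearchSessionsParams, currentProject: string) {
  if (input.query.trim() === "") throw new Error("query is required");
  for (const key of ["time_after", "time_before"] as const) {
    const value = input[key];
    if (value !== undefined && Number.isNaN(Date.parse(value))) {
      throw new Error(`${key} must be an ISO date`);
    }
  }
  const chunks = input.context_chunks ?? 0; // assumed default, see lead-in
  return {
    ...input,
    project: input.project ?? currentProject, // "*" would mean all projects
    limit: input.limit ?? 10,
    context_chunks: Math.min(5, Math.max(0, Math.trunc(chunks))), // clamp to 0-5
  };
}
```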

grep_sessions

Exact keyword/regex search via ripgrep. No indexing required.

Parameters:

  • pattern (required) — keyword or regex
  • project — filter by project name (default: current project, * for all)
  • regex — treat pattern as regex (default: false)
  • case_sensitive — (default: false)
  • role — filter by user or assistant
  • limit — max results (default: 20)
  • context_lines — surrounding messages per match (default: 1)
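One way these parameters could translate into a ripgrep invocation is sketched below. `buildRipgrepArgs` and `searchRoot` are hypothetical names, and the flags session-seek actually passes may differ. The ripgrep flags themselves (`--json`, `--glob`, `--fixed-strings`, `--ignore-case`, `--case-sensitive`, `-e`) are standard.

```typescript
interface GrepSessionsParams {
  pattern: string;
  regex?: boolean;
  case_sensitive?: boolean;
}

// Translate tool parameters into an argument list for the rg binary.
function buildRipgrepArgs(params: GrepSessionsParams, searchRoot: string): string[] {
  // JSON output is easy to parse; the glob restricts the search to session files.
  const args = ["--json", "--glob", "*.jsonl"];
  // regex defaults to false, so the pattern is treated as a literal string.
  if (!params.regex) args.push("--fixed-strings");
  // case_sensitive defaults to false.
  args.push(params.case_sensitive ? "--case-sensitive" : "--ignore-case");
  // -e keeps patterns that start with "-" from being parsed as flags.
  args.push("-e", params.pattern, searchRoot);
  return args;
}
```

Passing the arguments as an array to a process spawner, rather than building a shell string, avoids quoting problems with patterns that contain spaces or shell metacharacters.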

Architecture

~/.claude/projects/**/*.jsonl           ← Claude Code sessions
~/.codex/sessions/YYYY/MM/DD/*.jsonl    ← Codex CLI sessions
        │
        ├──► providers/claude.ts        ← Claude Code JSONL parser
        ├──► providers/codex.ts         ← Codex CLI rollout parser
        │         │
        │    parser.ts (unified)
        │         │
        ├──► index-hook.ts              ← Stop hook: incremental indexing
        │         │
        │         ▼
        │     LanceDB (vectors)         ← search_sessions queries this
        │
        └──► ripgrep                    ← grep_sessions searches files directly

  • Providers: Pluggable architecture; add new agents by implementing SessionProvider
  • Embedding: Any OpenAI-compatible API (default: OpenAI text-embedding-3-small, 1024 dimensions)
  • Reranking: Optional (configurable, falls back to vector distance)
  • Vector DB: LanceDB (local, no server needed)
  • Keyword search: ripgrep (fast, parallel, regex-capable)
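To illustrate the pluggable provider idea, here is a sketch of what a SessionProvider could look like. The interface members, the `SessionMessage` shape, and the `{"role", "text"}` line format are all assumptions for illustration. The real interface in session-seek and the real Claude Code and Codex JSONL schemas are likely richer.

```typescript
interface SessionMessage {
  role: "user" | "assistant";
  text: string;
}

// Hypothetical provider contract: turn one JSONL line into a message, or skip it.
interface SessionProvider {
  name: string;
  parseLine(line: string): SessionMessage | null;
}

// Example provider for an imaginary agent that logs {"role": ..., "text": ...} lines.
const exampleProvider: SessionProvider = {
  name: "example-agent",
  parseLine(line) {
    let record: unknown;
    try {
      record = JSON.parse(line);
    } catch {
      return null; // not valid JSON, e.g. a partially written line
    }
    if (typeof record !== "object" || record === null) return null;
    const fields = record as Record<string, unknown>;
    const role =
      fields["role"] === "user" ? "user" : fields["role"] === "assistant" ? "assistant" : null;
    const text = fields["text"];
    if (role === null || typeof text !== "string") return null;
    return { role, text };
  },
};
```

Returning `null` for lines a provider does not understand lets the unified parser skip tool metadata or malformed records without aborting the whole file.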

License

MIT
