Implement Local Code Search Using Tantivy #253

@mstfash

Description

Add local full-text search using Tantivy to provide fast, offline text search for infrastructure code. This complements the existing semantic code search with traditional full-text search capabilities.

Current State

The local_code_search tool currently relies on API-based semantic search and covers infrastructure files only: .tf, .yaml, .yml, and Dockerfile.
Reference: libs/shared/src/utils.rs lines 110-126

Proposed Solution

Add Tantivy for fast, local-only text search that works without API calls or network connectivity.

Core Requirements (Infrastructure Files)

  • Add Tantivy dependency to workspace
  • Create indexing for infrastructure file types:
    • Terraform (.tf, .tfvars)
    • Kubernetes/YAML (.yaml, .yml)
    • Docker (Dockerfile, .dockerfile)
    • GitHub Actions (.github/workflows/*.yml)
  • Index files on startup (similar to existing code index)
  • Add new MCP tool tantivy_search for text queries
  • Store index in ~/.stakpak/tantivy_index/
  • Respect .gitignore patterns
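
The file-type requirement above could be sketched as a small predicate that the file walker calls before handing a file to the indexer. This is only an illustration: `is_infra_file` is a hypothetical name, and treating names such as `Dockerfile.dev` as Dockerfiles is an assumption not stated in this issue. Workflow files under .github/workflows/ need no special case here because they already match the .yml extension.

```rust
use std::path::Path;

/// Hypothetical helper: returns true when `path` looks like one of the
/// infrastructure file types listed in the core requirements.
fn is_infra_file(path: &Path) -> bool {
    let file_name = match path.file_name().and_then(|n| n.to_str()) {
        Some(name) => name,
        None => return false,
    };

    // Dockerfiles are matched by name, since they usually have no extension.
    // Accepting "Dockerfile.<suffix>" is an assumption made for this sketch.
    if file_name == "Dockerfile" || file_name.starts_with("Dockerfile.") {
        return true;
    }

    // Everything else is matched by (case-insensitive) extension.
    match path.extension().and_then(|e| e.to_str()) {
        Some(ext) => matches!(
            ext.to_ascii_lowercase().as_str(),
            "tf" | "tfvars" | "yaml" | "yml" | "dockerfile"
        ),
        None => false,
    }
}
```

A walker that already honors .gitignore would call this on each remaining path and skip anything for which it returns false.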

Additional Enhancement (Optional)

  • Support indexing common programming languages (.js, .ts, .py, .rs, .go, .java)
  • Add configuration to enable/disable additional file types

Technical Details

  • New module: cli/src/tantivy_index.rs
  • MCP tool location: libs/mcp/server/src/local_tools.rs
  • Reuse existing file walker logic from cli/src/code_index.rs
  • Index storage: ~/.stakpak/tantivy_index/
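
As a rough sketch of the storage location above, the index directory can be resolved from the user's home directory and created on first use. The function names are hypothetical, and reading `HOME` (falling back to `USERPROFILE`) is an assumption; the project may already have its own way of locating ~/.stakpak.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: builds `<home>/.stakpak/tantivy_index`.
fn index_dir(home: &Path) -> PathBuf {
    home.join(".stakpak").join("tantivy_index")
}

/// Resolves the home directory from the environment (an assumption of this
/// sketch) and creates the index directory if it does not exist yet.
fn ensure_index_dir() -> std::io::Result<PathBuf> {
    let home = std::env::var_os("HOME")
        .or_else(|| std::env::var_os("USERPROFILE"))
        .ok_or_else(|| {
            std::io::Error::new(std::io::ErrorKind::NotFound, "home directory not set")
        })?;
    let dir = index_dir(Path::new(&home));
    std::fs::create_dir_all(&dir)?;
    Ok(dir)
}
```

Keeping the path construction separate from the environment lookup makes the location easy to unit test and to override later if a configuration option is added.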

Basic Search Features

  • Simple text search with BM25 ranking
  • Search across file content
  • Return file path and matched snippets
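
To make the ranking bullet concrete, here is a minimal sketch of the standard Okapi BM25 score for a single query term. The search engine computes this internally, so nothing like this needs to be written for the feature; the function name and the constants k1 = 1.2 and b = 0.75 are common defaults assumed here, and the library's exact variant may differ.

```rust
/// Okapi BM25 contribution of one query term to one document's score.
///
/// * `tf`          - how often the term occurs in the document
/// * `doc_len`     - number of tokens in the document
/// * `avg_doc_len` - average number of tokens per document in the index
/// * `num_docs`    - total number of indexed documents
/// * `doc_freq`    - number of documents that contain the term
fn bm25_term_score(tf: f64, doc_len: f64, avg_doc_len: f64, num_docs: f64, doc_freq: f64) -> f64 {
    // Commonly used defaults; assumed for this sketch.
    const K1: f64 = 1.2;
    const B: f64 = 0.75;

    // Rare terms get a larger inverse document frequency.
    let idf = (1.0 + (num_docs - doc_freq + 0.5) / (doc_freq + 0.5)).ln();

    // Longer-than-average documents are penalized through this factor.
    let length_norm = 1.0 - B + B * (doc_len / avg_doc_len);

    idf * (tf * (K1 + 1.0)) / (tf + K1 * length_norm)
}
```

In practice this means a short Terraform file that mentions a rare resource name several times ranks above a long YAML file that mentions it once.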

Success Criteria

  • Fast text search for infrastructure files (<200ms on typical projects)
  • Works offline without API key
  • Complements existing semantic search
  • Basic documentation with examples

Benefits

  • Offline: No API required for basic text search
  • Fast: Local Rust-based search engine
  • Privacy: All data stays local
  • Complementary: Use alongside semantic search for different query types

Metadata

Assignees

No one assigned

Labels

enhancement (New feature or request), hacktoberfest (Hacktoberfest contributions welcome)