EPIC(feat): Hybrid Query Router — metadata, graph, and semantic query strategies #109

@mfittko

Summary

The current /query endpoint routes everything through semantic (vector) search. This works for natural language questions but is the wrong tool for structured lookups like:

  • "give me all Google Docs URLs I had open yesterday" → metadata filter, not embedding
  • "all related invoices for vendor OpenAI" → graph traversal, not cosine similarity

We need a routing layer that classifies query intent and dispatches to the right execution strategy.

Query Strategies

1. Metadata Filter (structured)

Direct Postgres WHERE on metadata fields — source URL patterns, timestamps, doc type, enrichment fields. No embedding needed.

Examples:

  • "all PDFs ingested last week"
  • "documents from docs.google.com"
  • "invoices with status enriched"
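A minimal sketch of how a structured filter could be turned into a parameterized WHERE clause. The filter fields and the column names (source_url, doc_type, ingested_at) are assumptions for illustration, not the project's actual schema.

```typescript
// Hypothetical filter object; field names and column names are assumptions.
interface MetadataFilter {
  sourceHost?: string; // e.g. "docs.google.com"
  docType?: string; // e.g. "pdf"
  ingestedAfter?: Date; // inclusive lower bound
  ingestedBefore?: Date; // exclusive upper bound
}

interface SqlFragment {
  where: string; // text to place after WHERE
  params: unknown[]; // values for $1, $2, ...
}

// Escape LIKE wildcards so the host is matched literally.
function escapeLike(value: string): string {
  return value.replace(/[\\%_]/g, "\\$&");
}

function buildWhere(filter: MetadataFilter): SqlFragment {
  const clauses: string[] = [];
  const params: unknown[] = [];
  // Push the value first so the placeholder index matches its position.
  const add = (column: string, op: string, value: unknown): void => {
    params.push(value);
    clauses.push(`${column} ${op} $${params.length}`);
  };

  if (filter.sourceHost !== undefined) {
    add("source_url", "LIKE", `%://${escapeLike(filter.sourceHost)}/%`);
  }
  if (filter.docType !== undefined) add("doc_type", "=", filter.docType);
  if (filter.ingestedAfter !== undefined) add("ingested_at", ">=", filter.ingestedAfter);
  if (filter.ingestedBefore !== undefined) add("ingested_at", "<", filter.ingestedBefore);

  // An empty filter matches everything.
  return { where: clauses.length > 0 ? clauses.join(" AND ") : "TRUE", params };
}
```

Values only ever travel as bound parameters, so user text never ends up in the SQL string itself.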

2. Graph Traversal (relational)

Start from a known entity, walk relationships in the knowledge graph. Return connected documents, not similarity-ranked ones.

Examples:

  • "all invoices related to vendor OpenAI"
  • "documents connected to entity X"
  • "what else references this contract"

3. Semantic Search (current behavior)

Embed query → pgvector cosine similarity → ranked results. No changes needed — this is what exists today.

4. Hybrid (combined)

Apply a metadata filter to narrow the result set, then rank semantically within it. Alternatively, run a graph traversal and re-rank the connected documents by relevance.

Examples:

  • "summarize all OpenAI invoices from last month"
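The filter-then-rank idea can be sketched in memory as below. In the real system the filter would presumably be a SQL WHERE and the ranking would be done by pgvector; the Candidate shape and function names here are hypothetical.

```typescript
// Hypothetical candidate shape for illustration only.
interface Candidate {
  id: string;
  docType: string;
  embedding: number[];
}

interface Ranked {
  id: string;
  score: number;
}

// Cosine similarity; missing dimensions are treated as 0.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const n = Math.max(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Step 1: narrow with the structured predicate. Step 2: rank survivors semantically.
function hybridRank(
  candidates: Candidate[],
  keep: (c: Candidate) => boolean,
  queryEmbedding: number[],
  topK: number,
): Ranked[] {
  return candidates
    .filter(keep)
    .map((c) => ({ id: c.id, score: cosine(c.embedding, queryEmbedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

Filtering first keeps the similarity computation proportional to the narrowed set rather than the whole collection.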

Router Design

POST /query
  { "text": "give me all google docs urls from yesterday" }
       │
       ▼
  ┌───────────┐
  │  Router   │  ← classifies intent
  └─────┬─────┘
        │
  ┌─────┴───────────────────┐
  │  Strategy Resolution    │
  │                         │
  │  1. Rule-based pass     │  ← fast, catches obvious patterns
  │  2. LLM fallback        │  ← for ambiguous queries
  │  3. Extract filters     │  ← time ranges, field names, entities
  │  4. Pick execution      │  ← metadata / graph / vector / hybrid
  └────────────┬────────────┘
               │
       ┌───────┼───────┐
       ▼       ▼       ▼
    metadata  graph  vector
    filter   traverse search
       └───────┼───────┘
               ▼
         merge & rank

Tiered classification (recommended approach)

  1. Fast rule pass — regex/keyword detection for time expressions, URL patterns, "related to", "connected to", entity names
  2. LLM fallback — only when rules are ambiguous; fast model classifies strategy + extracts structured filters
  3. Together, the two tiers add near-zero latency for common queries (no model call is made when a rule matches), with the LLM fallback covering edge cases
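A rough sketch of the fast rule pass, assuming a handful of illustrative regexes. The patterns and the classifyByRules name are hypothetical; a null return stands for "rules undecided, hand over to the LLM fallback".

```typescript
type Strategy = "metadata" | "graph" | "semantic" | "hybrid";

// Illustrative patterns only; a real rule set would be broader.
const TIME_RE = /\b(yesterday|today|last (week|month|year))\b/i;
const HOST_RE = /\b[a-z0-9-]+(\.[a-z0-9-]+)*\.[a-z]{2,}\b/i;
const GRAPH_RE = /\b(related to|connected to|references)\b/i;
const SEMANTIC_RE = /\b(summarize|explain|why|how)\b/i;

// Returns a strategy when the rules are confident, otherwise null.
function classifyByRules(text: string): Strategy | null {
  const structured = TIME_RE.test(text) || HOST_RE.test(text);
  const relational = GRAPH_RE.test(text);
  const semantic = SEMANTIC_RE.test(text);

  // A semantic verb plus a structured or relational cue suggests hybrid.
  if (semantic && (structured || relational)) return "hybrid";
  // Conflicting structured and relational cues: let the fallback decide.
  if (structured && relational) return null;
  if (structured) return "metadata";
  if (relational) return "graph";
  if (semantic) return "semantic";
  return null; // no signal at all
}
```

With these patterns, "documents from docs.google.com" resolves to metadata and "all invoices related to vendor OpenAI" resolves to graph without any model call.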

Prerequisites

Metadata query engine

  • Queryable index on source URL patterns, ingest timestamps, collection, doc type
  • Time expression parser ("yesterday", "last week", "in January")
  • Filter DSL or structured filter object
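A small sketch of a time expression parser covering a few phrases. It assumes UTC day boundaries, half-open ranges, and Monday as the start of the week; those semantics are assumptions here (the real ones belong to the time-semantics RFC), and parseTimeExpression is a hypothetical name.

```typescript
// Half-open range: since <= t < until.
interface TimeRange {
  since: Date;
  until: Date;
}

const DAY_MS = 24 * 60 * 60 * 1000;

function startOfUtcDay(d: Date): Date {
  return new Date(Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate()));
}

// `now` is injected so the function stays deterministic and testable.
function parseTimeExpression(text: string, now: Date): TimeRange | null {
  const today = startOfUtcDay(now);
  const t = text.toLowerCase();

  if (/\byesterday\b/.test(t)) {
    return { since: new Date(today.getTime() - DAY_MS), until: today };
  }
  if (/\btoday\b/.test(t)) {
    return { since: today, until: new Date(today.getTime() + DAY_MS) };
  }
  if (/\blast week\b/.test(t)) {
    // Assumption: "last week" is the previous Monday-to-Monday window.
    const daysSinceMonday = (today.getUTCDay() + 6) % 7;
    const thisMonday = new Date(today.getTime() - daysSinceMonday * DAY_MS);
    return { since: new Date(thisMonday.getTime() - 7 * DAY_MS), until: thisMonday };
  }
  return null; // unrecognized: the caller decides what to do
}
```

Phrases such as "in January" are left out on purpose, since they need a rule for which year is meant.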

Graph query engine

  • Extend /graph/entity/:name to return connected documents (not just entity details)
  • Multi-hop traversal with configurable depth
  • Relationship-type filtering ("vendor", "references", "part-of")
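Multi-hop traversal with a depth limit and relationship-type filter could look roughly like this breadth-first walk over an in-memory edge list. The Edge shape and directed edges are assumptions; the real engine would run against the graph tables rather than an array.

```typescript
// Hypothetical directed edge between two named entities or documents.
interface Edge {
  from: string;
  to: string;
  type: string; // e.g. "vendor", "references", "part-of"
}

interface Hit {
  entity: string;
  depth: number; // number of hops from the start entity
}

function traverse(
  edges: Edge[],
  start: string,
  maxDepth: number,
  types?: readonly string[],
): Hit[] {
  const visited = new Set<string>([start]); // prevents revisiting on cycles
  const hits: Hit[] = [];
  let frontier: string[] = [start];

  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const edge of edges) {
        if (edge.from !== node) continue;
        if (types !== undefined && types.indexOf(edge.type) === -1) continue;
        if (visited.has(edge.to)) continue;
        visited.add(edge.to);
        hits.push({ entity: edge.to, depth }); // first visit is the shortest path
        next.push(edge.to);
      }
    }
    frontier = next;
  }
  return hits;
}
```

Because the walk is breadth-first, each entity is reported once, at its smallest hop count.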

Unified response format

All strategies return the same shape:

{
  "results": [...],
  "strategy": "metadata|graph|semantic|hybrid",
  "filters_applied": {...}
}
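As a sketch, the shared shape could be pinned with a type plus a runtime guard. UnifiedResponse and isUnifiedResponse are hypothetical names, and the element type of results is left open because the issue does not specify it.

```typescript
const STRATEGIES = ["metadata", "graph", "semantic", "hybrid"] as const;
type Strategy = (typeof STRATEGIES)[number];

// Mirrors the JSON shape above; `results` items are left untyped here.
interface UnifiedResponse {
  results: unknown[];
  strategy: Strategy;
  filters_applied: Record<string, unknown>;
}

// Runtime check, useful in contract tests across all strategies.
function isUnifiedResponse(value: unknown): value is UnifiedResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const strategy = v["strategy"];
  const filters = v["filters_applied"];
  return (
    Array.isArray(v["results"]) &&
    typeof strategy === "string" &&
    STRATEGIES.some((s) => s === strategy) &&
    typeof filters === "object" &&
    filters !== null &&
    !Array.isArray(filters)
  );
}
```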

Suggested Build Order

  1. Metadata filter endpoint: POST /query gains optional filters (source pattern, time range, doc type, enrichment fields). Pure SQL.
  2. Graph traversal endpoint — extend graph API to return connected documents with hop depth + relationship filters.
  3. Router (rule-based) — classify queries, extract filters, dispatch.
  4. LLM fallback — fast model call for ambiguous queries.
  5. Hybrid execution — filter → semantic re-rank for complex queries.

Open Questions


EPIC Execution Plan (Assignment + Delivery)

This EPIC is now split into child implementation issues and RFC issues. Use the plan below for assignment and sequencing.

Child Issues

Implementation

RFCs

Suggested Assignment Matrix

Phase Gates

Definition of Done (EPIC)


Cross-Cutting Concerns (from refinement)

These concerns span multiple child issues and must be coordinated during implementation.

CC-1: Schema Validation Conflict (affects #111, #112, #116)

The current querySchema marks query as required and constrains it with pattern: "\\S" (at least one non-whitespace character). This conflicts with metadata-only queries (empty query + filter). Resolution: #112 must relax the schema and add a preValidation hook that requires either a non-empty query OR a filter with conditions.
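The either/or rule could be expressed as a small pure check that a preValidation hook calls. This is a sketch: the body shape, in particular a filter.conditions array, and the checkQueryBody name are assumptions.

```typescript
// Hypothetical request body; `filter.conditions` is an assumed shape.
interface QueryBody {
  query?: string;
  filter?: { conditions?: unknown[] };
}

// Returns null when the body is acceptable, otherwise an error message.
function checkQueryBody(body: QueryBody): string | null {
  // Same intent as the old pattern "\\S": at least one non-whitespace character.
  const hasQuery = typeof body.query === "string" && /\S/.test(body.query);
  const conditions = body.filter?.conditions;
  const hasFilter = Array.isArray(conditions) && conditions.length > 0;
  return hasQuery || hasFilter
    ? null
    : "Provide a non-empty query or a filter with at least one condition";
}
```

Keeping the rule in one function lets the hook and unit tests share it.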

CC-2: Migration Ordering (affects #113, #116)

Rule: #113 merges first, then #116. Both register in the sequential migrations array, so merge order determines migration order.

CC-3: LLM Endpoint Separation (affects #112)

ROUTER_LLM_MODEL must NOT default from EMBED_MODEL (embedding model ≠ generative model). Router must use Ollama /api/generate or OpenAI chat completions with a generative model.
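A sketch of config resolution that respects this rule. Returning null when ROUTER_LLM_MODEL is unset (for example, to disable the LLM fallback) is an assumption, as is the resolveRouterModel name.

```typescript
type Env = Record<string, string | undefined>;

// Returns the generative model for the router, or null when none is configured.
function resolveRouterModel(env: Env): string | null {
  const model = (env["ROUTER_LLM_MODEL"] ?? "").trim();
  // Deliberately no fallback to env["EMBED_MODEL"]: an embedding model
  // cannot serve a generate or chat-completion request.
  return model.length > 0 ? model : null;
}
```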

CC-4: Graph CTE Guardrails (affects #113, #114)

Current recursive CTE lacks timeout, depth-aware dedup, entity cap, and cycle detection. SqlGraphBackend consolidation must ADD these guardrails, not just relocate existing SQL.
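One way the four guardrails might fit together is sketched below. The schema (entities(id, name), relationships(source_id, target_id)), the hard limits, and all names are assumptions for illustration, not the existing SqlGraphBackend code.

```typescript
// Run inside the same transaction, before the traversal, to bound its runtime.
const TIMEOUT_SQL = "SELECT set_config('statement_timeout', $1, true)";

// Hypothetical schema: entities(id, name), relationships(source_id, target_id).
const TRAVERSAL_SQL = `
WITH RECURSIVE walk(entity_id, depth, path) AS (
  SELECT e.id, 0, ARRAY[e.id]
  FROM entities e
  WHERE e.name = $1
  UNION ALL
  SELECT r.target_id, w.depth + 1, w.path || r.target_id
  FROM walk w
  JOIN relationships r ON r.source_id = w.entity_id
  WHERE w.depth < $2                 -- depth cap
    AND r.target_id <> ALL (w.path)  -- cycle detection
)
SELECT entity_id, MIN(depth) AS min_depth  -- depth-aware dedup
FROM walk
GROUP BY entity_id
ORDER BY min_depth, entity_id
LIMIT $3                             -- entity cap
`;

interface TraversalLimits {
  maxDepth: number;
  maxEntities: number;
  timeoutMs: number;
}

// Assumed server-side ceilings; callers can only go lower.
const HARD_LIMITS: TraversalLimits = { maxDepth: 5, maxEntities: 500, timeoutMs: 2000 };

function clamp(value: number, max: number): number {
  if (!Number.isFinite(value)) return max;
  return Math.min(Math.max(1, Math.floor(value)), max);
}

interface TraversalParams {
  timeout: [string]; // parameters for TIMEOUT_SQL (milliseconds as text)
  traversal: [string, number, number]; // parameters for TRAVERSAL_SQL
}

function traversalParams(entityName: string, requested: Partial<TraversalLimits>): TraversalParams {
  const maxDepth = clamp(requested.maxDepth ?? HARD_LIMITS.maxDepth, HARD_LIMITS.maxDepth);
  const maxEntities = clamp(requested.maxEntities ?? HARD_LIMITS.maxEntities, HARD_LIMITS.maxEntities);
  const timeoutMs = clamp(requested.timeoutMs ?? HARD_LIMITS.timeoutMs, HARD_LIMITS.timeoutMs);
  return { timeout: [String(timeoutMs)], traversal: [entityName, maxDepth, maxEntities] };
}
```

Clamping on the server keeps a client-supplied depth or limit from lifting the caps.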

CC-5: QueryResult Interface Merge Order (affects #112, #113, #117)

Three issues modify QueryResult. Merge order: #113 (graph type) → #112 (routing field) → #117 (pin canonical shape).

CC-6: CLI Type Duplication (affects #116, #117)

cli/src/lib/types.ts duplicates QueryResult. Both #116 and #117 must update it. CLI --since/--until owned by #116; --strategy/--verbose owned by #117.


Mermaid — Dependency Graph

flowchart TD
    A[EPIC #109]

    R1[RFC #111 Router Contract]
    R2[RFC #115 Metadata DSL + Time Semantics]
    R3[RFC #114 Graph Traversal Contract]

    I1[#116 Metadata Strategy Engine]
    I2[#113 Graph Strategy Engine]
    I3[#112 Router + Classifier]
    I4[#110 Hybrid Execution + Rerank]
    I5[#117 Unified Response + CLI/Docs]

    A --> R1
    A --> R2
    A --> R3

    R1 --> I3
    R2 --> I1
    R3 --> I2

    I1 --> I3
    I2 --> I3

    I3 --> I4
    I1 --> I4
    I2 --> I4

    I4 --> I5
    I3 --> I5

    X[(#106 AGE follow-up)] -. soft dependency .-> I2

    style I2 fill:#f9f,stroke:#333
    style I1 fill:#f9f,stroke:#333

Pink nodes (#113, #116) form Phase 1 and can be developed in parallel, but #113 must merge first.

Mermaid — Delivery Sequence (Actual)

gantt
    title EPIC #109 Delivery (actual)
    dateFormat  YYYY-MM-DD HH:mm
    axisFormat  %m/%d %H:%M

    section RFC
    #111 Router contract           :done, r1, 2026-02-25 17:23, 2026-02-25 19:04
    #115 Metadata DSL/time         :done, r2, 2026-02-25 17:23, 2026-02-25 19:04
    #114 Graph contract            :done, r3, 2026-02-25 17:23, 2026-02-25 19:04

    section Core Implementation
    #113 Graph strategy engine     :done, i2, 2026-02-25 17:23, 2026-02-25 20:33
    #116 Metadata engine           :done, i1, 2026-02-25 17:23, 2026-02-25 21:14

    section Orchestration
    #112 Router + classifier       :done, i3, 2026-02-25 17:23, 2026-02-25 22:45
    #110 Hybrid rerank             :done, i4, 2026-02-25 17:23, 2026-02-26 00:21

    section Contract + Rollout
    #117 Unified response + docs   :done, i5, 2026-02-25 17:23, 2026-02-26 08:25

    section Follow-up
    #123 Docs alignment            :active, f1, 2026-02-26 09:23, 1d
    #124 Test coverage gap         :active, f2, 2026-02-26 09:26, 1d

All closed issues use their actual created_at and closed_at timestamps. Open follow-up issues (#123, #124) are shown as active.
