Closed
Labels
enhancement: New feature or request
Description
Overview
Major overhaul of the RAG pipeline: query understanding, hybrid search, LLM reranking, redesigned tools, richer context assembly, conversation memory, and observability.
Tasks
- Query analysis pipeline: Build `analyzeQuery()`, a single Gemini Flash call extracting intent, entities, temporal resolution, and query rewrites. Replace the `get_current_date` tool. Add entity pre-resolution.
- Shared embedding + hybrid search: Generate the query embedding once per request. Add `tsvector` + GIN indexes on transcript_segments, motions, agenda_items, matters, documents. Create a `hybrid_search` RPC with reciprocal rank fusion (RRF).
- LLM reranking: Batch top-30 candidates into a single Gemini Flash call, score 0-10. Only trigger for complex queries with >10 candidates.
- Redesigned tool set: `search_discussions` (replaces search_transcript_segments + search_agenda_items), `search_decisions` (replaces search_motions), `get_person_activity` (merges statements + voting), `search_documents` (new), `get_meeting_context` (new), `get_timeline` (new).
- Rich context assembly: Replace `truncateForContext()` with structured text formatting. Build full evidence bundles for the synthesizer with quotes, vote breakdowns, document excerpts.
- Orchestrator improvements: Adaptive step budget by query intent. Self-correction (retry with broader query on sparse results). Parallel tool execution via `Promise.all`.
- Answer quality: Confidence scoring. Claim verification pass. Confidence indicator in UI.
- Conversation memory: Store state in KV (session_id → turns with evidence + entities). Pronoun resolution. Evidence reuse for follow-ups. Suggested follow-up questions.
- Observability: Per-question telemetry. Admin quality dashboard. Thumbs up/down feedback buttons.
- Search UI: Unified search bar with auto-routing. Suggested follow-ups as clickable chips. Enhanced citation hover cards.
- Fix `search_agenda_items` injection vulnerability: Sanitize query in PostgREST `ilike` filter.
- Create missing RPC functions: `match_transcript_segments`, `match_motions`, `match_matters`, `match_agenda_items`.
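
The hybrid search task relies on reciprocal rank fusion to merge vector and full-text results. As a rough sketch of the fusion step only (written in TypeScript for illustration, not the SQL of the planned `hybrid_search` RPC), the following assumes each search returns a ranked list of IDs and uses the commonly chosen constant k = 60. The function name `fuseRrf` and the sample IDs are hypothetical.

```typescript
// Reciprocal rank fusion: score(id) = sum over lists of 1 / (k + rank),
// where rank is the 1-based position of the id within each ranked list.
// `fuseRrf` and the k = 60 default are illustrative assumptions.
function fuseRrf(rankedLists: string[][], k = 60): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical results: one list from vector search, one from tsvector search.
const vectorHits = ["seg-1", "seg-2", "seg-3"];
const keywordHits = ["seg-3", "seg-1", "seg-4"];
const fused = fuseRrf([vectorHits, keywordHits]);
```

Items that appear in both lists (here `seg-1` and `seg-3`) accumulate two contributions, so they rise above items found by only one search, which is the behavior the "vector-only misses exact matches" problem calls for.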
Current Problems
- Redundant embedding generation (same embedding generated 3x per question)
- No query rewriting — raw user text passed to vector search
- Truncation destroys context (15 items, 2000 chars cap)
- No hybrid search — vector-only misses exact matches
- No reranking — HNSW scores are approximate
- Single-shot orchestration with no self-correction
- No document search capability
- Thin synthesizer context (120-char titles, not evidence text)
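
The redundant embedding problem and the planned `Promise.all` tool execution interact: tools that start concurrently will each request an embedding unless the in-flight call is shared. One possible approach, sketched below, is to cache the embedding promise per request. The `embed` function here is a hypothetical stand-in for the real embedding call, and `createRequestEmbedder` and `runSearches` are illustrative names, not part of the existing codebase.

```typescript
type Embedding = number[];

// Hypothetical stand-in for the real embedding API; counts invocations.
let embedCalls = 0;
async function embed(text: string): Promise<Embedding> {
  embedCalls += 1;
  return Array.from(text).map((ch) => ch.charCodeAt(0) / 255);
}

// Wraps an embedder so each distinct query text is embedded once per request.
// Caching the promise (not the resolved value) also dedupes concurrent callers.
function createRequestEmbedder(embedFn: (text: string) => Promise<Embedding>) {
  const cache = new Map<string, Promise<Embedding>>();
  return (text: string): Promise<Embedding> => {
    let pending = cache.get(text);
    if (!pending) {
      pending = embedFn(text);
      cache.set(text, pending);
    }
    return pending;
  };
}

// Three searches start concurrently but share a single embedding call.
async function runSearches(query: string): Promise<Embedding[]> {
  const getEmbedding = createRequestEmbedder(embed);
  return Promise.all([getEmbedding(query), getEmbedding(query), getEmbedding(query)]);
}
```

Creating the embedder inside the request handler keeps the cache scoped to one question, so nothing leaks between users. A rejected promise stays cached in this sketch; a real implementation would likely evict it so a retry can succeed.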
Tool Redesign
| New Tool | Replaces | Purpose |
|---|---|---|
| search_discussions | search_transcript_segments + search_agenda_items | What was discussed |
| search_decisions | search_motions | What was decided |
| get_person_activity | get_statements_by_person + get_voting_history | Everything about a person |
| search_documents | (new) | Staff reports, attachments |
| get_meeting_context | (new) | Full meeting detail |
| get_timeline | (new) | Chronological matter history |
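
During the migration it may help to encode the table above as data, so that stored conversations or prompts that still mention a legacy tool name can be routed to its replacement. The sketch below is one way to do that. The tool names come from the table, while `NEW_TOOLS`, `legacyToolReplacements`, and `resolveToolName` are hypothetical names introduced for illustration.

```typescript
// New tool names, taken from the "New Tool" column.
const NEW_TOOLS = [
  "search_discussions",
  "search_decisions",
  "get_person_activity",
  "search_documents",
  "get_meeting_context",
  "get_timeline",
] as const;
type NewTool = (typeof NEW_TOOLS)[number];

// Legacy tool name -> replacement, taken from the "Replaces" column.
const legacyToolReplacements = new Map<string, NewTool>([
  ["search_transcript_segments", "search_discussions"],
  ["search_agenda_items", "search_discussions"],
  ["search_motions", "search_decisions"],
  ["get_statements_by_person", "get_person_activity"],
  ["get_voting_history", "get_person_activity"],
]);

// Returns the current tool for a name, or undefined if the name is unknown.
function resolveToolName(name: string): NewTool | undefined {
  if ((NEW_TOOLS as readonly string[]).includes(name)) {
    return name as NewTool;
  }
  return legacyToolReplacements.get(name);
}
```

Returning `undefined` for unknown names, rather than throwing, leaves the caller free to decide whether an unrecognized tool is an error or simply ignored.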