Feature Summary
Implement an AI-powered interactive chat system that allows users to select multiple media files from the gallery view and start a conversational AI session with those transcripts as context. The system should use Retrieval Augmented Generation (RAG) with OpenSearch to provide accurate, context-aware responses about the selected transcript content, mimicking the ChatGPT interface experience.
Problem Statement
Users often want to ask questions about their transcripts, extract specific information, or analyze content across multiple recordings. Currently, they must manually read through entire transcripts to find relevant information. An interactive AI chat system would allow users to:
Ask questions about specific topics across multiple transcripts
Get summaries of discussions on particular subjects
Find action items, decisions, or key points mentioned by specific speakers
Analyze trends and patterns across multiple meetings/recordings
Extract insights without manually searching through hours of content
❌ RAG System: No retrieval augmented generation implementation
❌ Chat Session Management: No backend chat session handling
Selection Infrastructure (Already Available)
```typescript
// From MediaLibrary.svelte
let selectedFiles = new Set<number>();
// Complete multi-select with UI controls already implemented
```
Proposed Solution
User Experience Flow
Gallery Selection: User selects one or more media files using existing checkbox system
Chat Initiation: New "Start AI Chat" button appears when files are selected
Chat Session: Modal or full-page chat interface opens with ChatGPT-like experience
Context Loading: System loads selected transcripts as RAG context
Interactive Chat: User asks questions, AI responds with context-aware answers
Reference Links: Responses include links to specific transcript segments/timestamps
Technical Architecture
RAG System with OpenSearch
Following industry best practices for enterprise RAG implementation:
graph TD
A[User Query] --> B[Query Processing]
B --> C[OpenSearch Retrieval]
C --> D[Context Ranking]
D --> E[LLM + Context]
E --> F[Response Generation]
F --> G[Response with Citations]
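The pipeline above can be sketched end to end in Python. The function names are illustrative, and the naive term-overlap retriever is a stand-in for the real OpenSearch call, not the project's actual implementation:

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    file_id: int
    text: str
    score: float


def process_query(raw: str) -> str:
    # Query Processing: normalize whitespace and casing before retrieval.
    return " ".join(raw.split()).lower()


def retrieve(query: str, corpus: list[Chunk]) -> list[Chunk]:
    # Retrieval: stand-in for OpenSearch, scoring by naive term overlap.
    terms = set(query.split())
    hits = []
    for c in corpus:
        overlap = len(terms & set(c.text.lower().split()))
        if overlap:
            hits.append(Chunk(c.file_id, c.text, float(overlap)))
    return hits


def rank(hits: list[Chunk], top_k: int = 3) -> list[Chunk]:
    # Context Ranking: keep the highest-scoring chunks.
    return sorted(hits, key=lambda c: c.score, reverse=True)[:top_k]


def build_prompt(query: str, context: list[Chunk]) -> str:
    # LLM + Context: assemble the prompt handed to the model.
    ctx = "\n".join(f"[file {c.file_id}] {c.text}" for c in context)
    return f"CONTEXT:\n{ctx}\n\nQUESTION: {query}"
```

The chunk `file_id` carried through the pipeline is what later lets responses cite specific files and timestamps.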
```svelte
<script lang="ts">
  export let selectedFiles: MediaFile[];
  export let currentContext: ContextChunk[];
  // Display selected files as context
  // Show current relevant segments
  // Jump-to-transcript functionality
</script>
```
```svelte
<!-- Add to existing selection controls -->
<div class="selection-controls">
  <!-- Existing buttons -->
  {#if selectedFiles.size > 0}
    <button
      class="chat-btn"
      on:click={startChatSession}
      title="Start AI chat with {selectedFiles.size} selected file{selectedFiles.size === 1 ? '' : 's'}"
    >
      <ChatIcon />
      Start AI Chat ({selectedFiles.size})
    </button>
  {/if}
</div>
```
Database Schema
Chat Session Management (Optional - for history)
```sql
-- Optional: for persistent chat history
CREATE TABLE chat_session (
    id VARCHAR(255) PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES "user"(id) ON DELETE CASCADE,
    title VARCHAR(255),
    file_ids INTEGER[] NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    expires_at TIMESTAMP WITH TIME ZONE
);

-- PostgreSQL does not support inline INDEX clauses; create indexes separately.
CREATE INDEX idx_chat_session_user_id ON chat_session (user_id);
CREATE INDEX idx_chat_session_created_at ON chat_session (created_at);
```
```sql
CREATE TABLE chat_message (
    id SERIAL PRIMARY KEY,
    session_id VARCHAR(255) NOT NULL REFERENCES chat_session(id) ON DELETE CASCADE,
    role VARCHAR(20) NOT NULL,  -- 'user' or 'assistant'
    content TEXT NOT NULL,
    context_used JSONB,  -- citations and context chunks used
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_chat_message_session_id ON chat_message (session_id);
CREATE INDEX idx_chat_message_created_at ON chat_message (created_at);
```
API Endpoints
Chat Session Management
POST /api/chat/sessions - Create new chat session with selected files
GET /api/chat/sessions/{session_id} - Get session details
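A sketch of the logic `POST /api/chat/sessions` might perform before persisting a session. The field names and the 24-hour default TTL are assumptions, not the project's actual schema:

```python
import uuid
from datetime import datetime, timedelta, timezone


def create_session(user_id: int, file_ids: list[int], ttl_hours: int = 24) -> dict:
    """Build the session record that POST /api/chat/sessions would persist."""
    if not file_ids:
        raise ValueError("select at least one file before starting a chat")
    now = datetime.now(timezone.utc)
    return {
        "id": uuid.uuid4().hex,
        "user_id": user_id,
        "file_ids": sorted(set(file_ids)),  # de-duplicate the gallery selection
        "created_at": now,
        "expires_at": now + timedelta(hours=ttl_hours),  # matches expires_at column
    }
```

Rejecting an empty selection server-side mirrors the frontend rule that the chat button only appears when `selectedFiles.size > 0`.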
```python
CHAT_SYSTEM_PROMPT = """You are an AI assistant helping users analyze and understand their transcript content. You have access to transcript segments from the user's selected media files.

IMPORTANT GUIDELINES:
1. Base your responses on the provided transcript context
2. Always cite specific speakers, timestamps, and files when referencing information
3. If information isn't in the provided context, clearly state this
4. Provide specific quotes when relevant
5. Be conversational but accurate
6. Suggest follow-up questions when appropriate

CONTEXT FILES:
{file_context}

RELEVANT TRANSCRIPT SEGMENTS:
{transcript_context}

CONVERSATION HISTORY:
{conversation_history}

USER QUESTION: {user_query}

Provide a helpful, accurate response based on the transcript content above."""
```
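Filling the template's placeholders might look like the following; `render_prompt` and its argument shapes are illustrative, not the project's actual helper:

```python
def render_prompt(template: str, files: list[str], segments: list[str],
                  history: list[tuple[str, str]], query: str) -> str:
    """Substitute the four template placeholders with formatted context."""
    return template.format(
        file_context="\n".join(f"- {f}" for f in files),
        transcript_context="\n\n".join(segments),
        conversation_history="\n".join(f"{role}: {text}" for role, text in history),
        user_query=query,
    )
```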
Priority
High Priority - This feature transforms the application from a passive transcription tool into an interactive AI assistant, significantly increasing user engagement and value proposition. It leverages existing infrastructure while providing a modern, ChatGPT-like experience that users expect from AI applications.
Dependencies
Infrastructure (Already Available)
OpenSearch cluster for transcript search
WebSocket infrastructure for real-time communication
Current State Analysis
Existing Infrastructure
Multi-file selection (selectedFiles Set, checkboxes, batch operations)
WebSocket infrastructure (backend/app/api/websockets.py)
OpenSearch RAG Implementation
```json
{
  "query": {
    "bool": {
      "must": [
        {"terms": {"file_id": [123, 456, 789]}},
        {
          "multi_match": {
            "query": "user question about budget planning",
            "fields": ["text^2", "speaker_name", "summary"],
            "type": "best_fields",
            "fuzziness": "AUTO"
          }
        }
      ]
    }
  },
  "highlight": {
    "fields": {
      "text": {"fragment_size": 150, "number_of_fragments": 3}
    }
  },
  "_source": ["text", "speaker_name", "start_time", "end_time", "file_id", "filename"],
  "size": 10,
  "sort": [{"_score": {"order": "desc"}}]
}
```
Hybrid Search Strategy
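Hybrid search commonly fuses a keyword (BM25) ranking like the query above with a vector (k-NN) ranking. One standard fusion method is reciprocal rank fusion (RRF); the sketch below assumes two already-retrieved ID lists, and the constant `k = 60` is a conventional default, not a value specified in this issue:

```python
def rrf_merge(keyword_ids: list[str], vector_ids: list[str], k: int = 60) -> list[str]:
    """Merge two ranked result lists with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank + 1)) over the lists it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in (keyword_ids, vector_ids):
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```

RRF needs no score normalization between the two retrievers, which is why it is a popular choice when BM25 and vector similarity scores live on different scales.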
Technical Implementation
Phase 1: Core Chat Infrastructure
Chat Session Management
RAG Service Implementation
Chat API Endpoints
Phase 2: Frontend Chat Interface
ChatGPT-like UI Components
Gallery Integration
Phase 3: Advanced Features
Enhanced RAG Capabilities
Chat History (Optional)
Backend Architecture
1. Chat Session Service (backend/app/services/chat_service.py)
2. RAG Service (backend/app/services/rag_service.py)
3. Enhanced OpenSearch Service (backend/app/services/opensearch_chat_service.py)
4. WebSocket Chat Handler (backend/app/api/chat_websocket.py)
Frontend Architecture
1. Chat Interface (frontend/src/components/ChatInterface.svelte)
2. Chat Message Component (frontend/src/components/ChatMessage.svelte)
3. Context Panel (frontend/src/components/ChatContextPanel.svelte)
4. Enhanced Gallery Integration (frontend/src/routes/MediaLibrary.svelte)
Database Schema
Chat Session Management (Optional - for history)
API Endpoints
Chat Session Management
POST /api/chat/sessions - Create new chat session with selected files
GET /api/chat/sessions/{session_id} - Get session details
DELETE /api/chat/sessions/{session_id} - Delete session
GET /api/chat/sessions - List user's chat sessions (if history enabled)
Chat Interaction
POST /api/chat/sessions/{session_id}/messages - Send message (alternative to WebSocket)
GET /api/chat/sessions/{session_id}/messages - Get conversation history
WebSocket /ws/chat/{session_id} - Real-time chat communication
Context & Search
POST /api/chat/search - Search within session context
GET /api/chat/sessions/{session_id}/context - Get current context files
POST /api/chat/sessions/{session_id}/regenerate - Regenerate last response
RAG Implementation Details
Context Retrieval Strategy
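The issue doesn't fix a retrieval strategy; one plausible approach is to take relevance-ranked segments and pack as many as fit a rough token budget, then re-order the survivors chronologically so the LLM reads them in transcript order. All names and the budget value below are assumptions:

```python
def pack_context(segments: list[dict], budget_tokens: int = 3000) -> list[dict]:
    """Select ranked segments under a token budget.

    `segments` is assumed to be sorted by relevance score (best first),
    each with "text", "file_id", and "start_time" keys.
    """
    chosen, used = [], 0
    for seg in segments:
        cost = max(1, len(seg["text"].split()))  # crude token estimate
        if used + cost > budget_tokens:
            continue  # skip segments that would overflow the budget
        chosen.append(seg)
        used += cost
    # Re-order selected segments by (file_id, start_time) for readability.
    return sorted(chosen, key=lambda s: (s["file_id"], s["start_time"]))
```

Packing by relevance but presenting chronologically tends to give the model coherent context while still prioritizing the most on-topic segments.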
LLM Prompt Template
Citation Format
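The citation format is not pinned down in this issue. One plausible shape, consistent with the guideline to always cite file, timestamp, and speaker, is `[filename @ HH:MM:SS, Speaker]`; the helper below is a hypothetical sketch:

```python
def format_citation(filename: str, start_time: float, speaker: str) -> str:
    """Render a segment reference as "[filename @ HH:MM:SS, Speaker]"."""
    h, rem = divmod(int(start_time), 3600)
    m, s = divmod(rem, 60)
    return f"[{filename} @ {h:02d}:{m:02d}:{s:02d}, {speaker}]"
```

Keeping the raw `start_time` alongside the rendered string would let the frontend turn citations into jump-to-timestamp links.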
UI/UX Design
ChatGPT-like Interface Features
Message Threading
Interactive Elements
Real-time Features
Context Display
Modal vs Full-Page Design
Recommended: Modal Approach
Alternative: Full-Page
/chat/{session_id}
Configuration
Environment Variables
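No variable names are listed here; the names and defaults below are illustrative placeholders for what a chat configuration loader might read, not the project's actual settings:

```python
import os
from typing import Dict, Optional


def load_chat_config(env: Optional[Dict[str, str]] = None) -> dict:
    """Read chat settings from the environment (all names are hypothetical)."""
    e = env if env is not None else dict(os.environ)
    return {
        "chat_enabled": e.get("CHAT_ENABLED", "true").lower() == "true",
        "rag_top_k": int(e.get("CHAT_RAG_TOP_K", "10")),
        "session_ttl_hours": int(e.get("CHAT_SESSION_TTL_HOURS", "24")),
        "llm_model": e.get("CHAT_LLM_MODEL", "gpt-4o"),
    }
```

Accepting an explicit `env` dict keeps the loader unit-testable without mutating the process environment.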
Implementation Phases
Phase 1: Core Infrastructure (Sprint 1-2)
Acceptance Criteria:
Phase 2: Frontend Interface (Sprint 3)
Acceptance Criteria:
Phase 3: Enhanced RAG (Sprint 4)
Acceptance Criteria:
Phase 4: Advanced Features (Sprint 5)
Acceptance Criteria:
Testing Strategy
Unit Tests
Integration Tests
Performance Tests
User Experience Tests
Security Considerations
Session Security
Data Privacy
Input Validation
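Input validation for incoming chat messages might cover emptiness, length, and file ownership; the limit and function shape below are assumptions, not specified in this issue:

```python
MAX_MESSAGE_CHARS = 4000  # assumed limit


def validate_message(content: str, session_file_ids: set, owned_file_ids: set) -> str:
    """Reject empty/oversized messages and sessions referencing unowned files."""
    text = content.strip()
    if not text:
        raise ValueError("empty message")
    if len(text) > MAX_MESSAGE_CHARS:
        raise ValueError("message too long")
    if not session_file_ids <= owned_file_ids:
        raise ValueError("session references files the user does not own")
    return text
```

Checking ownership on every message (not only at session creation) guards against files being shared or revoked mid-session.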
Success Metrics
Functionality
User Experience
Adoption
Future Enhancements
Advanced AI Features
Collaboration Features
Analytics & Insights
Files to Create/Modify
New Backend Files
backend/app/services/chat_service.py
backend/app/services/rag_service.py
backend/app/services/opensearch_chat_service.py
backend/app/api/chat_websocket.py
backend/app/api/endpoints/chat.py
backend/app/schemas/chat.py
backend/app/models/chat.py (if history enabled)
New Frontend Files
frontend/src/components/ChatInterface.svelte
frontend/src/components/ChatMessage.svelte
frontend/src/components/ChatContextPanel.svelte
frontend/src/components/ChatModal.svelte
frontend/src/lib/types/chat.ts
frontend/src/lib/services/chatService.ts
frontend/src/stores/chat.ts
Modified Files
frontend/src/routes/MediaLibrary.svelte - Add chat button to selection controls
backend/app/api/router.py - Include chat endpoints
backend/app/core/config.py - Add chat configuration
backend/app/services/opensearch_service.py - Extend for RAG functionality
database/init_db.sql - Add chat tables (if history enabled)
External Services
Performance Requirements
Labels
enhancement, ai-integration, chat, rag, opensearch, high-priority, backend, frontend, websocket, user-experience
Reporter: Claude Code Assistant
Epic: AI-Powered Interactive Features
Component: Chat & RAG System
Estimated Effort: 5 sprints (25-30 story points)
Related Issues: #51 (LLM Integration), #29 (OpenSearch Enhancement)