
Feature: AI Interactive Chat with RAG System for Selected Transcripts #52

@davidamacey

Feature Summary

Implement an AI-powered interactive chat system that allows users to select multiple media files from the gallery view and start a conversational AI session with those transcripts as context. The system should use Retrieval Augmented Generation (RAG) with OpenSearch to provide accurate, context-aware responses about the selected transcript content, mimicking the ChatGPT interface experience.

Problem Statement

Users often want to ask questions about their transcripts, extract specific information, or analyze content across multiple recordings. Currently, they must manually read through entire transcripts to find relevant information. An interactive AI chat system would allow users to:

  • Ask questions about specific topics across multiple transcripts
  • Get summaries of discussions on particular subjects
  • Find action items, decisions, or key points mentioned by specific speakers
  • Analyze trends and patterns across multiple meetings/recordings
  • Extract insights without manually searching through hours of content

Current State Analysis

Existing Infrastructure

  • Selection System: MediaLibrary.svelte already has complete multi-select functionality (selectedFiles Set, checkboxes, batch operations)
  • OpenSearch: Full-text search infrastructure for transcripts exists
  • WebSocket: Real-time communication infrastructure (backend/app/api/websockets.py)
  • LLM Integration: Will leverage the same multi-provider LLM system proposed in Issue #51 (Feature: Implement LLM-based Transcript Summarization with Multi-Provider Support and OpenSearch Integration)
  • Chat Interface: No existing chat UI components
  • RAG System: No retrieval augmented generation implementation
  • Chat Session Management: No backend chat session handling

Selection Infrastructure (Already Available)

// From MediaLibrary.svelte
let selectedFiles = new Set<number>();
// Complete multi-select with UI controls already implemented

Proposed Solution

User Experience Flow

  1. Gallery Selection: User selects one or more media files using existing checkbox system
  2. Chat Initiation: New "Start AI Chat" button appears when files are selected
  3. Chat Session: Modal or full-page chat interface opens with ChatGPT-like experience
  4. Context Loading: System loads selected transcripts as RAG context
  5. Interactive Chat: User asks questions, AI responds with context-aware answers
  6. Reference Links: Responses include links to specific transcript segments/timestamps

Technical Architecture

RAG System with OpenSearch

Following industry best practices for enterprise RAG implementation:

graph TD
    A[User Query] --> B[Query Processing]
    B --> C[OpenSearch Retrieval]
    C --> D[Context Ranking]
    D --> E[LLM + Context]
    E --> F[Response Generation]
    F --> G[Response with Citations]

OpenSearch RAG Implementation

{
  "query": {
    "bool": {
      "must": [
        {"terms": {"file_id": [123, 456, 789]}},
        {
          "multi_match": {
            "query": "user question about budget planning",
            "fields": ["text^2", "speaker_name", "summary"],
            "type": "best_fields",
            "fuzziness": "AUTO"
          }
        }
      ]
    }
  },
  "highlight": {
    "fields": {
      "text": {"fragment_size": 150, "number_of_fragments": 3}
    }
  },
  "_source": ["text", "speaker_name", "start_time", "end_time", "file_id", "filename"],
  "size": 10,
  "sort": [{"_score": {"order": "desc"}}]
}
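In Python, the query body above could be assembled by a small helper. The field names (`text`, `speaker_name`, `summary`, etc.) assume the transcript index schema shown in the example; this is a sketch, not the final service code.

```python
from typing import Any, Dict, List

def build_rag_query(question: str, file_ids: List[int], size: int = 10) -> Dict[str, Any]:
    """Build the OpenSearch query body shown above: restrict results to the
    selected files, then rank transcript segments against the user's question."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"terms": {"file_id": file_ids}},
                    {
                        "multi_match": {
                            "query": question,
                            "fields": ["text^2", "speaker_name", "summary"],
                            "type": "best_fields",
                            "fuzziness": "AUTO",
                        }
                    },
                ]
            }
        },
        "highlight": {
            "fields": {"text": {"fragment_size": 150, "number_of_fragments": 3}}
        },
        "_source": ["text", "speaker_name", "start_time", "end_time", "file_id", "filename"],
        "size": size,
        "sort": [{"_score": {"order": "desc"}}],
    }
```

The returned dict can then be passed as the `body` argument of an `opensearch-py` `client.search(...)` call against the transcript index.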

Hybrid Search Strategy

  1. Semantic Search: Vector embeddings for conceptual matches
  2. Keyword Search: BM25 for exact term matching
  3. Contextual Filtering: File-specific and speaker-specific results
  4. Temporal Awareness: Time-based context understanding
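One common way to merge the keyword (BM25) and semantic result lists from steps 1-2 is reciprocal rank fusion (RRF). The sketch below illustrates the idea; the final service may choose a different scoring scheme.

```python
from collections import defaultdict
from typing import Dict, List

def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of segment IDs into one combined ranking.
    Each list contributes 1 / (k + rank) per item; higher total score wins."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, segment_id in enumerate(ranking, start=1):
            scores[segment_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

A segment that appears near the top of both the keyword and the semantic list outranks one that scores highly in only a single list, which is exactly the behavior a hybrid search strategy wants.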

Technical Implementation

Phase 1: Core Chat Infrastructure

  1. Chat Session Management

    • Session creation with selected file context
    • Conversation state management
    • Session cleanup and timeout handling
  2. RAG Service Implementation

    • OpenSearch retrieval with context ranking
    • Chunk management and relevance scoring
    • Citation tracking for response attribution
  3. Chat API Endpoints

    • Session creation and management
    • Message processing with RAG
    • Real-time streaming responses
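The "real-time streaming responses" item can be sketched as an async generator that relays model output in small pieces; the fixed-size chunking below is a stand-in for a real LLM token stream, and the `websocket.send_text` call mentioned in the comment is where delivery would happen in practice.

```python
import asyncio
from typing import AsyncIterator

async def stream_chunks(response_text: str, chunk_size: int = 8) -> AsyncIterator[str]:
    """Yield a response in small pieces, simulating token-by-token streaming."""
    for i in range(0, len(response_text), chunk_size):
        yield response_text[i:i + chunk_size]
        await asyncio.sleep(0)  # yield control so each piece can be sent promptly

async def demo() -> str:
    received = []
    async for chunk in stream_chunks("The budget was approved in the Q3 meeting."):
        received.append(chunk)  # in practice: await websocket.send_text(chunk)
    return "".join(received)
```

Because the chunks concatenate back to the full response, the frontend can render them incrementally and still end up with the exact message to store in the conversation history.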

Phase 2: Frontend Chat Interface

  1. ChatGPT-like UI Components

    • Message thread display
    • Real-time typing indicators
    • Copy/paste functionality
    • Message actions (copy, regenerate)
  2. Gallery Integration

    • Enhanced selection controls
    • Chat initiation button
    • Context file display

Phase 3: Advanced Features

  1. Enhanced RAG Capabilities

    • Multi-turn conversation context
    • Cross-reference detection
    • Temporal query understanding
  2. Chat History (Optional)

    • Session persistence
    • Chat search and retrieval
    • Export functionality

Backend Architecture

1. Chat Session Service (backend/app/services/chat_service.py)

class ChatSession:
    def __init__(self, session_id: str, user_id: int, file_ids: List[int]):
        self.session_id = session_id
        self.user_id = user_id
        self.file_ids = file_ids
        self.conversation_history = []
        self.context_cache = {}
        
    async def process_message(self, message: str) -> ChatResponse:
        # 1. Retrieve relevant context from OpenSearch
        # 2. Combine with conversation history
        # 3. Generate LLM response
        # 4. Return with citations
        ...

2. RAG Service (backend/app/services/rag_service.py)

class RAGService:
    async def retrieve_context(self, query: str, file_ids: List[int], limit: int = 10) -> List[ContextChunk]:
        # Hybrid search: semantic + keyword
        # Relevance scoring and ranking
        # Context window management
        ...

    async def generate_response(self, query: str, context: List[ContextChunk], history: List[ChatMessage]) -> str:
        # LLM integration with context injection
        # Citation generation
        # Response streaming
        ...

3. Enhanced OpenSearch Service (backend/app/services/opensearch_chat_service.py)

class OpenSearchChatService:
    async def hybrid_search(self, query: str, file_ids: List[int]) -> SearchResults:
        # Combine semantic and keyword search
        # File-specific filtering
        # Relevance scoring
        ...

    async def get_context_window(self, segment_id: str, window_size: int = 3) -> List[Segment]:
        # Retrieve surrounding context for better understanding
        # Speaker continuity
        # Temporal context
        ...
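The windowing in `get_context_window` can be sketched as a pure function over an ordered segment list; the real implementation would fetch the neighboring segments from OpenSearch by their position in the transcript.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

def context_window(segments: Sequence[T], index: int, window_size: int = 3) -> List[T]:
    """Return the segment at `index` plus up to `window_size` neighbors
    on each side, clamped to the transcript boundaries."""
    start = max(0, index - window_size)
    end = min(len(segments), index + window_size + 1)
    return list(segments[start:end])
```

Clamping at the boundaries matters: a hit in the first or last segment of a transcript should still return a valid, smaller window rather than raising an index error.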

4. WebSocket Chat Handler (backend/app/api/chat_websocket.py)

@router.websocket("/ws/chat/{session_id}")
async def chat_websocket(websocket: WebSocket, session_id: str, current_user: User):
    # Real-time chat communication
    # Streaming response delivery
    # Connection management
    ...
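A minimal shape for the handler's message loop, written against any object that provides `accept`, `receive_text`, and `send_json` (FastAPI's `WebSocket` does; so does a test stub). `session.stream_reply` is a hypothetical name for the RAG pipeline's streaming entry point, not an existing method.

```python
from typing import Optional

async def run_chat_loop(websocket, session, max_turns: Optional[int] = None) -> None:
    """Accept the connection, then relay each user message through the
    RAG session and stream the reply back chunk by chunk."""
    await websocket.accept()
    turns = 0
    while max_turns is None or turns < max_turns:
        message = await websocket.receive_text()
        async for chunk in session.stream_reply(message):
            await websocket.send_json({"type": "chunk", "content": chunk})
        await websocket.send_json({"type": "done"})
        turns += 1
```

The `max_turns` parameter exists only to make the loop testable; the production handler would instead run until the client disconnects or the session times out.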

Frontend Architecture

1. Chat Interface (frontend/src/components/ChatInterface.svelte)

<script lang="ts">
  export let sessionId: string;
  export let selectedFiles: number[];
  
  let messages: ChatMessage[] = [];
  let currentMessage = "";
  let isLoading = false;
  let wsConnection: WebSocket;
  
  // ChatGPT-like interface
  // Real-time message streaming
  // Copy/paste functionality
  // Message actions
</script>

2. Chat Message Component (frontend/src/components/ChatMessage.svelte)

<script lang="ts">
  export let message: ChatMessage;
  export let showCitations: boolean = true;
  
  // Message rendering with markdown
  // Citation links to transcript segments
  // Copy functionality
  // Regenerate option
</script>

3. Context Panel (frontend/src/components/ChatContextPanel.svelte)

<script lang="ts">
  export let selectedFiles: MediaFile[];
  export let currentContext: ContextChunk[];
  
  // Display selected files as context
  // Show current relevant segments
  // Jump to transcript functionality
</script>

4. Enhanced Gallery Integration (frontend/src/routes/MediaLibrary.svelte)

<!-- Add to existing selection controls -->
<div class="selection-controls">
  <!-- Existing buttons -->
  {#if selectedFiles.size > 0}
    <button 
      class="chat-btn"
      on:click={startChatSession}
      title="Start AI chat with {selectedFiles.size} selected file{selectedFiles.size === 1 ? '' : 's'}"
    >
      <ChatIcon />
      Start AI Chat ({selectedFiles.size})
    </button>
  {/if}
</div>

Database Schema

Chat Session Management (Optional - for history)

-- Optional: For persistent chat history (PostgreSQL syntax)
CREATE TABLE chat_session (
    id VARCHAR(255) PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES "user"(id) ON DELETE CASCADE,
    title VARCHAR(255),
    file_ids INTEGER[] NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
    expires_at TIMESTAMP WITH TIME ZONE
);

CREATE INDEX idx_chat_session_user_id ON chat_session(user_id);
CREATE INDEX idx_chat_session_created_at ON chat_session(created_at);

CREATE TABLE chat_message (
    id SERIAL PRIMARY KEY,
    session_id VARCHAR(255) NOT NULL REFERENCES chat_session(id) ON DELETE CASCADE,
    role VARCHAR(20) NOT NULL, -- 'user' or 'assistant'
    content TEXT NOT NULL,
    context_used JSONB, -- Citations and context chunks used
    created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP
);

CREATE INDEX idx_chat_message_session_id ON chat_message(session_id);
CREATE INDEX idx_chat_message_created_at ON chat_message(created_at);

API Endpoints

Chat Session Management

  • POST /api/chat/sessions - Create new chat session with selected files
  • GET /api/chat/sessions/{session_id} - Get session details
  • DELETE /api/chat/sessions/{session_id} - Delete session
  • GET /api/chat/sessions - List user's chat sessions (if history enabled)

Chat Interaction

  • POST /api/chat/sessions/{session_id}/messages - Send message (alternative to WebSocket)
  • GET /api/chat/sessions/{session_id}/messages - Get conversation history
  • WebSocket /ws/chat/{session_id} - Real-time chat communication

Context & Search

  • POST /api/chat/search - Search within session context
  • GET /api/chat/sessions/{session_id}/context - Get current context files
  • POST /api/chat/sessions/{session_id}/regenerate - Regenerate last response

RAG Implementation Details

Context Retrieval Strategy

async def retrieve_context(self, query: str, file_ids: List[int]) -> List[ContextChunk]:
    # 1. Keyword search (BM25)
    keyword_results = await self.opensearch.search(
        query=query,
        file_ids=file_ids,
        search_type="keyword"
    )

    # 2. Semantic search (if embeddings available)
    semantic_results = []
    if self.embeddings_enabled:
        semantic_results = await self.opensearch.vector_search(
            query_embedding=await self.embed_query(query),
            file_ids=file_ids
        )

    # 3. Combine and rank results
    combined_results = self.rank_results(keyword_results, semantic_results)

    # 4. Add a contextual window around each hit
    enriched_results = []
    for result in combined_results[:10]:
        context_window = await self.get_surrounding_context(result.segment_id)
        enriched_results.append(ContextChunk(
            text=result.text,
            speaker=result.speaker,
            timestamp=result.timestamp,
            file_id=result.file_id,
            context_window=context_window,
            relevance_score=result.score
        ))

    return enriched_results

LLM Prompt Template

CHAT_SYSTEM_PROMPT = """
You are an AI assistant helping users analyze and understand their transcript content. 
You have access to transcript segments from the user's selected media files.

IMPORTANT GUIDELINES:
1. Base your responses on the provided transcript context
2. Always cite specific speakers, timestamps, and files when referencing information
3. If information isn't in the provided context, clearly state this
4. Provide specific quotes when relevant
5. Be conversational but accurate
6. Suggest follow-up questions when appropriate

CONTEXT FILES:
{file_context}

RELEVANT TRANSCRIPT SEGMENTS:
{transcript_context}

CONVERSATION HISTORY:
{conversation_history}

USER QUESTION: {user_query}

Provide a helpful, accurate response based on the transcript content above.
"""

Citation Format

interface ChatResponse {
  content: string;
  citations: Citation[];
  suggestions: string[];
}

interface Citation {
  file_id: number;
  filename: string;
  speaker: string;
  timestamp: string;
  text: string;
  relevance_score: number;
}
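A `Citation` can be turned into a deep link that jumps to the cited moment in the player. The URL scheme below (`/files/{file_id}?t=<seconds>`) is an assumption, not an existing route; the sketch is in Python for consistency with the backend examples.

```python
def timestamp_to_seconds(timestamp: str) -> int:
    """Convert "HH:MM:SS" (or "MM:SS") into whole seconds."""
    seconds = 0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds

def citation_link(citation: dict) -> str:
    """Markdown link that jumps to the cited moment in the source file.
    NOTE: the /files/{id}?t= route is hypothetical."""
    t = timestamp_to_seconds(citation["timestamp"])
    return (
        f"[{citation['filename']} @ {citation['timestamp']}]"
        f"(/files/{citation['file_id']}?t={t})"
    )
```

Rendering citations as markdown keeps them compatible with the chat message renderer, which already displays AI responses as markdown.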

UI/UX Design

ChatGPT-like Interface Features

  1. Message Threading

    • User messages aligned right
    • AI responses aligned left
    • Timestamps and status indicators
  2. Interactive Elements

    • Copy message button
    • Regenerate response option
    • Citation links to transcript segments
    • Suggested follow-up questions
  3. Real-time Features

    • Typing indicators during AI processing
    • Streaming response display
    • Connection status indicators
  4. Context Display

    • Selected files sidebar
    • Current context highlights
    • Quick jump to transcript segments

Modal vs Full-Page Design

Recommended: Modal Approach

  • Overlay on gallery view
  • Maintain context of selected files
  • Easy to close and return to selection
  • Better for quick questions

Alternative: Full-Page

  • Dedicated chat route /chat/{session_id}
  • More space for complex conversations
  • Better for extended analysis sessions

Configuration

Environment Variables

# Chat Configuration
CHAT_ENABLED=true
CHAT_SESSION_TIMEOUT=3600  # 1 hour
CHAT_MAX_CONTEXT_LENGTH=8000
CHAT_MAX_HISTORY_MESSAGES=20

# RAG Configuration
RAG_CHUNK_SIZE=500
RAG_CHUNK_OVERLAP=50
RAG_MAX_CHUNKS=10
RAG_SIMILARITY_THRESHOLD=0.7

# LLM Configuration (from Issue #51)
LLM_PROVIDER=openai
LLM_MODEL=gpt-3.5-turbo
LLM_MAX_TOKENS=2000
LLM_TEMPERATURE=0.3

# WebSocket Configuration
WS_CHAT_MAX_CONNECTIONS=100
WS_CHAT_PING_INTERVAL=30
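The `RAG_CHUNK_SIZE` / `RAG_CHUNK_OVERLAP` settings map naturally onto a sliding-window splitter. Below is a character-based sketch; a production version would more likely split on sentence or segment boundaries.

```python
from typing import List

def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into chunks of up to chunk_size characters, with
    `overlap` characters shared between consecutive chunks so that a
    sentence cut at a boundary still appears intact in one chunk."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be greater than overlap")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]
```

With the defaults above, a 1,000-character transcript yields three chunks: two full 500-character windows and a 100-character tail, each consecutive pair sharing 50 characters.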

Implementation Phases

Phase 1: Core Infrastructure (Sprint 1-2)

  • Chat session management service
  • Basic RAG service with OpenSearch integration
  • WebSocket chat handler
  • API endpoints for session management
  • Basic message processing

Acceptance Criteria:

  • Can create chat sessions with selected files
  • Basic question-answering works with transcript context
  • WebSocket communication functional
  • Context retrieval from OpenSearch accurate

Phase 2: Frontend Interface (Sprint 3)

  • ChatGPT-like UI components
  • Gallery integration with chat button
  • Real-time message display
  • Citation links to transcript segments
  • Copy/paste functionality

Acceptance Criteria:

  • Chat interface matches ChatGPT user experience
  • Messages stream in real-time
  • Citations link correctly to transcript segments
  • Copy functionality works reliably
  • UI responsive on mobile and desktop

Phase 3: Enhanced RAG (Sprint 4)

  • Hybrid search implementation
  • Context window optimization
  • Multi-turn conversation handling
  • Response quality improvements
  • Performance optimization

Acceptance Criteria:

  • Responses relevant and accurate
  • Conversation context maintained across turns
  • Search performance under 2 seconds
  • High-quality context retrieval

Phase 4: Advanced Features (Sprint 5)

  • Chat history persistence (optional)
  • Advanced search within chat
  • Export chat functionality
  • Analytics and usage tracking
  • Mobile optimization

Acceptance Criteria:

  • Chat history accessible across sessions
  • Export works in multiple formats
  • Mobile experience optimized
  • Usage analytics available

Testing Strategy

Unit Tests

  • RAG service context retrieval accuracy
  • Chat session management
  • Message processing and response generation
  • Citation generation and linking
  • WebSocket connection handling

Integration Tests

  • End-to-end chat workflow
  • OpenSearch + LLM integration
  • Frontend + Backend communication
  • Multi-file context handling
  • Real-time message streaming

Performance Tests

  • Context retrieval speed
  • Concurrent chat sessions
  • Large transcript handling
  • WebSocket connection limits
  • Memory usage optimization

User Experience Tests

  • Chat interface usability
  • Response quality assessment
  • Citation accuracy verification
  • Mobile responsiveness
  • Accessibility compliance

Security Considerations

  1. Session Security

    • Session-based authentication
    • File access verification per user
    • Rate limiting on chat requests
    • WebSocket connection limits
  2. Data Privacy

    • Context data handling
    • LLM provider data policies
    • Local vs cloud processing options
    • User consent for AI processing
  3. Input Validation

    • Message content sanitization
    • File ID validation
    • Session ownership verification
    • XSS prevention in chat interface

Success Metrics

  1. Functionality

    • 95%+ successful chat sessions
    • <3 second average response time
    • 90%+ citation accuracy
    • Support for 10+ concurrent users
  2. User Experience

    • <2 second chat interface load time
    • 95%+ message delivery success rate
    • Positive user feedback on response quality
    • High task completion rates
  3. Adoption

    • 60%+ of users try chat feature
    • 40%+ use chat regularly
    • Average 5+ messages per session
    • 80%+ user satisfaction rating

Future Enhancements

  1. Advanced AI Features

    • Multi-modal chat (text + audio)
    • Voice-to-text chat input
    • Automated question suggestions
    • Sentiment-aware responses
  2. Collaboration Features

    • Shared chat sessions
    • Team knowledge base
    • Chat templates for common queries
    • Integration with business tools
  3. Analytics & Insights

    • Chat usage analytics
    • Popular query patterns
    • Response quality metrics
    • Content gap identification

Files to Create/Modify

New Backend Files

  • backend/app/services/chat_service.py
  • backend/app/services/rag_service.py
  • backend/app/services/opensearch_chat_service.py
  • backend/app/api/chat_websocket.py
  • backend/app/api/endpoints/chat.py
  • backend/app/schemas/chat.py
  • backend/app/models/chat.py (if history enabled)

New Frontend Files

  • frontend/src/components/ChatInterface.svelte
  • frontend/src/components/ChatMessage.svelte
  • frontend/src/components/ChatContextPanel.svelte
  • frontend/src/components/ChatModal.svelte
  • frontend/src/lib/types/chat.ts
  • frontend/src/lib/services/chatService.ts
  • frontend/src/stores/chat.ts

Modified Files

  • frontend/src/routes/MediaLibrary.svelte - Add chat button to selection controls
  • backend/app/api/router.py - Include chat endpoints
  • backend/app/core/config.py - Add chat configuration
  • backend/app/services/opensearch_service.py - Extend for RAG functionality
  • database/init_db.sql - Add chat tables (if history enabled)

Priority

High Priority - This feature transforms the application from a passive transcription tool into an interactive AI assistant, significantly increasing user engagement and value proposition. It leverages existing infrastructure while providing a modern, ChatGPT-like experience that users expect from AI applications.

Dependencies

  1. Infrastructure (Already Available)

  2. External Services

    • LLM provider APIs (OpenAI, vLLM, Ollama, Claude)
    • Optional: Embedding service for semantic search
  3. Performance Requirements

    • Sufficient server resources for concurrent chat sessions
    • OpenSearch cluster capacity for real-time search
    • WebSocket connection handling

Labels

enhancement, ai-integration, chat, rag, opensearch, high-priority, backend, frontend, websocket, user-experience


Reporter: Claude Code Assistant
Epic: AI-Powered Interactive Features
Component: Chat & RAG System
Estimated Effort: 5 sprints (25-30 story points)
Related Issues: #51 (LLM Integration), #29 (OpenSearch Enhancement)
