Issue Description
The current LLM chat implementation sends all DOM content with each chat request, which can be inefficient. Currently, ChatContext.collectCurrentResources() extracts ALL content from the DOM and packages it with every message, including the full chapter (up to 16 verses) plus all translation resources.
Problem Details
- Current Implementation: Extracts everything from the DOM using selectors in collectCurrentResources()
- Data Volume: Sends scripture (full chapter), translation notes, questions, words, and TWL with every request
- Inefficiency: Much of this data may not be relevant to the user's specific question
- Scaling Issue: As more resources are added, the context size grows unnecessarily
Proposed Solution
Implement a resource query service (similar to MCP server concept) that allows the AI to request only relevant information instead of receiving all resources upfront.
Implementation Options
1. In-App Resource Query Service (Recommended for initial implementation)
   - Create internal service with methods like:
     - queryScriptureVerse(book, chapter, verse)
     - queryTranslationNotes(reference, keyword)
     - queryTranslationWords(term)
     - searchResources(query)
   - Modify chat to send minimal context
   - Let backend determine what resources to fetch
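The in-app service could be sketched roughly as follows. This is a minimal illustration, not the actual `src-new` code: the in-memory data maps and sample verse/note content are placeholders, and only two of the proposed methods are shown.

```javascript
// Hypothetical in-app resource query service (sketch only).
// In the real app these lookups would read from the already-loaded
// resource data rather than hardcoded maps.
const scripture = {
  'JON 1:1': 'Now the word of Yahweh came to Jonah son of Amittai, saying,',
};

const translationNotes = [
  { reference: 'JON 1:1', keyword: 'Yahweh', note: 'This is the name of God.' },
];

// Return a single verse instead of the whole chapter.
function queryScriptureVerse(book, chapter, verse) {
  return scripture[`${book} ${chapter}:${verse}`] ?? null;
}

// Return only the notes matching a reference (and optional keyword).
function queryTranslationNotes(reference, keyword) {
  return translationNotes.filter(
    (n) => n.reference === reference && (!keyword || n.keyword === keyword)
  );
}
```

The chat backend would call these on demand instead of receiving the full resource bundle with every message.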
2. Local MCP Server (For development/Electron)
   - Create true MCP server with resource query tools
   - Only works in development or Electron environments
3. Backend MCP Integration
   - Move resource queries to Netlify functions
   - Reduce client-server data transfer
   - True server-side MCP implementation
4. Web Worker Resource Server
   - Use a Web Worker as a pseudo-MCP server
   - Parallel processing without blocking the UI
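A Web Worker variant could follow a small request/response message protocol. This sketch assumes a made-up message shape (`{ id, method, args }`); the pure handler is kept separate from the worker wiring so the logic is testable outside a worker context.

```javascript
// Sketch of a Web Worker acting as a pseudo-MCP resource server.
// WORDS is placeholder data; the method name mirrors the Option 1 service.
const WORDS = {
  covenant: 'A formal, binding agreement between two parties.',
};

// Pure dispatch: takes a request message, returns a response message.
function handleResourceRequest({ id, method, args }) {
  if (method === 'queryTranslationWords') {
    return { id, result: WORDS[args.term] ?? null };
  }
  return { id, error: `unknown method: ${method}` };
}

// Inside an actual worker script, wire the handler to the message port.
// (Guarded so this file can also be loaded outside a worker.)
if (typeof self !== 'undefined' && typeof self.postMessage === 'function') {
  self.onmessage = (e) => self.postMessage(handleResourceRequest(e.data));
}
```

The main thread would `postMessage` a request with a unique `id` and resolve the matching pending promise when the worker replies, keeping resource lookups off the UI thread.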
Acceptance Criteria
- Resource query service implemented (choose approach)
- Chat context reduced to minimal required data
- AI can request specific resources as needed
- Performance improvement measured and documented
- Backward compatibility maintained
- Tests updated for new query patterns
- Documentation updated with new architecture
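One way to satisfy "AI can request specific resources as needed" is to expose the query service to the model as OpenAI-style tool definitions. The tool below mirrors the hypothetical queryScriptureVerse method from Option 1; the descriptions and parameter names are illustrative.

```javascript
// OpenAI chat-completions tool definitions (sketch): the model calls these
// instead of receiving all resources upfront.
const tools = [
  {
    type: 'function',
    function: {
      name: 'queryScriptureVerse',
      description: 'Fetch a single verse instead of the whole chapter.',
      parameters: {
        type: 'object',
        properties: {
          book: { type: 'string', description: 'Book code, e.g. JON' },
          chapter: { type: 'integer' },
          verse: { type: 'integer' },
        },
        required: ['book', 'chapter', 'verse'],
      },
    },
  },
];
```

The chat backend would pass `tools` with each completion request, execute any tool calls the model emits against the resource query service, and feed the results back as tool messages.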
Benefits
- Performance: Reduced data transfer and processing
- Scalability: Can add more resources without impacting every request
- Flexibility: AI can be smarter about what context it needs
- Cost Reduction: Smaller OpenAI API requests = lower costs
Technical Notes
- Current implementation in src-new/context/ChatContext.jsx
- Service layer in src-new/services/llmChatService.js
- Consider starting with Option 1 and evolving to MCP later
- May require changes to chat backend logic
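For comparison, the before/after context shapes might look like this. Both objects are sketches: the field names (`activePanel`, `reference`, etc.) are hypothetical, not the actual payload of collectCurrentResources().

```javascript
// Before (sketch): everything extracted from the DOM goes with every message.
const fullContext = {
  scripture: '/* entire chapter, up to 16 verses */',
  translationNotes: '/* all notes for the chapter */',
  translationQuestions: '/* all questions */',
  translationWords: '/* all word articles */',
  twl: '/* all word links */',
};

// After (sketch): only the current position, so the backend fetches on demand.
const minimalContext = {
  reference: { book: 'JON', chapter: 1, verse: 1 },
  activePanel: 'translationNotes', // hypothetical hint about what the user is viewing
};
```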
Priority
Medium - This optimization should be tackled when it becomes the biggest performance/cost issue. Currently the system works well, but this will become more important as usage scales.