
Optimize LLM Chat Context: Create Resource Query Service to Replace Full DOM Extraction #75

@klappy


Issue Description

The current LLM chat implementation sends all DOM content with every chat request, which is inefficient. ChatContext.collectCurrentResources() extracts ALL content from the DOM and packages it with each message: the full chapter (up to 16 verses) plus every loaded translation resource.

Problem Details

  • Current Implementation: Extracts everything from DOM using selectors in collectCurrentResources()
  • Data Volume: Sends scripture (full chapter), translation notes, questions, words, and TWL with every request
  • Inefficiency: Much of this data may not be relevant to the user's specific question
  • Scaling Issue: As more resources are added, the context size grows unnecessarily
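To make the data-volume concern concrete, here is a rough sketch comparing the shape of the current full-context payload against a minimal one. The contents and field names are entirely illustrative, not taken from the app:

```javascript
// Illustrative payload shapes only -- the verse and note text is invented.
// The point: the full-context size grows with every added resource,
// while the minimal context stays roughly constant.
const fullContext = {
  scripture: Array.from({ length: 16 }, (_, i) => `Verse ${i + 1} text ...`),
  translationNotes: Array.from({ length: 20 }, (_, i) => `Note ${i + 1} ...`),
  translationQuestions: Array.from({ length: 10 }, (_, i) => `Question ${i + 1} ...`),
  translationWords: Array.from({ length: 12 }, (_, i) => `Word article ${i + 1} ...`),
};

const minimalContext = {
  reference: "JHN 3:16", // where the user currently is
  question: "What does this verse mean by 'world'?",
};

const sizeInBytes = (obj) => JSON.stringify(obj).length;
console.log(sizeInBytes(fullContext), sizeInBytes(minimalContext));
```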

Proposed Solution

Implement a resource query service (similar in spirit to an MCP server) that lets the AI request only the information relevant to the user's question, instead of receiving all resources upfront.
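The flow this implies is a function-calling loop: the chat sends minimal context, the model responds with tool calls naming the resource it needs, and the app answers each call. A minimal sketch, assuming an OpenAI-style tool-call shape (`name` plus `arguments`); the lookup bodies are stubs, not the real resource code:

```javascript
// Stub lookups keyed by tool name. In the real app these would delegate
// to the resource query service instead of formatting strings.
const tools = {
  queryScriptureVerse: ({ book, chapter, verse }) =>
    `Text of ${book} ${chapter}:${verse}`, // stub lookup
  queryTranslationWords: ({ term }) => `Article for "${term}"`, // stub lookup
};

// Dispatch one model tool call and package the result as a tool message.
function handleToolCall(call) {
  const tool = tools[call.name];
  if (!tool) throw new Error(`Unknown tool: ${call.name}`);
  return { role: "tool", name: call.name, content: tool(call.arguments) };
}

const reply = handleToolCall({
  name: "queryScriptureVerse",
  arguments: { book: "JHN", chapter: 3, verse: 16 },
});
console.log(reply.content); // "Text of JHN 3:16"
```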

Implementation Options

  1. In-App Resource Query Service (Recommended for initial implementation)

    • Create internal service with methods like:
      • queryScriptureVerse(book, chapter, verse)
      • queryTranslationNotes(reference, keyword)
      • queryTranslationWords(term)
      • searchResources(query)
    • Modify chat to send minimal context
  • Let the backend determine which resources to fetch
  2. Local MCP Server (For development/Electron)

    • Create true MCP server with resource query tools
    • Only works in development or Electron environments
  3. Backend MCP Integration

    • Move resource queries to Netlify functions
    • Reduce client-server data transfer
    • True server-side MCP implementation
  4. Web Worker Resource Server

    • Use Web Worker as pseudo-MCP server
    • Parallel processing without blocking UI
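As a concrete starting point for Option 1, the query methods listed above could sit behind a single service object. A sketch only: the store shape and data below are invented stand-ins, and the real service would read from the app's loaded resources rather than an in-memory object:

```javascript
// In-memory stand-in for the app's loaded resources (data is invented).
const store = {
  scripture: { JHN: { 3: { 16: "For God so loved the world..." } } },
  translationNotes: [
    { reference: "JHN 3:16", keyword: "world", note: "Here 'world' means..." },
  ],
  translationWords: { world: "In Scripture, 'world' can refer to..." },
};

// The four query methods from the issue, implemented over the store above.
const resourceQueryService = {
  queryScriptureVerse: (book, chapter, verse) =>
    store.scripture[book]?.[chapter]?.[verse] ?? null,
  queryTranslationNotes: (reference, keyword) =>
    store.translationNotes.filter(
      (n) => n.reference === reference && (!keyword || n.keyword === keyword),
    ),
  queryTranslationWords: (term) => store.translationWords[term] ?? null,
  searchResources: (query) =>
    store.translationNotes.filter((n) => n.note.includes(query)),
};

console.log(resourceQueryService.queryScriptureVerse("JHN", 3, 16));
```

Keeping the methods on one object makes it easy to hand the whole service to whichever transport wins later (in-app, Web Worker, or MCP).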

Acceptance Criteria

  • Resource query service implemented (choose approach)
  • Chat context reduced to minimal required data
  • AI can request specific resources as needed
  • Performance improvement measured and documented
  • Backward compatibility maintained
  • Tests updated for new query patterns
  • Documentation updated with new architecture

Benefits

  • Performance: Reduced data transfer and processing
  • Scalability: Can add more resources without impacting every request
  • Flexibility: AI can be smarter about what context it needs
  • Cost Reduction: Smaller OpenAI API requests = lower costs

Technical Notes

  • Current implementation in src-new/context/ChatContext.jsx
  • Service layer in src-new/services/llmChatService.js
  • Consider starting with Option 1 and evolving to MCP later
  • May require changes to chat backend logic
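One way to honor the "start with Option 1 and evolve to MCP later" note is to keep the chat code against a narrow provider interface, so the in-app implementation can later be swapped for an MCP-backed one without touching callers. Names here are illustrative, and the sketch is synchronous for simplicity where a real provider would be async:

```javascript
// In-app provider: answers queries from local data (Option 1).
class InAppResourceProvider {
  constructor(words) {
    this.words = words; // term -> article text
  }
  queryTranslationWords(term) {
    return this.words[term] ?? null;
  }
}

// A future McpResourceProvider could expose the same method backed by an
// MCP server; the chat-side code below would not need to change.
function lookupForChat(provider, term) {
  const article = provider.queryTranslationWords(term);
  return article ? `Context: ${article}` : "No article found";
}

const provider = new InAppResourceProvider({ grace: "Unmerited favor." });
console.log(lookupForChat(provider, "grace")); // "Context: Unmerited favor."
```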

Priority

Medium - This optimization should be tackled when it becomes the biggest performance/cost issue. Currently the system works well, but this will become more important as usage scales.
