This document provides detailed operational information about the backend's major subsystems. Use it for troubleshooting, understanding data flows, and extending functionality.
The backend follows a layered architecture:
```
Routers (HTTP) → Services (business logic) → Repository (data access)
                                  ↓
                  MCP Tools (external integrations)
                                  ↓
                  OpenRouter API (LLM provider)
```
Core components:
- `ChatOrchestrator`: Coordinates tool selection, LLM calls, and response streaming
- `OpenRouterClient`: HTTP client for the OpenRouter API with streaming support
- `ChatRepository`: SQLite data layer for conversations and attachments
- `MCPToolAggregator`: Manages the lifecycle of MCP servers and tool discovery
- Service layer: `AttachmentService`, `ModelSettingsService`, `PresetService`, etc.
- Services: `backend.services.model_settings.ModelSettingsService`, `backend.services.presets.PresetService`.
- Storage: `data/model_settings.json`, `data/presets.json`.
- Key endpoints:
  - `GET /api/settings/model`, `PUT /api/settings/model`
  - `GET /api/settings/system-prompt`, `PUT /api/settings/system-prompt`
  - `GET /api/presets/`, `GET /api/presets/{name}`
  - `POST /api/presets/`, `PUT /api/presets/{name}`, `DELETE /api/presets/{name}`
  - `POST /api/presets/{name}/apply`
- Flow:
  - The frontend model picker persists the selected model through `model_settings_store`, keeping the backend and UI in sync.
  - Presets snapshot the active backend state (model id, provider overrides, parameter overrides, system prompt, and MCP configs) so any client can restore the same environment later.
  - When applying a preset, the backend updates model settings and pushes new MCP server definitions to the orchestrator.
- Troubleshooting:
  - If presets appear to save the wrong model, confirm the UI successfully persisted the current picker value before snapshotting.
  - Inspect `data/model_settings.json` for the authoritative active model.
  - Backend defaults fall back to `OPENROUTER_DEFAULT_MODEL` and the optional `OPENROUTER_SYSTEM_PROMPT` on first run.
- Config file: `data/mcp_servers.json` (persisted by `MCPServerSettingsService`).
- Runtime: `ChatOrchestrator` loads the configs and keeps an instance of `chat.mcp_registry.MCPToolAggregator` warm.
- Defaults: The app bootstraps a calculator, housekeeping utilities, and Google integrations when no persisted config exists (see `backend.app`).
- API surface:
  - `GET /api/mcp/servers`
  - `PUT /api/mcp/servers`
  - `POST /api/mcp/servers/refresh`
- Operational tips:
- Toggle servers in the UI or via API instead of editing JSON by hand; the aggregator hot-reloads definitions so the running instance stays in sync.
- The aggregator prefixes tool names when multiple servers expose the same tool, which keeps OpenAI-compatible tool payloads conflict-free.
- Tool names are prefixed with their server id (for example, `custom-gmail__gmail_create_draft`) to avoid collisions when aggregating multiple MCP integrations.
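The prefixing rule above can be sketched as a pair of pure helpers. The function names are hypothetical; the real logic lives inside the aggregator:

```python
def prefix_tool_name(server_id: str, tool_name: str) -> str:
    """Join a server id and tool name with a double underscore,
    matching the `custom-gmail__gmail_create_draft` convention."""
    return f"{server_id}__{tool_name}"


def split_tool_name(qualified: str) -> tuple[str, str]:
    """Recover (server_id, tool_name) from a prefixed name.
    Splits on the first `__`, so server ids must not contain one."""
    server_id, _, tool_name = qualified.partition("__")
    return server_id, tool_name
```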
- Service: `backend.services.attachments.AttachmentService` uploads bytes to private Google Cloud Storage, records metadata in SQLite, and keeps signed URLs fresh when messages are serialized.
- Environment knobs: `ATTACHMENTS_MAX_SIZE_BYTES`, `ATTACHMENTS_RETENTION_DAYS`, and the optional `LEGACY_ATTACHMENTS_DIR` for debugging or local development.
- Routes: `POST /api/uploads` (create + return signed URL); legacy download routes now respond with `410 Gone`.
- Behaviour:
  - MCP servers (running on Proxmox) can persist downloads to GCS through the shared attachment service and return signed URLs to the caller.
  - Attachment records are associated with chat sessions; touching a message marks referenced files as recently used so retention policies work as expected.
  - A background job periodically reaps expired records and deletes the associated blobs from GCS.
- Frontend: `frontend/src/lib/stores/speech.ts` and related helpers wire up Deepgram streaming and auto-submit.
- Backend: `/api/stt/deepgram/token` mints temporary keys when the browser cannot hold the long-lived API secret.
- Detection strategy:
  - Prefer Deepgram's `speech_final` events to detect the end of an utterance.
  - Fall back to `UtteranceEnd` events if a final result never arrives.
  - Both paths respect the configurable delay exposed in the speech settings UI so users can fine-tune the behaviour for noisy rooms.
- Configuration: Adjustable parameters live under the speech settings panel (model id, interim results, VAD thresholds, auto-submit delay, etc.). Values are validated before the websocket session is negotiated.
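The detection strategy above can be sketched as a small state holder. The class and event shapes are hypothetical simplifications of Deepgram's streaming payloads; the real wiring lives in the frontend speech store:

```python
class AutoSubmitDetector:
    """Minimal sketch of the end-of-utterance logic: prefer speech_final,
    fall back to UtteranceEnd, and wait out the configurable delay."""

    def __init__(self, delay_seconds: float):
        self.delay_seconds = delay_seconds
        self._deadline: float | None = None

    def on_event(self, event: dict, now: float) -> None:
        # Preferred path: a transcript result flagged as speech_final.
        if event.get("type") == "Results" and event.get("speech_final"):
            self._deadline = now + self.delay_seconds
        # Fallback path: UtteranceEnd when no final result ever arrived.
        elif event.get("type") == "UtteranceEnd" and self._deadline is None:
            self._deadline = now + self.delay_seconds

    def should_submit(self, now: float) -> bool:
        return self._deadline is not None and now >= self._deadline
```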
| Path | Purpose |
|---|---|
| `data/chat_sessions.db` | SQLite store for chat history and attachment metadata |
| `data/model_settings.json` | Active model configuration |
| `data/presets.json` | Saved preset snapshots |
| `data/mcp_servers.json` | Persisted MCP server definitions |
| `data/suggestions.json` | Saved suggestion templates |
| `data/uploads/` | (Legacy) MCP staging area for local file operations |
| `data/tokens/` | OAuth tokens minted during Google authorization flows |
All data/ contents are gitignored by default. Do not commit credentials or user data.
Settings are loaded with this priority (highest to lowest):
1. Environment variables (`.env` file or system environment)
2. JSON config files (`data/*.json`)
3. Built-in defaults (defined in `config.py`)
Example: Model selection resolves as:
1. `OPENROUTER_DEFAULT_MODEL` env var, if set
2. `data/model_settings.json` → `model_id` field, if present
3. Fallback to the hardcoded default `"openai/gpt-4"`
MCP servers are now external services running on Proxmox. To add a new server:
1. Deploy the server to Proxmox as a systemd service
2. Add the server URL to `data/mcp_servers.json`
3. Test with `POST /api/mcp/servers/refresh`
4. Configure client preferences via the UI
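A `data/mcp_servers.json` entry might look like the fragment below. The field names (`id`, `url`, `enabled`) are an assumption for illustration; check `MCPServerSettingsService` for the authoritative schema:

```json
[
  {
    "id": "custom-gmail",
    "url": "http://proxmox-host:8801/mcp",
    "enabled": true
  }
]
```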
1. Create a router module in `routers/` (e.g., `routers/new_feature.py`)
2. Define FastAPI route handlers with proper type hints
3. Use dependency injection for services/repository
4. Include the router in `app.py` via `app.include_router()`
5. Add tests in `tests/test_new_feature.py`
SQLite schema changes should be handled carefully:
1. Create migration SQL in a tracked location (e.g., `migrations/`)
2. Apply via `aiosqlite` in a startup hook or manual script
3. Test the migration on a copy of production data
4. Document schema changes in this file
Current schema (simplified):
```sql
-- conversations table
CREATE TABLE conversations (
    session_id TEXT PRIMARY KEY,
    created_at TEXT,
    updated_at TEXT
);

-- messages table
CREATE TABLE messages (
    id INTEGER PRIMARY KEY,
    session_id TEXT,
    role TEXT,
    content TEXT,
    created_at TEXT,
    FOREIGN KEY (session_id) REFERENCES conversations(session_id)
);

-- attachments table
CREATE TABLE attachments (
    attachment_id TEXT PRIMARY KEY,
    session_id TEXT,
    gcs_blob TEXT,
    mime_type TEXT,
    size_bytes INTEGER,
    signed_url TEXT,
    signed_url_expires_at TEXT,
    created_at TEXT,
    FOREIGN KEY (session_id) REFERENCES conversations(session_id)
);
```

Logs are structured and written to `logs/` with daily rotation:

- `logs/app/YYYY-MM-DD/` - Application logs (INFO level by default)
- `logs/conversations/YYYY-MM-DD/` - Detailed conversation logs for debugging
Configure log levels via logging_settings.conf or environment variables.
Key log points:
- Chat requests: Session ID, model, message count
- Tool calls: Tool name, arguments, duration
- Errors: Full stack traces with context
- MCP lifecycle: Server start/stop/errors
- SQLite is sufficient for single-instance deployments
- Consider connection pooling if concurrent load increases
- Use `PRAGMA journal_mode=WAL` for better concurrent read performance
- Attachment metadata is indexed by `session_id` and `attachment_id`
- Servers are kept alive across requests (process pool)
- Tool discovery is cached until explicit refresh
- Failed servers are marked unavailable but don't block requests
- Large tool responses (>1MB) may cause memory pressure
- SSE responses use chunked transfer encoding
- OpenRouter client maintains persistent HTTP connections
- Tool results are streamed incrementally when possible
- Memory usage scales with concurrent session count
- Signed URLs are cached in database to minimize API calls
- Upload validation happens in-memory before GCS write
- Consider GCS lifecycle policies for automatic cleanup beyond retention period
- Batch delete operations for cleanup jobs
- Never log API keys or tokens - sanitize before writing logs
- Validate all file uploads - enforce type and size limits
- Use signed URLs with short TTL - default 7 days for attachments
- Sanitize user content before tool calls to prevent injection
- OAuth tokens are stored with restrictive file permissions
- GCS bucket should be private with IAM-based access only
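The "never log API keys" rule can be enforced centrally with a `logging.Filter`. The regex below is an illustrative assumption covering obvious `key=value` shapes, not the project's actual sanitizer:

```python
import logging
import re

# Matches e.g. "api_key=sk-123", "token: abc", "Authorization=Bearer..."
_SECRET_RE = re.compile(r"(?i)\b(api[_-]?key|token|authorization)\b\s*[=:]\s*\S+")


class RedactSecretsFilter(logging.Filter):
    """Mask obvious secrets before records reach any handler."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = _SECRET_RE.sub(r"\1=[REDACTED]", str(record.msg))
        return True
```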
Possible causes:
- OpenRouter API slow/down - check their status page
- MCP tool blocking on I/O - check tool logs
- Network issues - verify connectivity
Debug steps:
1. Check `logs/app/` for errors or timeouts
2. Test OpenRouter directly: `curl https://openrouter.ai/api/v1/models`
3. Disable MCP servers and retry: `PUT /api/mcp/servers` with an empty array
4. Increase the timeout in `config.py` → `openrouter_timeout`
Possible causes:
- Signed URL expired
- GCS bucket permissions incorrect
- Network/firewall blocking GCS
Debug steps:
1. Check the attachment record in the database: `SELECT * FROM attachments WHERE attachment_id = ?`
2. Verify `signed_url_expires_at` is in the future
3. Test the URL directly in a browser (it should prompt a download or show the image)
4. Check the service account has the `storage.objects.get` permission
5. Force a URL refresh: fetch the chat history again (triggers automatic re-signing)
Possible causes:
- Server failed to start
- Missing environment variables
- Configuration syntax error
Debug steps:
1. Check `GET /api/mcp/servers` for server status
2. Review `logs/app/` for server startup errors
3. Verify required env vars: `env | grep OPENROUTER` (example)
4. Validate the `data/mcp_servers.json` syntax
5. Manual refresh: `POST /api/mcp/servers/refresh`
Possible causes:
- Stale test database
- File permissions
- Concurrent test execution
Debug steps:
1. Clean test artifacts: `rm -rf tests/data/*.db`
2. Re-run the tests: `uv run pytest -v`
3. Check file permissions: `ls -la tests/data/`
4. Run tests serially: `pytest -n 0`
```bash
# Backup
sqlite3 data/chat_sessions.db ".backup data/chat_sessions_backup.db"

# Restore
cp data/chat_sessions_backup.db data/chat_sessions.db
```

Use `gsutil` or the GCS console to replicate the bucket:

```bash
gsutil -m cp -r gs://source-bucket gs://backup-bucket
```

```bash
# Backup all configs
tar -czf config_backup.tar.gz data/*.json credentials/
```

MCP servers are now external services deployed to Proxmox. To create a new server:

1. Create a new repository with FastMCP
2. Implement tools using `@mcp.tool()` decorators
3. Deploy as a systemd service on Proxmox
4. Add the server URL to `data/mcp_servers.json`
Create in `src/backend/services/` and follow the pattern:

```python
class MyService:
    def __init__(self, repository: ChatRepository):
        self._repository = repository

    async def my_operation(self, ...) -> ...:
        # Business logic
        pass
```

Inject via a FastAPI dependency in the router.
Add to `src/backend/routers/`:

```python
from fastapi import APIRouter, Depends

router = APIRouter(prefix="/api/my-feature", tags=["my-feature"])


@router.get("/")
async def my_endpoint():
    return {"status": "ok"}
```

Include in `app.py`:

```python
from .routers import my_feature

app.include_router(my_feature.router)
```

Keep these directories under version control only when you need deterministic fixtures; the repository ignores them by default so local state stays local.