Description
Add Gemini Live-style voice conversation to the desktop app:
- Web Speech API for real-time STT (browser-side, no round trip to our backend)
- LiveKit Agents UI components for voice UI (waveform, controls)
- Server-side TTS (OpenAI) with streaming audio_delta events
- Barge-in support (interrupt Rara mid-speech)
Frontend-first: Steps 1-3 work without backend changes (voice input → text → chat).
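A minimal sketch of the voice input → text step, assuming the Web Speech API's result list is reduced to one final and one interim string before the final text is sent to chat. The `ResultListLike` types and the `splitTranscript` helper are hypothetical names for illustration. They mirror only the `isFinal` and `transcript` fields of the browser's `SpeechRecognitionEvent.results` so the logic can run outside a browser, and they are not taken from the design doc.

```typescript
// Structural stand-ins for the browser's SpeechRecognitionResultList.
// Assumption: only `isFinal` and the top alternative's `transcript` are needed.
interface AlternativeLike {
  transcript: string;
}
interface ResultLike {
  isFinal: boolean;
  readonly [index: number]: AlternativeLike;
}
interface ResultListLike {
  length: number;
  readonly [index: number]: ResultLike;
}

interface Transcript {
  finalText: string;   // stable text, safe to submit as a chat message
  interimText: string; // still-changing text, shown as a live preview only
}

// Hypothetical helper: split recognition results into final and interim text.
function splitTranscript(results: ResultListLike): Transcript {
  let finalText = "";
  let interimText = "";
  for (let i = 0; i < results.length; i++) {
    const result = results[i];
    const best = result?.[0]; // top-ranked alternative
    if (!result || !best) continue;
    if (result.isFinal) {
      finalText += best.transcript;
    } else {
      interimText += best.transcript;
    }
  }
  return { finalText: finalText.trim(), interimText: interimText.trim() };
}
```

In a browser, this would likely be called from the recognition object's `onresult` handler with `event.results`, with `continuous` and `interimResults` enabled. That wiring is an assumption and is omitted here.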
Steps 4-7 add TTS backend for voice responses.
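One way barge-in could interact with streamed `audio_delta` events is sketched below. The event payload (`responseId`, `chunk`) and the `PlaybackQueue` class are assumptions made for illustration, since the actual event schema lives in the design doc. The idea is that interrupting drops any buffered audio and ignores late chunks from the interrupted response.

```typescript
// Hypothetical event shape; the real audio_delta payload may differ.
type AudioDelta = {
  type: "audio_delta";
  responseId: string; // assumed: identifies which TTS response the chunk belongs to
  chunk: Uint8Array;  // assumed: raw audio bytes for this delta
};

// Hypothetical client-side buffer between the event stream and the audio player.
class PlaybackQueue {
  private queue: Uint8Array[] = [];
  private cancelled = new Set<string>();

  // Buffer a chunk unless its response was interrupted. Returns true if accepted.
  push(event: AudioDelta): boolean {
    if (this.cancelled.has(event.responseId)) return false;
    this.queue.push(event.chunk);
    return true;
  }

  // Hand the next chunk to the player, or undefined when the buffer is empty.
  next(): Uint8Array | undefined {
    return this.queue.shift();
  }

  // Barge-in: mark the response as cancelled and drop everything buffered.
  // Returns the number of chunks discarded.
  bargeIn(responseId: string): number {
    this.cancelled.add(responseId);
    const dropped = this.queue.length;
    this.queue = [];
    return dropped;
  }

  get pending(): number {
    return this.queue.length;
  }
}
```

A caller would presumably invoke `bargeIn` when speech recognition detects the user talking during playback, and would also stop the audio element or node that is currently playing. This sketch clears the whole buffer on interrupt, which assumes only one response is being spoken at a time.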
Design doc: docs/plans/2026-04-13-realtime-voice-chat.md
Component
web (frontend, UI)
Alternatives considered
- LiveKit full stack (Server + Agent): too heavy, adds unnecessary infrastructure
- Browser speechSynthesis for TTS: robotic voice, not production quality
- Server-side Whisper STT: 1-2s latency vs real-time Web Speech API