feat: always-on wake word capture with unified audio pipeline by kocendavid · Pull Request #302 · OpenWhispr/openwhispr

kocendavid · 2026-02-22T16:11:23Z

Summary

Adds wake word detection that runs continuously regardless of dictation state — when idle it listens for the wake phrase to start dictation; during dictation it listens for the finish phrase to stop
Replaces the previous design where wake word capture stopped during dictation and audioManager used timeslice chunks (which produced broken WebM files without headers)
Uses a stop-start MediaRecorder cycle (3-second chunks) to produce complete, self-contained WebM files that FFmpeg can reliably parse

How it works

State	Behavior
Idle	`useWakeWordCapture` sends 3-sec audio chunks to main process → `wakeWordManager` transcribes with base Whisper model → fuzzy-matches against wake phrase → triggers dictation
Dictation starts	Wake word capture continues running (switches to stop-word mode via `wakeWordRecordingState` IPC)
During dictation	Chunks checked against finish phrase instead of wake phrase
Dictation stops	Seamlessly switches back to wake phrase detection (no restart needed)
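
The mode switch in the table above can be pictured as a small piece of state that decides which phrase the next chunk is checked against. This is a minimal sketch, not the PR's code: `createPhraseSelector`, `setRecordingState`, and `current` are hypothetical names, and only the `[WAKE]` / `[FINISH]` labels come from the test plan.

```javascript
// Hypothetical sketch: none of these names are taken from the PR's source.
function createPhraseSelector({ wakePhrase, finishPhrase }) {
  let isRecording = false;
  return {
    // Assumed to be called whenever the renderer reports a recording state change.
    setRecordingState(recording) {
      isRecording = Boolean(recording);
    },
    // Returns the mode label and the phrase the next chunk should be matched against.
    current() {
      return isRecording
        ? { mode: "FINISH", phrase: finishPhrase }
        : { mode: "WAKE", phrase: wakePhrase };
    },
  };
}
```

Because capture never stops, switching modes is only a matter of flipping this flag; no recorder restart is involved.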

Architecture

[Renderer]                          [Main Process]
useWakeWordCapture                  wakeWordManager
  │ 3-sec MediaRecorder chunks        │ separate WhisperServer (base model)
  │ (stop-start cycle = valid WebM)   │ fuzzy phrase matching (Levenshtein)
  └──► wakeWordCheckChunk IPC ──────► └──► transcribe → match → trigger
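
For readers unfamiliar with Levenshtein-based matching, the sketch below shows one way the "transcribe → match" step could compare a transcript against a phrase. It is an illustration under assumptions: the function names, the normalization, and the `maxDistance = 1` threshold are hypothetical and may differ from what `wakeWordManager` does.

```javascript
// Classic dynamic-programming edit distance, keeping one row at a time.
function levenshtein(a, b) {
  // prev[j] is the distance between the processed prefix of a and the first j chars of b.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Lowercase, drop punctuation, split into words (assumed normalization).
function normalize(text) {
  return text
    .toLowerCase()
    .replace(/[^\p{L}\p{N}\s]/gu, "")
    .split(/\s+/)
    .filter(Boolean);
}

// True if any run of words in the transcript is within maxDistance edits of the phrase.
function fuzzyContainsPhrase(transcript, phrase, maxDistance = 1) {
  const words = normalize(transcript);
  const target = normalize(phrase);
  if (target.length === 0) return false;
  const joined = target.join(" ");
  for (let i = 0; i + target.length <= words.length; i++) {
    const candidate = words.slice(i, i + target.length).join(" ");
    if (levenshtein(candidate, joined) <= maxDistance) return true;
  }
  return false;
}
```

A small tolerance lets a slightly misheard transcript such as "wisper" still trigger on "whisper", at the cost of more false positives for short phrases.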

Key changes

New files

src/helpers/wakeWordManager.js — Main process module: manages a dedicated WhisperServer instance, handles phrase matching with fuzzy matching, auto-downloads the base model
src/hooks/useWakeWordCapture.js — Renderer hook: always-on mic capture using stop-start MediaRecorder cycles, polls wake word status, sends chunks via IPC

Modified files

src/App.jsx — Calls useWakeWordCapture() (no args, runs continuously), notifies main process of recording state
src/helpers/audioManager.js — Removed finish-phrase timeslice code (~20 lines) — the always-on hook handles stop-word detection now
main.js — Initializes WakeWordManager, registers IPC handlers, auto-starts if enabled
preload.js — Exposes wake word IPC API (toggle, set phrases, check chunks, status)
src/helpers/ipcHandlers.js — Routes wake word IPC calls to manager
src/helpers/environment.js — Persists WAKE_WORD_ENABLED, WAKE_WORD_PHRASE, WAKE_WORD_FINISH_PHRASE
src/helpers/windowConfig.js — Disables backgroundThrottling so capture works when window is hidden
src/hooks/useAudioRecording.js — Strips finish phrase from transcribed text
src/components/SettingsPage.tsx — Wake word settings UI (enable/disable, set phrases, live detection log)
src/components/SettingsModal.tsx — Adds Wake Word nav item
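
As an illustration of the finish-phrase stripping listed for `useAudioRecording.js`, the sketch below removes the phrase only when it is the last thing in the transcript. The PR does not state where in the text the phrase is stripped, so the trailing-only rule and the names `stripTrailingPhrase` and `escapeRegExp` are assumptions.

```javascript
// Escape regex metacharacters so a user-configured phrase is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: drop the finish phrase (and punctuation after it)
// when it appears as whole words at the very end of the transcript.
function stripTrailingPhrase(text, phrase) {
  const trimmedPhrase = phrase.trim();
  if (!trimmedPhrase) return text.trim();
  const pattern = new RegExp(
    `[\\s,]*\\b${escapeRegExp(trimmedPhrase)}\\b[\\s.,!?]*$`,
    "i"
  );
  return text.replace(pattern, "").trim();
}
```

Anchoring the pattern to the end of the string keeps an ordinary use of the word earlier in the dictation intact.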

What this fixes

The previous approach had a WebM header bug: during dictation, audioManager used MediaRecorder.start(3000) timeslice to send chunks for stop-word detection. Timeslice chunks after the first lack WebM headers, so FFmpeg couldn't parse them. The new design avoids this entirely — useWakeWordCapture uses a stop-start cycle that always produces complete WebM files.
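
The stop-start cycle described above can be sketched as follows. This is a hedged illustration rather than the hook's actual code: `startChunkCycle`, `onChunk`, and the injectable `createRecorder` factory are hypothetical names (the factory exists only so the loop can be exercised without a browser), while `MediaRecorder`, `start()`, `stop()`, `ondataavailable`, `onstop`, and `state` are the standard browser API.

```javascript
// Hypothetical sketch of a stop-start capture loop; not the PR's useWakeWordCapture.
function startChunkCycle({ stream, onChunk, chunkMs = 3000, createRecorder }) {
  const makeRecorder =
    createRecorder ?? ((s) => new MediaRecorder(s, { mimeType: "audio/webm" }));
  let active = true;
  let timer = null;
  let recorder = null;

  function recordOne() {
    if (!active) return;
    const parts = [];
    const current = makeRecorder(stream);
    recorder = current;
    current.ondataavailable = (event) => {
      if (event.data && event.data.size > 0) parts.push(event.data);
    };
    current.onstop = () => {
      // A fresh recorder writes its own container header, so each chunk stands alone.
      if (parts.length > 0) onChunk(parts);
      recordOne(); // begin the next chunk right away
    };
    current.start(); // no timeslice argument: data is delivered once, on stop
    timer = setTimeout(() => {
      if (current.state === "recording") current.stop();
    }, chunkMs);
  }

  recordOne();

  // Returns a function that ends the cycle and flushes the chunk in progress.
  return function stop() {
    active = false;
    clearTimeout(timer);
    if (recorder && recorder.state === "recording") recorder.stop();
  };
}
```

In a renderer, `onChunk` would typically wrap the parts in `new Blob(parts, { type: "audio/webm" })` before sending them over IPC. The trade-off of this approach is a brief gap between one recorder stopping and the next starting.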

Test plan

Enable wake word in settings, set wake phrase to "whisper"
Say "whisper" → recording starts, wake word capture keeps running (no gap in Listener Output)
Set finish phrase to "done" → say "done" during dictation → dictation stops, text pastes
During dictation, Listener Output shows [FINISH] mode (not [WAKE])
Hotkey still works independently for start/stop
Main transcription produces correct text (single complete blob, no timeslice artifacts)
Disable wake word → capture stops, no microphone indicator

Add wake word detection that runs continuously regardless of dictation state. When idle it listens for the wake phrase to start dictation; during dictation it listens for the finish phrase to stop. This replaces the previous design where wake word capture stopped during dictation and audioManager used timeslice chunks (which produced broken WebM files without headers).

Key changes:
- wakeWordManager: new main-process module that runs a separate WhisperServer (base model) for lightweight phrase detection
- useWakeWordCapture: renderer hook using stop-start MediaRecorder cycles to produce complete WebM files every 3 seconds
- audioManager: removed finish-phrase timeslice code; the always-on capture hook now handles stop word detection
- Settings UI for enabling wake word, setting wake/finish phrases
- backgroundThrottling disabled so capture works when window is hidden

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

kocendavid · 2026-02-22T16:13:18Z

Hey, I think this is cool. Take a look and see if you like it. I will also add different words to finish, to send, or to cancel translation. Maybe some of this is already included in the intelligence tab.

Add two new configurable finish phrase actions alongside the existing finish phrase:
- Cancel phrase: stops dictation and discards audio (no paste)
- Enter phrase: stops dictation, pastes text, then simulates Enter key

Also fixes false positive wake word matches by using whole-word matching instead of substring includes, and suppresses silence entries from the listener output log.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
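
To illustrate why whole-word matching removes the false positives that a substring check produces, here is a minimal sketch. `containsWholeWord` and `resolveStopAction` are hypothetical names, and the priority order between the cancel, enter, and finish phrases is an assumption; the commit does not specify it.

```javascript
// True only if the phrase appears bounded by non-letter, non-digit characters.
function containsWholeWord(transcript, phrase) {
  const escaped = (phrase ?? "").trim().replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  if (!escaped) return false;
  const pattern = new RegExp(
    `(^|[^\\p{L}\\p{N}])${escaped}($|[^\\p{L}\\p{N}])`,
    "iu"
  );
  return pattern.test(transcript);
}

// Hypothetical dispatch: cancel is checked first, then enter, then finish (assumed order).
function resolveStopAction(transcript, phrases) {
  if (containsWholeWord(transcript, phrases.cancel)) {
    return { action: "cancel", paste: false, pressEnter: false };
  }
  if (containsWholeWord(transcript, phrases.enter)) {
    return { action: "enter", paste: true, pressEnter: true };
  }
  if (containsWholeWord(transcript, phrases.finish)) {
    return { action: "finish", paste: true, pressEnter: false };
  }
  return null; // no stop phrase heard; keep dictating
}
```

For example, `"an abandoned house".includes("done")` is true, whereas `containsWholeWord("an abandoned house", "done")` is false because the match sits inside a longer word.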

gabrielste1n self-requested a review February 23, 2026 01:05

kocendavid wants to merge 2 commits into OpenWhispr:main from kocendavid:feat/always-on-wake-word
