fix: filter silence hallucinations from cloud transcription models by SeoFood · Pull Request #23 · TypeWhisper/typewhisper-win

SeoFood · 2026-04-03T23:02:15Z

Summary

Fixes #19. Certain cloud transcription models (Whisper Large V3 Turbo, GPT-4o Transcribe) hallucinate random multi-language text when given silent audio. This PR adds two layers of silence detection:

Energy gate (pre-filter): Tracks the peak pre-gain RMS across all audio chunks in a recording. If no chunk exceeds the speech energy threshold (0.01 RMS), transcription is skipped entirely and "No speech detected" is shown. Because the check uses the raw microphone level, it works correctly even when AGC/WhisperMode amplifies the audio.
no_speech_prob filter (post-filter): Parses the no_speech_prob field from verbose_json Whisper API responses (OpenAI whisper-1, Groq, OpenAI Compatible). If every segment reports a no_speech_prob above 0.8, the result is discarded as a hallucination.
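
As a rough illustration of the energy gate described above, here is a minimal Python sketch. It is not the code from this PR: the names `EnergyGate`, `observe`, and `has_speech` are hypothetical, and samples are assumed to be floats normalized to [-1.0, 1.0]. Only the 0.01 RMS threshold and the "peak over all chunks, measured before gain" idea come from the description.

```python
import math
from typing import Sequence

# Threshold taken from the PR description; everything else is illustrative.
SPEECH_RMS_THRESHOLD = 0.01


def chunk_rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of one chunk of normalized samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


class EnergyGate:
    """Tracks the peak RMS seen across all chunks of one recording.

    Chunks must be passed in *before* any gain (AGC, amplification) is
    applied, so that boosted silence cannot cross the threshold.
    """

    def __init__(self, threshold: float = SPEECH_RMS_THRESHOLD) -> None:
        self.threshold = threshold
        self.peak_rms = 0.0

    def observe(self, raw_chunk: Sequence[float]) -> None:
        # Keep only the loudest chunk level seen so far.
        self.peak_rms = max(self.peak_rms, chunk_rms(raw_chunk))

    def has_speech(self) -> bool:
        # True once at least one chunk exceeded the threshold.
        return self.peak_rms > self.threshold
```

A caller would feed every raw chunk to `observe` during recording and, when the recording ends, skip the cloud request and show "No speech detected" if `has_speech()` is false.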

Both filters apply to the final transcription path (ProcessSingleJobAsync) and the live polling fallback (RunPollingFallbackAsync). WebSocket streaming providers (Deepgram, AssemblyAI) handle silence server-side and are unaffected.
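
The no_speech_prob post-filter could look roughly like the following Python sketch. It is not the code from this PR: the function name `is_silence_hallucination` is hypothetical, and the response is assumed to be already parsed into a dictionary with a `segments` list whose entries carry `no_speech_prob`, as in verbose_json output. Keeping the result when segment data is missing or incomplete is an assumption made here, not something the description states.

```python
import json
from typing import Any, Mapping

# Threshold taken from the PR description.
NO_SPEECH_PROB_THRESHOLD = 0.8


def is_silence_hallucination(
    response: Mapping[str, Any],
    threshold: float = NO_SPEECH_PROB_THRESHOLD,
) -> bool:
    """Return True if every segment reports no_speech_prob above threshold."""
    segments = response.get("segments") or []
    probs = [
        seg["no_speech_prob"]
        for seg in segments
        if isinstance(seg, Mapping)
        and isinstance(seg.get("no_speech_prob"), (int, float))
    ]
    # Assumption: with no segments, or with any segment lacking the field,
    # there is not enough evidence to discard the transcription.
    if not probs or len(probs) != len(segments):
        return False
    return all(p > threshold for p in probs)


# Example payload shaped like a verbose_json response (values are made up).
example = json.loads(
    '{"text": "...", "segments": ['
    '{"no_speech_prob": 0.95}, {"no_speech_prob": 0.91}]}'
)
discard = is_silence_hallucination(example)
```

Requiring all segments to exceed the threshold, instead of any single one, keeps recordings that contain a pause next to real speech.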

Test plan

Record 2-3 seconds of silence with Groq Whisper V3 Turbo - should show "No speech detected"
Record silence with GPT-4o Transcribe - should show "No speech detected"
Record normal speech - should transcribe correctly (no false positives)
Verify whispered speech is still captured (low energy but above threshold)
Verify local models (Parakeet, Canary) still work correctly via VAD path

…ixes #19)

Add two-layer silence detection to prevent Whisper models from hallucinating random multi-language text when given silent audio:

1. Client-side energy gate: check pre-gain peak RMS against a threshold before sending audio to cloud APIs. Skips transcription entirely when no speech energy is detected.
2. Server-side no_speech_prob filter: parse the no_speech_prob field from verbose_json Whisper API responses (OpenAI, Groq, OpenAI Compatible) and discard results where all segments report high non-speech probability (> 0.8).

SeoFood merged commit 5b753cb into main Apr 5, 2026

SeoFood merged 1 commit into main from issue-19
