fix: use browser speechSynthesis for playback when browser-native-tts is selected #28

Open
YizukiAme wants to merge 1 commit into THU-MAIC:main from YizukiAme:fix/browser-native-tts-playback
@YizukiAme

Summary

Fix browser-native TTS producing no sound during classroom playback, while the settings test plays sound correctly.

Fixes #25, fixes #12, fixes #5

Root Cause

When browser-native-tts is selected as the TTS provider:

  1. Scene generation (use-scene-generator.ts:214,450) correctly skips pre-generating audio, because browser TTS runs client-side via the Web Speech API rather than through the server API.
  2. Playback (engine.ts:436-444) calls audioPlayer.play(), which finds no pre-generated audio in IndexedDB and returns false. The engine then falls back to scheduleReadingTimer(), a silent timer that estimates reading time but never calls speechSynthesis.

Fix

Add Web Speech API integration directly in PlaybackEngine (lib/playback/engine.ts):

  • playBrowserTTS() — speaks text via window.speechSynthesis, respecting user's voice, speed, volume, and mute settings
  • cancelBrowserTTS() — cancels active browser TTS
  • pause() — calls speechSynthesis.pause() when browser TTS is active
  • resume() — calls speechSynthesis.resume() when browser TTS is paused
  • stop() / handleUserInterrupt() — calls speechSynthesis.cancel() to stop browser TTS
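
As a rough illustration of the first bullet, a playBrowserTTS()-style helper might look like the sketch below. This is not the code from the PR: the TtsSettings shape, the injected synth and makeUtterance parameters, and the onDone callback are assumptions made so the example runs without a browser. In a real page the injected objects would be window.speechSynthesis and a factory that creates a SpeechSynthesisUtterance, possibly behind a thin adapter. Voice selection is omitted for brevity.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface UtteranceLike {
  text: string;
  rate: number;
  volume: number;
  onend: (() => void) | null;
  onerror: (() => void) | null;
}

interface SynthLike {
  speak(utterance: UtteranceLike): void;
  cancel(): void;
}

// Hypothetical settings shape; the real settings store may use other names.
interface TtsSettings {
  speed: number;  // copied to utterance.rate
  volume: number; // 0..1
  muted: boolean;
}

function playBrowserTTS(
  synth: SynthLike,
  makeUtterance: (text: string) => UtteranceLike,
  text: string,
  settings: TtsSettings,
  onDone: () => void,
): void {
  const utterance = makeUtterance(text);
  utterance.rate = settings.speed;
  // A muted player still "speaks" at volume 0 so that timing is preserved.
  utterance.volume = settings.muted ? 0 : settings.volume;

  // Advance exactly once, whether the utterance ends normally or with an error.
  let finished = false;
  const finish = (): void => {
    if (finished) return;
    finished = true;
    onDone();
  };
  utterance.onend = finish;
  utterance.onerror = finish;

  synth.speak(utterance);
}
```

Routing both onend and onerror through the same guarded callback is one way to make sure the engine always advances to the next action and never advances twice.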

The fix is self-contained in one file. When audioPlayer.play() returns false (no pre-generated audio), the engine now checks if browser-native-tts is the selected provider and calls speechSynthesis.speak() instead of falling back to the silent reading timer.
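
The decision described above can be summarized as a small pure function. This is only a sketch of the control flow: the function name and the Fallback type are hypothetical, and only the provider id string "browser-native-tts" comes from the PR text.

```typescript
type Fallback = "none" | "browser-tts" | "reading-timer";

// playedPreGenerated stands for the boolean returned by audioPlayer.play().
function chooseFallback(playedPreGenerated: boolean, providerId: string): Fallback {
  // Pre-generated audio was found and is playing: nothing else to do.
  if (playedPreGenerated) return "none";
  // No audio in storage: speak in the browser if that provider is selected,
  // otherwise keep the silent reading timer used before this PR.
  return providerId === "browser-native-tts" ? "browser-tts" : "reading-timer";
}
```

Before the fix, the second branch effectively always returned "reading-timer", which is why classroom playback stayed silent.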

Changes

File | Change
lib/playback/engine.ts | +83 lines, -4 lines

Testing

  • Set TTS Provider to "Browser Native" → settings test plays sound ✅
  • Generate classroom → play → sound plays
  • Pause/resume works correctly
  • No regression for other TTS providers (they still use pre-generated audio path)

@cosarah
Collaborator

cosarah commented Mar 17, 2026

Code Review

Clean, focused change with correct edge case handling. Overall LGTM. Two items to consider as follow-up improvements:

1. Chrome long utterance cutoff

Chrome has a known bug where SpeechSynthesisUtterance longer than ~15 seconds gets silently cut off. If a speechAction.text is long, playback may stop mid-sentence and onend won't fire, causing the playback engine to hang. A follow-up PR could add a workaround (e.g., chunking text or adding a timeout watchdog).
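
One possible shape for the chunking workaround is sketched below. It splits text at sentence boundaries and packs sentences into chunks under a character budget. The function name and the 160-character default are assumptions; character count is only a rough proxy for speaking time, so the budget would need tuning against the actual cutoff.

```typescript
function chunkText(text: string, maxChars: number = 160): string[] {
  // Each match is a run of non-terminator characters plus any trailing
  // sentence punctuation (Latin or CJK) and whitespace.
  const sentences = text.match(/[^.!?。！？]+[.!?。！？]*\s*/g) ?? [];
  const chunks: string[] = [];
  let current = "";
  for (const sentence of sentences) {
    // Start a new chunk when adding this sentence would exceed the budget.
    if (current.length > 0 && current.length + sentence.length > maxChars) {
      chunks.push(current.trim());
      current = "";
    }
    current += sentence;
  }
  if (current.trim().length > 0) chunks.push(current.trim());
  // Note: a single sentence longer than maxChars is kept as one chunk.
  return chunks;
}
```

Each chunk would then be spoken as its own utterance, with the next one started from the previous utterance's onend handler. A timeout watchdog could still be useful for the case where onend never fires.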

2. speechSynthesis.pause()/resume() cross-browser compatibility

Firefox has incomplete support for speechSynthesis.pause() / speechSynthesis.resume(), which may cause playback to not recover after pausing. This is a Web Speech API platform limitation and doesn't affect the correctness of this PR, but worth noting.
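
If relying on speechSynthesis.pause() / resume() turns out to be fragile, one alternative is to pause by cancelling the current chunk and resume by speaking it again from the start. The sketch below shows that idea with a hypothetical ChunkQueue class and ChunkSpeaker interface; neither exists in the PR, and the speaker is injected so the logic can be exercised without a browser.

```typescript
interface ChunkSpeaker {
  // Speaks one chunk and calls onDone when it finishes.
  speak(chunk: string, onDone: () => void): void;
  // Stops whatever is currently being spoken.
  cancel(): void;
}

class ChunkQueue {
  private chunks: string[];
  private speaker: ChunkSpeaker;
  private onFinished: () => void;
  private index = 0;
  private paused = false;
  private generation = 0; // bumped on pause to invalidate old callbacks

  constructor(chunks: string[], speaker: ChunkSpeaker, onFinished: () => void) {
    this.chunks = chunks;
    this.speaker = speaker;
    this.onFinished = onFinished;
  }

  start(): void {
    this.speakCurrent();
  }

  pause(): void {
    if (this.paused) return;
    this.paused = true;
    this.generation++;
    this.speaker.cancel(); // the interrupted chunk is replayed on resume
  }

  resume(): void {
    if (!this.paused) return;
    this.paused = false;
    this.speakCurrent();
  }

  private speakCurrent(): void {
    const chunk = this.chunks[this.index];
    if (chunk === undefined) {
      this.onFinished();
      return;
    }
    const gen = this.generation;
    this.speaker.speak(chunk, () => {
      // Ignore completions that belong to a chunk cancelled by pause().
      if (gen !== this.generation || this.paused) return;
      this.index++;
      this.speakCurrent();
    });
  }
}
```

The trade-off is that resuming repeats the interrupted chunk from its beginning, so shorter chunks give a less noticeable repeat.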

@YizukiAme
Author

YizukiAme commented Mar 17, 2026

Thanks for the review!!!! I'll address the Chrome 15s cutoff and Firefox pause/resume issues in a follow-up PR by implementing an utterance queue with text chunking. This will elegantly handle both issues while keeping this PR focused on the basic fallback.

… is selected

Previously, selecting browser-native-tts as the TTS provider would
produce sound in the settings test but remain silent during classroom
playback. This happened because:

1. The scene generator correctly skipped pre-generation for browser TTS
   (it runs client-side, not via API)
2. The playback engine fell back to a silent reading timer when no
   pre-generated audio was found, instead of calling speechSynthesis

This commit adds Web Speech API integration directly in the
PlaybackEngine:
- New playBrowserTTS() method speaks text via speechSynthesis
- Properly wires onend/onerror to advance to the next action
- pause()/resume() now handle speechSynthesis.pause()/resume()
- stop() and handleUserInterrupt() cancel browser TTS

Fixes THU-MAIC#25, fixes THU-MAIC#12, fixes THU-MAIC#5
@YizukiAme YizukiAme force-pushed the fix/browser-native-tts-playback branch from 49b470f to 496b5d9 Compare March 17, 2026 04:40
Development

Successfully merging this pull request may close these issues.

[Bug]: Browser-native TTS (browser-native-tts) produces no sound during playback
No sound during course playback
ask about the sound problems

2 participants