fix: use browser speechSynthesis for playback when browser-native-tts is selected by YizukiAme · Pull Request #28 · THU-MAIC/OpenMAIC

YizukiAme · 2026-03-16T15:40:56Z

Summary

Fix browser-native TTS producing no sound during classroom playback, while the settings test plays sound correctly.

Fixes #25, fixes #12, fixes #5

Root Cause

When browser-native-tts is selected as the TTS provider:

1. Scene generation (use-scene-generator.ts:214,450) correctly skips pre-generating audio, because browser TTS runs client-side via the Web Speech API rather than through the server API.
2. Playback (engine.ts:436-444) calls audioPlayer.play(), which finds no pre-generated audio in IndexedDB and returns false. The engine then falls back to scheduleReadingTimer(), a silent timer that estimates reading time but never calls speechSynthesis.

Fix

Add Web Speech API integration directly in PlaybackEngine (lib/playback/engine.ts):

- playBrowserTTS(): speaks text via window.speechSynthesis, respecting the user's voice, speed, volume, and mute settings
- cancelBrowserTTS(): cancels active browser TTS
- pause(): calls speechSynthesis.pause() when browser TTS is active
- resume(): calls speechSynthesis.resume() when browser TTS is paused
- stop() / handleUserInterrupt(): calls speechSynthesis.cancel() to stop browser TTS
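
The pause/resume/stop behaviour listed above can be sketched as a small state holder. This is only an illustration, not the code in this PR: the class name `BrowserTtsLifecycle`, its `markStarted()`/`markEnded()` hooks, and the `SynthControls` interface are hypothetical. `SynthControls` lists only the three methods that `window.speechSynthesis` already exposes, so the real object could be passed in directly.

```typescript
// Subset of the SpeechSynthesis interface used here; window.speechSynthesis fits it.
interface SynthControls {
  pause(): void;
  resume(): void;
  cancel(): void;
}

// Hypothetical helper: tracks whether browser TTS is speaking or paused so that
// pause()/resume()/cancel() only touch speechSynthesis when it is actually in use.
class BrowserTtsLifecycle {
  private active = false;
  private paused = false;
  private synth: SynthControls;

  constructor(synth: SynthControls) {
    this.synth = synth;
  }

  // Call right after handing an utterance to speak().
  markStarted(): void {
    this.active = true;
    this.paused = false;
  }

  // Call from the utterance's onend/onerror handlers.
  markEnded(): void {
    this.active = false;
    this.paused = false;
  }

  // Returns true if browser TTS was paused, false if there was nothing to pause.
  pause(): boolean {
    if (!this.active || this.paused) return false;
    this.synth.pause();
    this.paused = true;
    return true;
  }

  // Returns true if browser TTS was resumed, false if it was not paused.
  resume(): boolean {
    if (!this.active || !this.paused) return false;
    this.synth.resume();
    this.paused = false;
    return true;
  }

  // Used by both stop() and handleUserInterrupt() in this sketch.
  cancel(): void {
    if (!this.active) return;
    this.synth.cancel();
    this.markEnded();
  }
}
```

Returning a boolean from `pause()` and `resume()` lets the caller decide whether it still needs to pause or resume the pre-generated audio path.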

The fix is self-contained in one file. When audioPlayer.play() returns false (no pre-generated audio), the engine now checks if browser-native-tts is the selected provider and calls speechSynthesis.speak() instead of falling back to the silent reading timer.
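
The decision described above can be sketched as follows. This is a minimal illustration rather than the actual diff: the function name `fallbackWhenNoAudio`, the `TtsSettings` field names, and the injected `speak` / `scheduleReadingTimer` callbacks are assumptions made so the example runs outside a browser. Treating mute as volume 0 is also an assumption.

```typescript
// Hypothetical settings shape; the real store's field names are not shown in this PR.
interface TtsSettings {
  providerId: string;
  voiceURI?: string | undefined;
  speed: number; // mapped to the utterance rate
  volume: number; // 0..1
  muted: boolean;
}

interface SpeakRequest {
  text: string;
  voiceURI?: string | undefined;
  rate: number;
  volume: number;
  onDone: () => void; // meant to be wired to both onend and onerror
}

interface FallbackDeps {
  // Left undefined when speechSynthesis is unavailable (e.g. server-side rendering).
  speak?: (req: SpeakRequest) => void;
  scheduleReadingTimer: (text: string, onDone: () => void) => void;
}

type FallbackPath = "browser-tts" | "reading-timer";

// Intended to be called only after audioPlayer.play() has returned false.
function fallbackWhenNoAudio(
  text: string,
  settings: TtsSettings,
  deps: FallbackDeps,
  onDone: () => void,
): FallbackPath {
  if (settings.providerId === "browser-native-tts" && deps.speak) {
    deps.speak({
      text,
      voiceURI: settings.voiceURI,
      rate: settings.speed,
      volume: settings.muted ? 0 : settings.volume, // assumption: mute maps to volume 0
      onDone,
    });
    return "browser-tts";
  }
  // Other providers keep the previous behaviour: a silent reading timer.
  deps.scheduleReadingTimer(text, onDone);
  return "reading-timer";
}
```

In a browser, `speak` would build a `SpeechSynthesisUtterance`, set its `rate`, `volume`, and `voice`, attach `onend`/`onerror`, and pass it to `window.speechSynthesis.speak()`.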

Changes

| File | Change |
| --- | --- |
| `lib/playback/engine.ts` | +83 lines, -4 lines |

Testing

- Set TTS Provider to "Browser Native" → settings test plays sound ✅
- Generate classroom → play → sound plays ✅
- Pause/resume works correctly
- No regression for other TTS providers (they still use the pre-generated audio path)

cosarah · 2026-03-17T01:25:57Z

Code Review

Clean, focused change with correct edge case handling. Overall LGTM. Two items to consider as follow-up improvements:

1. Chrome long utterance cutoff

Chrome has a known bug where a SpeechSynthesisUtterance that runs longer than ~15 seconds is silently cut off. If a speechAction.text is long, playback may stop mid-sentence without onend firing, which would leave the playback engine hanging. A follow-up PR could add a workaround (e.g., chunking the text or adding a timeout watchdog).
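
As a rough illustration of the chunking workaround suggested above, the sketch below splits text at sentence boundaries and packs the pieces into chunks of at most `maxChars` characters. The function names and the character-based limit are assumptions; the limit that keeps an utterance under the cutoff depends on voice and rate and would need tuning. Whitespace is normalized to single spaces when pieces are joined.

```typescript
// Split one piece on whitespace and pack words greedily, hard-slicing any
// single word longer than maxChars (e.g. unspaced CJK text).
function splitLong(piece: string, maxChars: number): string[] {
  const out: string[] = [];
  let current = "";
  for (const word of piece.split(/\s+/).filter(Boolean)) {
    for (let i = 0; i < word.length; i += maxChars) {
      const part = word.slice(i, i + maxChars);
      const candidate = current ? `${current} ${part}` : part;
      if (candidate.length <= maxChars) {
        current = candidate;
      } else {
        out.push(current);
        current = part;
      }
    }
  }
  if (current) out.push(current);
  return out;
}

// Break text into chunks of at most maxChars characters, preferring sentence ends.
function chunkText(text: string, maxChars: number): string[] {
  if (maxChars < 1) throw new RangeError("maxChars must be at least 1");
  // A sentence ends at ASCII punctuation followed by whitespace or end of text,
  // at CJK punctuation, or at the end of the text.
  const sentences = text.match(/\S[\s\S]*?(?:[.!?]+(?=\s|$)|[。！？]+|$)/g) ?? [];
  const chunks: string[] = [];
  let current = "";
  for (const sentence of sentences) {
    for (const piece of splitLong(sentence, maxChars)) {
      const candidate = current ? `${current} ${piece}` : piece;
      if (candidate.length <= maxChars) {
        current = candidate;
      } else {
        chunks.push(current);
        current = piece;
      }
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Each chunk would then be spoken as its own utterance, so a single cut-off affects at most one chunk rather than the whole speech action.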

2. `speechSynthesis.pause()/resume()` cross-browser compatibility

Firefox has incomplete support for speechSynthesis.pause() / speechSynthesis.resume(), so playback may not recover after a pause. This is a Web Speech API platform limitation and doesn't affect the correctness of this PR, but it is worth noting.

YizukiAme · 2026-03-17T04:23:43Z

Thanks for the review! I'll address the Chrome 15s cutoff and the Firefox pause/resume issue in a follow-up PR by implementing an utterance queue with text chunking. That should handle both issues while keeping this PR focused on the basic fallback.
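
One possible shape for such an utterance queue is sketched below. It is an assumption about the follow-up, not code from it: the `UtteranceQueue` class and the injected `speakChunk` / `cancelSpeech` callbacks are hypothetical. Here pause is implemented by cancelling the current chunk and resume replays that chunk from its start, which avoids relying on speechSynthesis.pause()/resume() at the cost of repeating part of a chunk.

```typescript
type SpeakChunk = (text: string, onDone: () => void) => void;

// Hypothetical queue: speaks pre-chunked text one chunk at a time.
class UtteranceQueue {
  private chunks: string[];
  private speakChunk: SpeakChunk;
  private cancelSpeech: () => void;
  private onFinished: () => void;
  private index = 0;
  private paused = false;
  private generation = 0; // bumped to invalidate callbacks from cancelled chunks

  constructor(
    chunks: string[],
    speakChunk: SpeakChunk,
    cancelSpeech: () => void,
    onFinished: () => void,
  ) {
    this.chunks = chunks;
    this.speakChunk = speakChunk;
    this.cancelSpeech = cancelSpeech;
    this.onFinished = onFinished;
  }

  start(): void {
    this.index = 0;
    this.paused = false;
    this.speakNext();
  }

  // Cancel the current chunk; it is replayed from its start on resume().
  pause(): void {
    if (this.paused) return;
    this.paused = true;
    this.generation++;
    this.cancelSpeech();
  }

  resume(): void {
    if (!this.paused) return;
    this.paused = false;
    this.speakNext();
  }

  // Drop the remaining chunks without calling onFinished.
  stop(): void {
    this.generation++;
    this.paused = false;
    this.index = this.chunks.length;
    this.cancelSpeech();
  }

  private speakNext(): void {
    const text = this.chunks[this.index];
    if (text === undefined) {
      this.onFinished();
      return;
    }
    const gen = ++this.generation;
    this.speakChunk(text, () => {
      if (gen !== this.generation) return; // stale callback from a cancelled chunk
      this.index++;
      this.speakNext();
    });
  }
}
```

The generation counter matters because cancelling speech typically still triggers the cancelled utterance's end or error handler; without the check, that late callback would advance the queue while it is paused.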

… is selected

Previously, selecting browser-native-tts as the TTS provider would produce sound in the settings test but remain silent during classroom playback. This happened because:

1. The scene generator correctly skipped pre-generation for browser TTS (it runs client-side, not via API)
2. The playback engine fell back to a silent reading timer when no pre-generated audio was found, instead of calling speechSynthesis

This commit adds Web Speech API integration directly in the PlaybackEngine:

- New playBrowserTTS() method speaks text via speechSynthesis
- Properly wires onend/onerror to advance to the next action
- pause()/resume() now handle speechSynthesis.pause()/resume()
- stop() and handleUserInterrupt() cancel browser TTS

Fixes THU-MAIC#25, fixes THU-MAIC#12, fixes THU-MAIC#5

YizukiAme force-pushed the fix/browser-native-tts-playback branch from 49b470f to 496b5d9 on March 17, 2026 04:40

YizukiAme wants to merge 1 commit into THU-MAIC:main from YizukiAme:fix/browser-native-tts-playback
