
fix(serve): filter <think> output and support speaking interruption #132

Merged
crrow merged 1 commit into main from codex/fix-demo-think-filter-bargein
Apr 9, 2026

Conversation

@crrow (Contributor) commented Apr 9, 2026

Summary

  • Filter streamed <think>...</think> content out of the assistant's visible text and out of the sentence chunks sent to TTS
  • Keep speech recognition alive while the assistant is speaking, so users can barge in while assistant audio is playing
  • Add a turn-id invalidation guard so stale async turn continuations cannot restore an incorrect speaking state
  • Add a regression assertion for the think-filter helper on the demo page
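Because the model output arrives as a stream, a `<think>` tag can be split across chunk boundaries, so the filter has to buffer potential partial tags. A minimal sketch of that idea in Rust (names are illustrative and not the PR's actual helper, which lives in the demo page):

```rust
/// Illustrative streaming <think> filter: strips <think>...</think> spans
/// from chunked text, including tags split across chunk boundaries.
struct ThinkFilter {
    in_think: bool,   // currently inside a <think>...</think> block
    pending: String,  // buffered text that might end in a partial tag
}

impl ThinkFilter {
    fn new() -> Self {
        Self { in_think: false, pending: String::new() }
    }

    /// Feed one streamed chunk; return only the speakable/visible text.
    fn push(&mut self, chunk: &str) -> String {
        self.pending.push_str(chunk);
        let mut out = String::new();
        loop {
            if self.in_think {
                if let Some(end) = self.pending.find("</think>") {
                    // Discard the think content and the closing tag.
                    self.pending.drain(..end + "</think>".len());
                    self.in_think = false;
                } else {
                    // Drop think text, but keep a possible partial "</think>".
                    let keep = partial_suffix(&self.pending, "</think>");
                    self.pending.drain(..self.pending.len() - keep);
                    return out;
                }
            } else if let Some(start) = self.pending.find("<think>") {
                out.push_str(&self.pending[..start]);
                self.pending.drain(..start + "<think>".len());
                self.in_think = true;
            } else {
                // Emit everything except a possible partial "<think>" suffix.
                let keep = partial_suffix(&self.pending, "<think>");
                out.push_str(&self.pending[..self.pending.len() - keep]);
                self.pending.drain(..self.pending.len() - keep);
                return out;
            }
        }
    }
}

/// Length of the longest suffix of `s` that is a proper prefix of `tag`.
fn partial_suffix(s: &str, tag: &str) -> usize {
    for n in (1..tag.len()).rev() {
        if n <= s.len() && s.ends_with(&tag[..n]) {
            return n;
        }
    }
    0
}
```

The same filtered stream can then feed both the visible transcript and the sentence splitter for TTS, so reasoning text never reaches either path.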

Validation

  • cargo test serve::tests -- --nocapture
  • Playwright e2e (fake SR + fake LLM with <think>):
    • first turn sends only 第一句。 / 第二句。 ("first sentence." / "second sentence.") to the WS (no think text)
    • barge-in sends {"type":"cancel"} and starts second LLM call (LLM_CALLS=2)
    • final status returns to listening…

User-visible Fixes

  • Thinking content is no longer spoken
  • User speech can now interrupt the assistant while it is in the speaking state

Filter model <think> blocks from streamed output before appending assistant text or sending sentence chunks to TTS, so reasoning text is never spoken.

Keep speech recognition active while the assistant is speaking, and add turn-id guards to prevent stale async turns from restoring an incorrect state. This allows reliable barge-in interruption and avoids getting stuck in the speaking state.

Also add a demo-page regression assertion for think filtering.
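A turn-id guard of the kind the commit describes can be sketched as follows (illustrative, not the PR's code): each new turn, including one started by barge-in, bumps a counter; async continuations capture the id when they start and check it again before applying state, so a cancelled turn's completion is dropped instead of restoring the old speaking state.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Illustrative turn-id invalidation guard (assumed design, not the PR's code).
struct TurnGuard {
    current: AtomicU64,
}

impl TurnGuard {
    fn new() -> Self {
        Self { current: AtomicU64::new(0) }
    }

    /// Start a new turn (e.g. on user barge-in / cancel),
    /// invalidating all earlier turns.
    fn begin_turn(&self) -> u64 {
        self.current.fetch_add(1, Ordering::SeqCst) + 1
    }

    /// A continuation captured `turn_id` when its turn began; it should
    /// only apply state changes if no newer turn has started since.
    fn is_live(&self, turn_id: u64) -> bool {
        self.current.load(Ordering::SeqCst) == turn_id
    }
}
```

A stale "TTS finished" callback would then check `is_live` before flipping the status back, so an interrupted turn cannot resurrect the speaking state after the user has barged in.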
@crrow crrow merged commit b93d1c9 into main Apr 9, 2026
0 of 5 checks passed
