feat(channels): voice transcription quality — annotation + optional LLM correction (#1215) by crrow · Pull Request #1217 · rararulab/rara

crrow · 2026-04-09T04:03:56Z

Summary

Adds two layers of post-processing for STT (voice → text) output in the Telegram and Web channel adapters.

Layer A (always on): prepends [Voice transcription — may contain errors, interpret by context] so the downstream LLM interprets voice input with appropriate error tolerance.
Layer B (opt-in via stt.correction.enabled: true): runs a fast LLM pass (e.g. glm-4-flash) to fix obvious speech-recognition mistakes before delivery. Correction failure is non-fatal — falls back to the raw transcription.
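The two layers described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the `Corrector` trait, the `annotate` and `post_process` functions, and the two example correctors are hypothetical names. Running the correction before the annotation, and separating the hint from the text with a single space, are assumptions as well. Only the hint wording and the fall-back-to-raw behaviour come from the description.

```rust
/// Hint prepended to every voice transcription (wording from the PR description).
const VOICE_HINT: &str = "[Voice transcription — may contain errors, interpret by context]";

/// Hypothetical stand-in for the fast LLM correction pass (Layer B).
trait Corrector {
    fn correct(&self, raw: &str) -> Result<String, String>;
}

/// Layer A: always prepend the hint. The single-space separator is an assumption.
fn annotate(text: &str) -> String {
    format!("{} {}", VOICE_HINT, text)
}

/// Layer B (only if a corrector is supplied), then Layer A.
/// A failed correction is non-fatal: the raw transcription is used instead.
fn post_process(raw: &str, corrector: Option<&dyn Corrector>) -> String {
    let text = match corrector {
        Some(c) => c.correct(raw).unwrap_or_else(|_| raw.to_string()),
        None => raw.to_string(),
    };
    annotate(&text)
}

/// Example corrector that always fails, to exercise the fallback path.
struct AlwaysFails;

impl Corrector for AlwaysFails {
    fn correct(&self, _raw: &str) -> Result<String, String> {
        Err("provider unavailable".to_string())
    }
}

/// Example corrector that replaces one word, standing in for a real LLM call.
struct ReplaceWord {
    from: &'static str,
    to: &'static str,
}

impl Corrector for ReplaceWord {
    fn correct(&self, raw: &str) -> Result<String, String> {
        Ok(raw.replace(self.from, self.to))
    }
}
```

With `AlwaysFails`, `post_process` returns the same string as `annotate` applied to the raw input, which is the fallback behaviour the description calls for.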

The driver registry is read from KernelHandle::driver_registry() at message-handling time, avoiding extra plumbing through the polling loops.

Configuration

stt:
  base_url: "http://localhost:8080"
  correction:
    enabled: true
    model: "glm-4-flash"
    provider: "glm"
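
For orientation, the YAML above might map onto Rust types along these lines. `SttCorrectionConfig` is named in the commit message, but its fields, the `SttConfig` wrapper, and the `example_config` helper are assumptions made for this sketch. The real types presumably also derive a deserializer, which is omitted here to keep the example dependency-free.

```rust
/// Assumed shape of the `stt.correction` section. Correction is opt-in,
/// so the derived default leaves it disabled.
#[derive(Debug, Clone, PartialEq, Default)]
struct SttCorrectionConfig {
    enabled: bool,
    model: Option<String>,
    provider: Option<String>,
}

/// Assumed shape of the `stt` section (hypothetical wrapper type).
#[derive(Debug, Clone, PartialEq)]
struct SttConfig {
    base_url: String,
    correction: SttCorrectionConfig,
}

/// Builds the same values as the YAML example above.
fn example_config() -> SttConfig {
    SttConfig {
        base_url: "http://localhost:8080".to_string(),
        correction: SttCorrectionConfig {
            enabled: true,
            model: Some("glm-4-flash".to_string()),
            provider: Some("glm".to_string()),
        },
    }
}
```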

Type of change

Type	Label
New feature	`enhancement`

Component

backend

Closes

Closes #1215

Test plan

cargo check --all --all-targets passes
cargo test -p rara-channels -p rara-stt passes (93 + 6 + 1 + 2 tests)
cargo +nightly fmt --all -- --check passes
cargo clippy --workspace --all-targets --all-features --no-deps -- -D warnings passes
RUSTDOCFLAGS="-D warnings" cargo +nightly doc --workspace --no-deps --document-private-items passes

…LM correction (#1215)

Add two layers of post-processing for STT output in voice channels:
- Layer A (always on): prepend a hint so the downstream LLM treats voice input as speech-recognised text that may contain errors.
- Layer B (opt-in via stt.correction.enabled): run a fast LLM pass to fix obvious transcription mistakes before delivery. Failure is non-fatal — the adapter falls back to the raw transcription.

Wires SttCorrectionConfig + the kernel driver registry into the Telegram and Web channel adapters. The driver registry is read from KernelHandle at message-handling time to avoid extra plumbing through polling loops.

Closes #1215

crrow added the enhancement and backend labels on Apr 9, 2026

crrow wants to merge 1 commit into main from issue-1215-voice-quality