Closed
Description
Valid speaker samples are being dropped with text_mismatch because the sample audio is a trimmed window that can span multiple same-speaker segments while the expected text only reflects a single segment. This blocks sample storage and speaker embeddings for affected users.
Current Behavior
- verify_and_transcribe_sample compares the transcript against the segment text with symmetric similarity.
- Trimmed samples that should be valid fail when expected text is longer or spans merged segments.
- Valid samples are dropped due to text_mismatch.
Expected Behavior
Use a language-agnostic containment check so the transcript can be validated as included in the expected text.
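One way such a containment check could look is sketched below. Only the name compute_text_containment and its argument order come from this issue; the character n-gram approach, the `_normalize` and `_ngrams` helpers, and the default `n=3` are assumptions made for illustration, not the actual implementation. Character n-grams avoid word tokenization, so the same logic applies to scripts without spaces between words.

```python
import unicodedata
from collections import Counter


def _normalize(text: str) -> str:
    # Hypothetical helper: Unicode-normalize, casefold, and keep only
    # alphanumeric characters so punctuation and spacing do not matter.
    folded = unicodedata.normalize("NFKC", text).casefold()
    return "".join(ch for ch in folded if ch.isalnum())


def _ngrams(chars: str, n: int) -> Counter:
    # Hypothetical helper: multiset of character n-grams (empty if too short).
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def compute_text_containment(transcript: str, expected_text: str, n: int = 3) -> float:
    """Fraction of the transcript's character n-grams found in expected_text.

    Asymmetric by design: extra material in expected_text does not lower
    the score, but transcript content missing from expected_text does.
    """
    t_chars = _normalize(transcript)
    e_chars = _normalize(expected_text)
    if not t_chars:
        return 0.0
    # Shrink n for very short transcripts so they still yield n-grams.
    n = max(1, min(n, len(t_chars)))
    t_grams = _ngrams(t_chars, n)
    e_grams = _ngrams(e_chars, n)
    overlap = sum(min(count, e_grams[gram]) for gram, count in t_grams.items())
    return overlap / sum(t_grams.values())
```

With this sketch, a transcript that appears verbatim inside a longer expected text scores 1.0, while swapping the arguments gives a lower score, which is the asymmetry a symmetric similarity measure lacks.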
Affected Areas
| File | Line | Description |
|---|---|---|
| backend/utils/speaker_sample.py | 66 | Text mismatch check uses symmetric similarity |
| backend/utils/text_utils.py | 1 | Only similarity helper exists (no containment helper) |
Solution
    containment = compute_text_containment(transcript, expected_text)
    if containment < MIN_CONTAINMENT:
        return transcript, False, f"text_mismatch: containment={containment:.2f}"

Files to Modify
- backend/utils/text_utils.py
- backend/utils/speaker_sample.py
- backend/tests/unit/test_speaker_sample.py
- backend/tests/unit/test_text_containment.py
- backend/test.sh
Impact
Low — adds a containment check and tests; existing quality checks remain.
by AI for @beastoin