feat/switch to pyannote audio by fedirz · Pull Request #628 · speaches-ai/speaches

fedirz · 2026-03-25T02:50:45Z

chore: remove unused piper-phonemize override
deps: update openai package
deps: remove hf-transfer
Removing due to its instability: huggingface/hf_transfer#63 (Consider disabling HF_HUB_ENABLE_HF_TRANSFER for better error handling)
deps: update required uv version (pt2)
feat: switch from onnx-diarization to pyannote
chore: reduce CUDA image size by switching to nvidia/cuda base image
Switch base image from cudnn-runtime to base variant so torch's bundled
nvidia pip packages are the sole source of CUDA libraries, eliminating
the ~2-4GB duplication that occurred when both the base image and torch
provided the same CUDA toolkit libs.
chore: suppress torchcodec warnings
chore: add speaches-hot-reload task
feat: propagate hf gated model repo errors
deps: add debugpy dev package

_scan_cached_repo stores only the basename in file_name, so every README.md in a subdirectory (e.g. embedding/, plda/) matches the same filter as the root one. Sort the matches by path depth so that the root-level README.md, which contains the model card metadata, is always selected.
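
The selection described above can be sketched as follows. This is a minimal illustration, not the project's code: `CachedFile`, its `file_path` field, and `pick_root_readme` are hypothetical names introduced here, and the only assumption taken from the description is that `file_name` holds just the basename.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class CachedFile:
    # Hypothetical stand-in for a cached-file record (not the project's type).
    file_name: str  # basename only, e.g. "README.md"
    file_path: str  # path relative to the repo root, e.g. "plda/README.md"


def pick_root_readme(files: list[CachedFile]) -> CachedFile | None:
    """Return the README.md closest to the repo root, or None if there is none."""
    # Filtering on the basename matches README.md files in every subdirectory.
    readmes = [f for f in files if f.file_name == "README.md"]
    if not readmes:
        return None
    # Fewer path components means shallower; the root-level file has exactly one.
    return min(readmes, key=lambda f: len(PurePosixPath(f.file_path).parts))
```

Taking the minimum by depth, rather than the first match, keeps the result independent of the order in which the cache scan returns files.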

Fixes a segmentation fault (exit code 139) on ubuntu-24.04-x86_64 CI runners. onnxruntime-gpu was crashing inside _create_inference_session when initializing the Silero VAD v5 ONNX model on a CPU-only environment. Updating to the latest onnxruntime version resolves the crash.

Fedir Zadniprovskyi added 10 commits March 22, 2026 13:13

chore: remove unused piper-phonemize override

315b7c8

deps: update openai package

c5dda49

deps: remove hf-transfer

781aa71

Removing due to its instability: huggingface/hf_transfer#63

deps: update required uv version (pt2)

538e479

feat: switch from onnx-diarization to pyannote

d24de62

chore: suppress torchcodec warnings

d49f5eb

chore: add speaches-hot-reload task

f0c4cf6

feat: propagate hf gated model repo errors

813476f

deps: add debugpy dev package

ba0111c

fedirz force-pushed the feat/switch-to-pyannote-audio branch from c2317f0 to ba0111c on March 25, 2026 03:01

Fedir Zadniprovskyi added 2 commits March 25, 2026 08:57

fedirz force-pushed the feat/switch-to-pyannote-audio branch 2 times, most recently from 778cc28 to 6aa0a96 on March 25, 2026 16:32

Fedir Zadniprovskyi added 2 commits March 25, 2026 10:00

feat: add model request param to diarization endpoint

da74ea0

Also updates the diarization model

fix: failing CI tests due to gated model access

92f4727

fedirz force-pushed the feat/switch-to-pyannote-audio branch from 6aa0a96 to e8b1145 on March 26, 2026 12:44

fix: replace fragile sleep with polling in model unload test

f145384

The fixed 0.25s sleep was shorter than the VAD pipeline overhead (audio decode + VAD load + inference ~0.57s), so the Whisper model hadn't been added to loaded_models yet when the DELETE fired.
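
Replacing a fixed sleep with polling could look roughly like the sketch below. It is an illustration under assumptions rather than the test's actual code: `wait_until` and its `timeout` and `interval` parameters are hypothetical names, and the condition passed in would be whatever check confirms that the model appears in loaded_models.

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool],
    timeout: float = 5.0,
    interval: float = 0.05,
) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds have elapsed.

    Returns True as soon as the condition holds and False on timeout, so the
    caller waits only as long as needed instead of guessing a fixed delay.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A test would then assert on the result, for example `assert wait_until(lambda: model_id in loaded_models)` with hypothetical `model_id` and `loaded_models` values, before issuing the DELETE. A timeout keeps a genuinely broken load from hanging the test run.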

fedirz merged commit 870e1e1 into master Mar 26, 2026
5 checks passed

fedirz merged 15 commits into master from feat/switch-to-pyannote-audio

fedirz commented Mar 25, 2026
