Merged
added 10 commits
March 22, 2026 13:13
Removing due to its instability: huggingface/hf_transfer#63
Switch base image from cudnn-runtime to base variant so torch's bundled nvidia pip packages are the sole source of CUDA libraries, eliminating the ~2-4GB duplication that occurred when both the base image and torch provided the same CUDA toolkit libs.
c2317f0 to ba0111c
added 2 commits
March 25, 2026 08:57
_scan_cached_repo stores only the basename in file_name, so every README.md in a subdirectory (e.g. embedding/, plda/) matches the same filter as the root one. Sort by path depth to always select the root-level README.md, which contains the model card metadata.
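For illustration, the depth-based selection could look roughly like the sketch below. It assumes the candidates are available as repo-relative path strings; the helper name pick_root_readme is hypothetical and is not taken from this PR.

```python
from pathlib import PurePosixPath


def pick_root_readme(paths):
    """Return the shallowest README.md among the given repo-relative paths.

    Hypothetical helper: the real code works on cached-repo file entries,
    whose exact shape is not shown in this PR description.
    """
    # Filtering on the basename alone matches README.md at any depth.
    readmes = [p for p in paths if PurePosixPath(p).name == "README.md"]
    # Fewer path components means closer to the repository root.
    readmes.sort(key=lambda p: len(PurePosixPath(p).parts))
    return readmes[0] if readmes else None


candidates = ["embedding/README.md", "README.md", "plda/README.md"]
root_readme = pick_root_readme(candidates)
```

Sorting rather than requiring an exact match on "README.md" keeps the lookup working even if the entries carry a prefix, at the cost of silently picking a nested file when no root-level one exists.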
Fixes a segmentation fault (exit code 139) on ubuntu-24.04-x86_64 CI runners. onnxruntime-gpu was crashing inside _create_inference_session when initializing the Silero VAD v5 ONNX model on a CPU-only environment. Updating to the latest onnxruntime version resolves the crash.
778cc28 to 6aa0a96
added 2 commits
March 25, 2026 10:00
Also updates the diarization model
6aa0a96 to e8b1145
The fixed 0.25s sleep was shorter than the VAD pipeline overhead (audio decode + VAD load + inference ~0.57s), so the Whisper model hadn't been added to loaded_models yet when the DELETE fired.
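A common way to avoid this kind of race is to poll for the condition with a deadline instead of sleeping for a fixed interval. The sketch below shows that general pattern only; wait_until and its parameters are hypothetical, the loaded_models set here is a stand-in for the server's own state, and this is not necessarily how the commit itself resolves the issue.

```python
import threading
import time


def wait_until(predicate, timeout=5.0, interval=0.05):
    """Poll predicate() until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    # One last check so a condition met right at the deadline still counts.
    return bool(predicate())


# Stand-in for the server state: the model shows up after a delay that a
# fixed 0.25s sleep would not have covered.
loaded_models = set()
threading.Timer(0.6, loaded_models.add, args=("whisper",)).start()

ready = wait_until(lambda: "whisper" in loaded_models, timeout=5.0)
```

With this shape the test waits only as long as the model actually takes to load, and a genuine failure surfaces as a timeout rather than as a flaky DELETE.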
chore: remove unused piper-phonemize override
deps: update openai package
deps: remove hf-transfer
Removing due to its instability:
Consider disabling HF_HUB_ENABLE_HF_TRANSFER for better error handling huggingface/hf_transfer#63
deps: update required uv version (pt2)
feat: switch from onnx-diarization to pyannote
chore: reduce CUDA image size by switching to nvidia/cuda base image
Switch base image from cudnn-runtime to base variant so torch's bundled
nvidia pip packages are the sole source of CUDA libraries, eliminating
the ~2-4GB duplication that occurred when both the base image and torch
provided the same CUDA toolkit libs.
chore: suppress torchcodec warnings
chore: add speaches-hot-reload task
feat: propagate hf gated model repo errors
deps: add debugpy dev package