Commit 9d86a49

feat(plugins): add Parakeet TDT speech-to-text plugin

Add a new native plugin for fast English speech recognition using
NVIDIA's Parakeet TDT (Token-and-Duration Transducer) 0.6B model via
sherpa-onnx. Parakeet TDT is approximately 10x faster than Whisper on
consumer hardware with competitive accuracy (#1 on HuggingFace ASR
leaderboard).

Plugin implementation:
- Offline transducer recognizer (encoder/decoder/joiner) via sherpa-onnx C API
- Silero VAD v6 for streaming speech segmentation
- Recognizer caching keyed on (model_dir, num_threads, execution_provider)
- Configurable VAD threshold, silence duration, and max segment length
- 16kHz mono f32 audio input, transcription output

Justfile additions:
- build-plugin-native-parakeet: build the plugin
- download-parakeet-models: download INT8 quantized model (~660MB)
- setup-parakeet: full setup (sherpa-onnx + models + VAD)
- Added parakeet to copy-plugins-native loop

Includes sample oneshot pipeline (parakeet-stt.yml) and plugin.yml manifest.

Signed-off-by: StreamKit Devin <devin@streamkit.dev>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>

1 parent 4552035 commit 9d86a49

11 files changed: +2616 -1 lines changed

justfile

Lines changed: 38 additions & 1 deletion
@@ -723,6 +723,43 @@ upload-sensevoice-plugin: build-plugin-native-sensevoice
     @curl -X POST -F "plugin=@{{plugins_target_dir}}/release/libsensevoice.so" \
         http://127.0.0.1:4545/api/v1/plugins
 
+# Build native Parakeet TDT STT plugin
+[working-directory: 'plugins/native/parakeet']
+build-plugin-native-parakeet:
+    @echo "Building native Parakeet TDT STT plugin..."
+    @CARGO_TARGET_DIR={{plugins_target_dir}} cargo build --release
+
+# Upload Parakeet plugin to running server
+[working-directory: 'plugins/native/parakeet']
+upload-parakeet-plugin: build-plugin-native-parakeet
+    @echo "Uploading Parakeet plugin to server..."
+    @curl -X POST -F "plugin=@{{plugins_target_dir}}/release/libparakeet.so" \
+        http://127.0.0.1:4545/api/v1/plugins
+
+# Download Parakeet TDT models
+download-parakeet-models:
+    @echo "Downloading Parakeet TDT models..."
+    @mkdir -p models
+    @if [ -f models/sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8.tar.bz2 ]; then \
+        echo "✓ Parakeet TDT archive already exists"; \
+    else \
+        echo "Downloading sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8.tar.bz2 (~660MB)..." && \
+        curl -L -o models/sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8.tar.bz2 \
+            https://huggingface.co/csukuangfj/sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8/resolve/main/sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8.tar.bz2 && \
+        echo "✓ Parakeet TDT archive downloaded"; \
+    fi
+    @if [ -d models/sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8 ]; then \
+        echo "✓ Parakeet TDT models already extracted"; \
+    else \
+        echo "Extracting models..." && \
+        cd models && tar xf sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8.tar.bz2 && \
+        echo "✓ Parakeet TDT models ready at models/sherpa-onnx-nemo-parakeet-tdt-0.6b-v2-int8 (English)"; \
+    fi
+
+# Setup Parakeet (install dependencies + download models)
+setup-parakeet: install-sherpa-onnx download-parakeet-models download-silero-vad
+    @echo "✓ Parakeet TDT STT setup complete!"
+
 # Download pre-converted NLLB models from Hugging Face
 download-nllb-models:
     @echo "Downloading pre-converted NLLB-200 models from Hugging Face..."
@@ -1042,7 +1079,7 @@ copy-plugins-native:
 
     # Official native plugins (shared target dir).
     # For most plugins the lib stem matches the plugin id.
-    for name in whisper kokoro piper matcha vad sensevoice nllb helsinki supertonic slint; do
+    for name in whisper kokoro piper matcha vad sensevoice nllb helsinki supertonic slint parakeet; do
         copy_plugin "$name" "$name" "$PLUGINS_TARGET"
     done
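
The `download-parakeet-models` recipe above skips the download whenever the archive file already exists, so an interrupted `curl` run could leave a truncated archive that later runs treat as complete. The following is a minimal sketch of one way to guard against that. It is not part of this commit. It downloads to a temporary name and renames the file only on success. `fetch_once`, `default_fetch`, and the `FETCH` override are hypothetical names introduced here for illustration; the only real tool assumed is `curl`, whose `-f` flag makes HTTP errors return a non-zero status.

```shell
#!/usr/bin/env sh
# Hypothetical helpers for illustration only; they do not exist in the repository.

# Default downloader: <url> <output>. -f fails on HTTP errors, -L follows redirects.
default_fetch() {
    curl -fL -o "$2" "$1"
}

# Download <url> to <dest> unless <dest> already exists.
# The transfer goes to "<dest>.part" first, so a failed or interrupted download
# never leaves a file that a later run would mistake for a complete archive.
fetch_once() {
    url=$1
    dest=$2
    if [ -f "$dest" ]; then
        echo "already present: $dest"
        return 0
    fi
    mkdir -p "$(dirname "$dest")"
    # FETCH (an assumption of this sketch) may name a replacement downloader.
    if "${FETCH:-default_fetch}" "$url" "$dest.part"; then
        mv "$dest.part" "$dest"
    else
        rm -f "$dest.part"
        echo "download failed: $url" >&2
        return 1
    fi
}
```

In a recipe, a helper like this might be called with the archive URL and the `models/` path before the extraction step, leaving the existing "already extracted" check unchanged.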
