add parakeet v3 model #3341
Conversation
Pull request overview
This pull request adds support for the Parakeet ASR (Automatic Speech Recognition) model family to the candle-transformers library. The implementation is ported from the MLX framework and supports three model variants: TDT (Token-and-Duration Transducer), RNNT (RNN-Transducer), and CTC (Connectionist Temporal Classification). The PR includes a complete implementation with audio preprocessing, Conformer encoder architecture, LSTM-based decoder, and various decoding strategies including greedy and beam search.
Changes:
- Adds comprehensive Parakeet model implementation with support for TDT, RNNT, CTC, and hybrid TDT-CTC variants
- Implements a complete audio preprocessing pipeline with FFT, mel filterbank, and normalization (a short illustrative mel-scale sketch follows this list)
- Includes streaming inference support with rotating cache mechanisms for efficient long-form transcription
- Adds command-line example with support for chunked audio processing and beam search decoding
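As a rough illustration of the mel-filterbank step mentioned above, the HTK-style mel-scale conversions that such a filterbank is commonly built from look like this. The helper names are hypothetical and are not taken from the PR's `audio.rs`.

```rust
/// Hypothetical illustration only: HTK-style mel-scale conversions commonly
/// used to place mel-filterbank centers. Not the PR's actual preprocessing.
fn hz_to_mel(hz: f64) -> f64 {
    2595.0 * (1.0 + hz / 700.0).log10()
}

fn mel_to_hz(mel: f64) -> f64 {
    700.0 * (10f64.powf(mel / 2595.0) - 1.0)
}

fn main() {
    // Round-trip check: 8 kHz -> ~2840 mel -> back to 8 kHz.
    let m = hz_to_mel(8000.0);
    println!("{m:.1} mel, back to {:.1} Hz", mel_to_hz(m));
}
```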
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| candle-transformers/src/models/parakeet/mod.rs | Module definition exposing public API for Parakeet models |
| candle-transformers/src/models/parakeet/model.rs | Core model implementations for all Parakeet variants with decoding strategies |
| candle-transformers/src/models/parakeet/rnnt.rs | LSTM implementation and prediction/joint network components |
| candle-transformers/src/models/parakeet/conformer.rs | Conformer encoder architecture with self-attention and convolution blocks |
| candle-transformers/src/models/parakeet/attention.rs | Multi-head attention with relative positional encoding |
| candle-transformers/src/models/parakeet/cache.rs | Caching mechanisms for streaming inference |
| candle-transformers/src/models/parakeet/audio.rs | Audio preprocessing including FFT and mel-spectrogram computation |
| candle-transformers/src/models/parakeet/alignment.rs | Token alignment and sentence segmentation logic |
| candle-transformers/src/models/parakeet/ctc.rs | CTC decoder implementation |
| candle-transformers/src/models/parakeet/tokenizer.rs | Simple tokenizer decode function |
| candle-transformers/src/models/mod.rs | Updated to include parakeet module in models list |
| candle-examples/examples/parakeet/main.rs | CLI example for running Parakeet inference |
| candle-examples/examples/parakeet/README.md | Usage documentation for the example |
```rust
layers: Vec<LstmLayer>,
hidden_size: usize,
num_layers: usize,
batch_first: bool,
```
Copilot AI (Jan 28, 2026):
The `batch_first` field is stored but never used in the implementation. The `forward` method always assumes batch-first format (batch, time, features) regardless of this field's value. Consider removing this field or implementing the non-batch-first case if needed.
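If the non-batch-first case were implemented rather than removed, one minimal option is to normalize the layout once on entry to `forward`. This is only a sketch assuming candle's `Tensor::transpose`; the helper name is ours, not the PR's.

```rust
use candle_core::{Result, Tensor};

/// Hypothetical helper: bring the input to (batch, time, features) so the
/// downstream LSTM layers can keep their batch-first assumption.
fn to_batch_first(xs: &Tensor, batch_first: bool) -> Result<Tensor> {
    if batch_first {
        Ok(xs.clone())
    } else {
        // Input arrived as (time, batch, features); swap the first two dims.
        xs.transpose(0, 1)
    }
}
```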
```rust
kernel_size: usize,
padding: usize,
sampling_num: usize,
subsampling_conv_chunking_factor: isize,
```
Copilot AI (Jan 28, 2026):
The `subsampling_conv_chunking_factor` field is stored but never used in the `forward` method implementation. Consider removing this field or implementing the intended chunking functionality if needed.
```rust
    .into_values()
    .filter(|h| h.step < length)
    .collect();
active_list.sort_by(|a, b| b.score.partial_cmp(&a.score).unwrap());
```
Copilot AI (Jan 28, 2026):
The `unwrap()` calls on `partial_cmp` results can panic if scores contain NaN values. Consider using `unwrap_or(Ordering::Equal)` or adding explicit NaN checks before sorting to prevent potential panics during beam search decoding.
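A minimal sketch of that suggestion; `Hyp` here is a stand-in for the PR's hypothesis type, and only the comparator matters.

```rust
use std::cmp::Ordering;

struct Hyp {
    score: f64,
}

/// Sort hypotheses by descending score without panicking: comparisons that
/// involve NaN fall back to `Ordering::Equal` instead of unwrapping.
fn sort_by_score_desc(hyps: &mut [Hyp]) {
    hyps.sort_by(|a, b| b.score.partial_cmp(&a.score).unwrap_or(Ordering::Equal));
}
```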
```rust
let best = all
    .into_iter()
    .max_by(|a, b| {
        let score_a =
            a.score / (a.hypothesis.len().max(1) as f64).powf(beam.length_penalty);
        let score_b =
            b.score / (b.hypothesis.len().max(1) as f64).powf(beam.length_penalty);
        score_a.partial_cmp(&score_b).unwrap()
    })
    .unwrap();
```
Copilot AI (Jan 28, 2026):
The `unwrap()` calls on `partial_cmp` results can panic if scores contain NaN values. Consider using `unwrap_or(Ordering::Equal)` or adding explicit NaN checks to prevent potential panics when selecting the best hypothesis.
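An alternative to `unwrap_or` is `f64::total_cmp`, which defines a total order over floats (NaN included) and therefore never panics. Below is a sketch with stand-in types, not the PR's definitions.

```rust
struct Scored {
    score: f64,
    hypothesis: Vec<u32>,
}

/// Pick the best hypothesis under length normalization without unwrapping:
/// `total_cmp` orders every f64, placing NaN after +infinity.
fn pick_best(all: Vec<Scored>, length_penalty: f64) -> Option<Scored> {
    let norm =
        |h: &Scored| h.score / (h.hypothesis.len().max(1) as f64).powf(length_penalty);
    all.into_iter().max_by(|a, b| norm(a).total_cmp(&norm(b)))
}
```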
```rust
token_idx.sort_by(|&a, &b| {
    token_logprobs[b].partial_cmp(&token_logprobs[a]).unwrap()
});
token_idx.truncate(beam_token);

let mut dur_idx: Vec<usize> = (0..duration_logprobs.len()).collect();
dur_idx.sort_by(|&a, &b| {
    duration_logprobs[b]
        .partial_cmp(&duration_logprobs[a])
        .unwrap()
});
dur_idx.truncate(beam_duration);
```
Copilot AI (Jan 28, 2026):
The `unwrap()` calls on `partial_cmp` results can panic if log probabilities contain NaN values. Consider using `unwrap_or(Ordering::Equal)` to handle potential NaN cases gracefully during beam search sorting.
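The same fix applies to the index sorts above; here is a small hypothetical helper (not the PR's code) that ranks indices by descending log-probability and keeps the top `k`.

```rust
/// Indices of the `k` largest values, in descending order. `total_cmp` gives
/// a total order over f64, so NaN entries cannot cause a panic.
fn top_k_desc(logprobs: &[f64], k: usize) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..logprobs.len()).collect();
    idx.sort_by(|&a, &b| logprobs[b].total_cmp(&logprobs[a]));
    idx.truncate(k);
    idx
}
```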
```rust
pub values: Option<Tensor>,
pub conv: Option<Tensor>,
pub offset: usize,
step: usize,
```
Copilot AI (Jan 28, 2026):
The `step` field is declared and initialized but never used in the implementation. Consider removing it or implementing the intended functionality if it was meant to serve a purpose.
Is this well tested?
Looking at my notes, I see the differences I was getting between PyTorch and candle were things like: Have you done extensive testing to confirm you do not have such issues across all models on a variety of audio files? The other reason I wasn't sure whether @ivarflakstad would want my code in candle is that I have features for using NeMo files directly instead of requiring safetensors from community repos.
It works; the sample file comes from https://github.com/ggml-org/whisper.cpp/blob/master/samples/jfk.wav
I could only find https://github.com/senstella/parakeet-mlx, so I ported from that.
Fixes #3247.