Language Selection is Not Available for Whisper Model #40

Code

```python
import json
from speechbox import ASRDiarizationPipeline
from pyannote.audio.pipelines.utils.hook import ProgressHook

pipe = ASRDiarizationPipeline.from_pretrained(asr_model="openai/whisper-base", diarizer_model="pyannote/speaker-diarization-3.1")

with ProgressHook() as hook:
    output = pipe("audio.mp3", hook=hook)

# Write the result with a context manager so the file handle is closed.
with open("output.json", "w") as f:
    json.dump(output, f)
```

Output

```
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Due to a bug fix in https://github.com/huggingface/transformers/pull/28687 transcription using a multilingual Whisper will default to language detection followed by transcription instead of translation to English.This might be a breaking change for your use case. If you want to instead always translate your audio to English, make sure to pass `language='en'`.
```

Question

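Where in `ASRDiarizationPipeline` do I specify `generate_kwargs = {"language": "Hindi"}` so that Whisper transcribes in Hindi instead of auto-detecting the language?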
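
One possible approach, sketched under assumptions: some versions of speechbox appear to route keyword arguments given to `ASRDiarizationPipeline.__call__` by prefix, passing `asr_*` arguments to the transformers ASR pipeline and `diarization_*` arguments to pyannote. If the installed version does this (worth checking in its `__call__` source), the language could be passed as `asr_generate_kwargs`, and the progress hook as `diarization_hook`. The helper names `build_call_kwargs` and `transcribe_with_language` are hypothetical and only for illustration. The language is written in lowercase (`"hindi"`) because that form is the one most widely accepted across transformers versions.

```python
import json


def build_call_kwargs(language, hook=None):
    """Build keyword arguments for ASRDiarizationPipeline.__call__.

    ASSUMPTION: the installed speechbox version forwards `asr_*` kwargs to
    the transformers ASR pipeline and `diarization_*` kwargs to pyannote.
    """
    kwargs = {
        # `generate_kwargs` is an argument of the transformers ASR pipeline;
        # "task": "transcribe" asks Whisper not to translate to English.
        "asr_generate_kwargs": {"language": language, "task": "transcribe"},
    }
    if hook is not None:
        # pyannote pipelines accept a `hook` argument for progress reporting.
        kwargs["diarization_hook"] = hook
    return kwargs


def transcribe_with_language(audio_path, out_path, language):
    """Run diarized transcription in a fixed language and save it as JSON."""
    # Heavy imports live here so the helper above works without them.
    from speechbox import ASRDiarizationPipeline
    from pyannote.audio.pipelines.utils.hook import ProgressHook

    pipe = ASRDiarizationPipeline.from_pretrained(
        asr_model="openai/whisper-base",
        diarizer_model="pyannote/speaker-diarization-3.1",
    )

    with ProgressHook() as hook:
        output = pipe(audio_path, **build_call_kwargs(language, hook=hook))

    # ensure_ascii=False keeps Devanagari text readable in the output file.
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(output, f, ensure_ascii=False)
    return output
```

With these assumptions, `transcribe_with_language("audio.mp3", "output.json", "hindi")` would replace the call in the snippet above. If the installed version does not route prefixed arguments, the keyword will not reach Whisper, so confirm that the language-detection warning no longer appears.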
