LiveKit Whisper STT Plugin

A speech-to-text (STT) plugin for LiveKit agents that runs OpenAI Whisper models locally through faster-whisper, for accurate and low-latency speech recognition.

Features

Faster-Whisper Implementation: Optimized inference using faster-whisper for improved performance
High Accuracy: State-of-the-art speech recognition using OpenAI Whisper models
Local Processing: On-device inference with no external API dependencies
Multi-Language Support: 90+ languages, with either a fixed language code or automatic language detection
Warmup Support: Optional model warmup on a sample clip, so the first real transcription is not slowed by one-time setup
LiveKit Integration: Seamless integration with LiveKit agents framework

Requirements

LiveKit Agents v1.2 or higher
NVIDIA GPU (recommended for optimal performance)
Python 3.8+
faster-whisper, soundfile, and numpy (installed in the Installation section)

Performance

Model	Hardware	Latency	Use Case
Large-v3-Turbo	RTX 4090	<180ms	Real-time applications
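
Latency depends on the GPU, the model, and the length of the audio clip, so it is worth measuring on your own hardware. The following is a minimal sketch of a timing helper. `measure_latency_ms` and its `fn` argument are hypothetical names used for illustration and are not part of the plugin; `fn` would wrap whatever transcription call you want to time.

```python
import statistics
import time
from typing import Callable, List


def measure_latency_ms(fn: Callable[[], object], runs: int = 10, discard: int = 1) -> float:
    """Return the median wall-clock latency of fn() in milliseconds.

    The first `discard` calls are executed but not counted, since they
    usually include one-time setup cost (model load, CUDA initialization).
    """
    samples: List[float] = []
    for i in range(runs + discard):
        start = time.perf_counter()
        fn()
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        if i >= discard:
            samples.append(elapsed_ms)
    return statistics.median(samples)
```

The median is used rather than the mean so that a single slow run does not dominate the result.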

Installation

Clone or download this plugin into the root directory of your LiveKit agents project.

Install required dependencies:

pip install faster-whisper soundfile numpy

Make sure you have enough disk space for the model files. Models are downloaded on first use and cached locally.
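
To confirm the dependencies are visible to the interpreter your agent runs under, a small check like the one below can help. It is a sketch using only the standard library; `missing_modules` and `REQUIRED_MODULES` are hypothetical names for illustration, not part of the plugin.

```python
import importlib.util
from typing import Iterable, List

# Import names for the packages installed above
# (the pip package "faster-whisper" is imported as "faster_whisper").
REQUIRED_MODULES = ("faster_whisper", "soundfile", "numpy")


def missing_modules(names: Iterable[str] = REQUIRED_MODULES) -> List[str]:
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

`find_spec` only locates the modules without importing them, so the check stays fast and does not load any model code.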

Usage

Initialize your agent session with the WhisperSTT plugin:

from livekit.agents import AgentSession
from whisper_plugin import WhisperSTT

session = AgentSession(
    # ... other configuration
    stt=WhisperSTT(
        model="deepdml/faster-whisper-large-v3-turbo-ct2",
        language="en",
        device="cuda",
        compute_type="float16",
    )
)

Language Support

The plugin supports 90+ languages. Common language codes:

# English
stt = WhisperSTT(language="en")

# Spanish
stt = WhisperSTT(language="es")

# French
stt = WhisperSTT(language="fr")

# German
stt = WhisperSTT(language="de")

# Japanese
stt = WhisperSTT(language="ja")

# Auto-detect language
stt = WhisperSTT(language=None)  # Will auto-detect
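
If the language comes from a config file or environment variable, it can be convenient to normalize it before passing it to `WhisperSTT`. The sketch below is one way to do that. `resolve_language` is a hypothetical helper, and treating the string "auto" as a request for auto-detection is a convention assumed here, not something the plugin defines.

```python
import re
from typing import Optional

# Whisper language codes are short lowercase codes, mostly two letters
# (for example "en"), with a few three-letter ones.
_CODE_PATTERN = re.compile(r"^[a-z]{2,3}$")


def resolve_language(value: Optional[str]) -> Optional[str]:
    """Map a user-supplied setting to a language code, or None for auto-detect."""
    if value is None:
        return None
    cleaned = value.strip().lower()
    if cleaned in ("", "auto"):
        return None  # None asks for auto-detection
    if not _CODE_PATTERN.match(cleaned):
        raise ValueError(f"Unrecognized language code: {value!r}")
    return cleaned
```

This only checks the shape of the code. It does not verify that the selected model actually supports the language. Usage would look like `WhisperSTT(language=resolve_language(setting))`.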

Device Configuration

# GPU acceleration (recommended)
stt = WhisperSTT(device="cuda", compute_type="float16")

# CPU processing
stt = WhisperSTT(device="cpu", compute_type="float32")

# Auto-select best device
stt = WhisperSTT(device="auto")
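
If you prefer to make the choice explicit rather than rely on `device="auto"`, the sketch below pairs each device with the compute type shown above. `cuda_available` and `choose_device_config` are hypothetical helpers, not part of the plugin. The CUDA probe assumes that CTranslate2 (the runtime behind faster-whisper) exposes `get_cuda_device_count()`; any failure there falls back to CPU.

```python
from typing import Dict


def cuda_available() -> bool:
    """Best-effort check for a usable CUDA device; False if the probe fails."""
    try:
        import ctranslate2  # installed as a dependency of faster-whisper

        return ctranslate2.get_cuda_device_count() > 0
    except Exception:
        # Missing package or driver problems: treat as "no GPU".
        return False


def choose_device_config(has_cuda: bool) -> Dict[str, str]:
    """Return device and compute_type keyword arguments for WhisperSTT."""
    if has_cuda:
        return {"device": "cuda", "compute_type": "float16"}
    return {"device": "cpu", "compute_type": "float32"}
```

The result can be unpacked directly, for example `WhisperSTT(**choose_device_config(cuda_available()))`.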

Model Warmup

# Enable warmup for consistent performance
stt = WhisperSTT(
    warmup_audio="./sample_audio.wav",  # 5-10 second audio clip
    device="cuda"
)
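
If you do not have a sample clip at hand, a placeholder WAV file can be generated with the standard library. The sketch below writes a 5-second, 16 kHz, mono, 16-bit sine tone. `write_warmup_wav` is a hypothetical helper, and the assumption that a synthetic tone is good enough for warmup is untested here; a short recording of real speech is likely more representative.

```python
import math
import struct
import wave


def write_warmup_wav(path: str, seconds: float = 5.0,
                     sample_rate: int = 16000, freq_hz: float = 220.0) -> int:
    """Write a mono 16-bit PCM sine tone to `path` and return the frame count."""
    n_frames = int(seconds * sample_rate)
    amplitude = 0.3 * 32767  # stay well below full scale to avoid clipping
    frames = bytearray()
    for i in range(n_frames):
        sample = int(amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        frames += struct.pack("<h", sample)  # little-endian signed 16-bit
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return n_frames
```

After calling `write_warmup_wav("./sample_audio.wav")`, the file can be passed as `warmup_audio` in the example above.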
