kokorox - fast Kokoro TTS in Rust

A Rust implementation of the Kokoro text-to-speech model. The model is small (87M parameters) yet produces high-quality speech with fast inference.

Features

Multi-language: English, Chinese, Japanese, Spanish, French, and more via espeak-ng
Voice style mixing (e.g., af_sky.4+af_nicole.5)
OpenAI-compatible API server
Streaming and pipe modes for LLM integration
Automatic language detection

Quick Start

# Install (macOS)
brew install byteowlz/tap/koko

# Or download from GitHub Releases
# https://github.com/byteowlz/kokorox/releases

# Generate speech
koko text "Hello, this is a test"

# Output: tmp/output.wav

Installation

Pre-built Binaries

Download from GitHub Releases for Linux, macOS, and Windows.

From Source

Requires ONNX Runtime and espeak-ng, plus Python (pip) and a Rust toolchain (cargo) for the build steps. Install espeak-ng:

# macOS
brew install espeak-ng

# Ubuntu/Debian
sudo apt-get install espeak-ng libespeak-ng-dev

Build:

git clone https://github.com/byteowlz/kokorox.git
cd kokorox
pip install -r scripts/requirements.txt
python scripts/download_voices.py --all
cargo build --release

ONNX Runtime (Linux with NVIDIA GPU)

# Download onnxruntime-linux-x64-gpu-1.22.0.tgz from the ONNX Runtime releases page first
tar -xzf onnxruntime-linux-x64-gpu-1.22.0.tgz
sudo cp -a onnxruntime-linux-x64-gpu-1.22.0/include /usr/local/
sudo cp -a onnxruntime-linux-x64-gpu-1.22.0/lib /usr/local/
sudo ldconfig
export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH

Usage

Basic

koko text "Hello, world!" -o greeting.wav
koko file poem.txt                          # One WAV file per line of input

Multi-language

koko text "Hola, mundo!" --lan es
koko text "你好，世界!" --lan zh
koko -a text "Bonjour!"                     # Auto-detect language

Voice Styles

koko voices                                 # List available voices
koko voices --language en --gender female   # Filter voices
koko text "Hello" --style af_sky
koko text "Hello" --style af_sky.4+af_nicole.5  # Mix styles
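The mix syntax above reads as a `+`-separated list of `name.weight` terms. The following Python sketch shows one way to interpret such a string. It assumes the digits after the dot are a decimal fraction (so `.4` means 0.4), which this README does not state, and `parse_style_mix` is a hypothetical helper for illustration, not part of koko.

```python
def parse_style_mix(spec: str) -> list[tuple[str, float]]:
    """Split a style spec such as 'af_sky.4+af_nicole.5' into (voice, weight) pairs.

    Assumption: the digits after the last dot are a decimal fraction
    ('.4' -> 0.4). A bare voice name is given weight 1.0.
    """
    pairs = []
    for term in spec.split("+"):
        term = term.strip()
        # rpartition splits on the LAST dot, so voice names stay intact.
        name, dot, digits = term.rpartition(".")
        if dot and digits.isdigit():
            pairs.append((name, float("0." + digits)))
        else:
            pairs.append((term, 1.0))
    return pairs
```

For example, `parse_style_mix("af_sky.4+af_nicole.5")` yields `af_sky` at 0.4 and `af_nicole` at 0.5. Whether koko normalizes weights that do not sum to 1 is not covered here.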

Pipe Mode (LLM Integration)

ollama run llama3 "Tell me a story" | koko pipe
ollama run llama3 "Explain physics" | koko pipe --silent -o output.wav
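Pipe mode can also be driven from a script instead of a shell pipeline. The sketch below feeds text chunks to `koko pipe` on stdin using only the flags shown above. It assumes `koko` is on the PATH; `pipe_command` and `speak` are hypothetical helper names, and how koko segments the incoming text is not specified here.

```python
import subprocess


def pipe_command(output=None, silent=False):
    """Build the argv for `koko pipe` using only the flags shown above."""
    cmd = ["koko", "pipe"]
    if silent:
        cmd.append("--silent")
    if output:
        cmd += ["-o", output]
    return cmd


def speak(chunks, output=None, silent=False):
    """Write text chunks to koko's stdin as they arrive, then wait for exit."""
    proc = subprocess.Popen(
        pipe_command(output, silent), stdin=subprocess.PIPE, text=True
    )
    for chunk in chunks:
        proc.stdin.write(chunk)
        proc.stdin.flush()  # pass each chunk on immediately
    proc.stdin.close()      # signals end of input, like Ctrl+D
    return proc.wait()
```

Any iterable of strings works as `chunks`, for instance the tokens yielded by an LLM client as they stream in.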

OpenAI-Compatible Server

koko openai --ip 0.0.0.0 --port 3000

curl -X POST http://localhost:3000/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{"model": "kokoro", "input": "Hello!", "voice": "af_sky"}' \
  -o hello.wav

curl http://localhost:3000/v1/audio/voices           # List voice IDs
curl http://localhost:3000/v1/audio/voices/detailed  # Voice metadata
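The same speech request can be sent from Python with only the standard library. This sketch mirrors the curl call above (same path, JSON body, and header). It assumes the server is running on localhost:3000 and returns the audio bytes in the response body; `build_speech_request` and `synthesize` are hypothetical names for illustration.

```python
import json
import urllib.request


def build_speech_request(text, voice="af_sky", base_url="http://localhost:3000"):
    """Build the same POST request as the curl example above."""
    payload = {"model": "kokoro", "input": text, "voice": voice}
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path="hello.wav", **kwargs):
    """Send the request and write the response body to out_path."""
    with urllib.request.urlopen(build_speech_request(text, **kwargs)) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

With the server started via `koko openai`, calling `synthesize("Hello!")` should write `hello.wav` in the current directory. Keeping request construction separate from sending makes the payload easy to inspect without a running server.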

Streaming

koko stream > output.wav
# Type text, press Enter. Ctrl+D to exit.

Docker

docker build -t kokorox .
docker run -v "$(pwd)/tmp:/app/tmp" kokorox text "Hello from docker!" -o tmp/hello.wav
docker run -p 3000:3000 kokorox openai --ip 0.0.0.0 --port 3000

Debugging

koko text "Text here" --verbose              # Detailed processing logs
koko text "Accénted" --debug-accents         # Character-by-character analysis

Additional Voices

The default installation includes standard voices. More voices (54 total across 8 languages) can be converted from Hugging Face:

python scripts/convert_pt_voices.py --all
koko -d data/voices-custom.bin text "Hello" --style en_sarah

License

GPL-3.0, because the espeak-rs-sys crate statically links espeak-ng.

