28 Mar 21:14

KoljaB

3a74a4f

v0.6.0 Latest

RealtimeTTS v0.6.0

Features

New TTS Engines
- Faster Qwen3 TTS: Added the FasterQwenEngine engine. See tests/faster_qwen_emotions.py and tests/faster_qwen_test.py for implementation examples. Demo video: https://www.youtube.com/watch?v=ZOKcUpJlrXQ
- Cartesia: Added engine for Cartesia speech synthesis via WebSocket API. ([#348])
- MiniMax Cloud TTS: Added MiniMaxEngine for MiniMax T2A v2 API with 2 models (speech-2.8-hd, speech-2.8-turbo), 12 voice presets, and runtime parameter control. ([#369])
- ModelsLab TTS: Added ModelsLabEngine supporting 30+ voices in 9 languages, speed and emotion control, and lazy loading. ([#368])
- CAMB AI MARS TTS: Added CambEngine and CambVoice for CAMB AI's MARS models, supporting 140+ languages and streaming output. ([#367])
- NeuTTS: Added NeuTTSEngine for on-device TTS with voice cloning (3s reference audio, CPU/CUDA/MPS support). ([#359])
- PocketTTS: Added PocketTTSEngine for CPU-optimized TTS (Kyutai Labs, 8 voices, voice cloning, low latency). ([#358])

WebSocket Streaming
- Real-time TTS streaming via WebSocket endpoint with multi-user support, bidirectional audio, and enhanced web UI. Includes Python demo client. ([#356])

Improvements

Engine Usability
- Allow OpenAIEngine to accept API key as parameter (fallback to env var). ([#361])
- Add language parameter to ZipvoiceEngine for output and prompt speech. ([#362])
- Add mpv_audio_device option to AudioConfiguration/TextToAudioStream for MPV playback device selection. ([#327])
- Add adjustable volume parameter (0.0–1.0) to TextToAudioStream. ([#335])
- PiperEngine: Streamline synthesis and add samplerate detection from model config for better quality with larger models. ([#346], [#347])
- Conditional logging: Only print "SYNTHESIS FINISHED" if logging is enabled. ([#332])
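
As a rough illustration of what a volume parameter in the 0.0–1.0 range means for raw audio, the following is a minimal sketch that scales 16-bit signed PCM samples. The function name apply_volume is hypothetical, and this is not taken from the RealtimeTTS source; the library may apply volume in a different place or by different means.

```python
import array


def apply_volume(pcm_bytes: bytes, volume: float) -> bytes:
    """Scale 16-bit signed PCM samples (native byte order) by a factor in [0.0, 1.0].

    Hypothetical helper for illustration only.
    """
    if not 0.0 <= volume <= 1.0:
        raise ValueError("volume must be between 0.0 and 1.0")
    samples = array.array("h")  # "h" = signed 16-bit integers
    samples.frombytes(pcm_bytes)
    # A factor of at most 1.0 can never push a sample out of the 16-bit range,
    # so no clipping step is needed.
    scaled = array.array("h", (int(s * volume) for s in samples))
    return scaled.tobytes()
```

A volume of 1.0 leaves the chunk unchanged and 0.0 produces silence; values outside the range are rejected rather than clamped, which is a design choice of this sketch.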

General
- Install portaudio for MacOS. ([#328])

Fixes

- Correct typo in requirements.txt for pypinyin version. ([#355])
- Fix missing comma in __all__ that affected engine exports. ([#367])
- Add audio format detection/conversion for non-Kokoro TTS engines. ([#356])
- Fix voice retrieval errors and improve engine initialization logic. ([#356])

Other

- Updated documentation for new engines, playback options, and WebSocket usage.
- Added/updated test files and demo scripts for new engines.
- No breaking changes; all updates are backward compatible.

PRs:
[#369], [#368], [#367], [#362], [#361], [#359], [#358], [#356], [#355], [#348], [#347], [#346], [#335], [#332], [#328], [#327]

21 Jul 18:10

KoljaB

v0.5.7

b783c45

v0.5.7

RealtimeTTS v0.5.7

New Engine:

✨ Added ZipVoiceEngine - ZipVoice is small, fast and delivers high-quality output with voice cloning.

See the zipvoice_test file for an implementation example.
See the zipvoice docker folder for a real-time streaming FastAPI server implementation for ZipVoice.

Bug Fixes & Improvements:

Fixes #320

27 Jun 08:03

KoljaB

v0.5.6

01f0a69

v0.5.6

RealtimeTTS v0.5.6

Coqui Engine: More robust error handling

03 May 21:38

KoljaB

v0.5.5

ddb1689

v0.5.5

RealtimeTTS v0.5.5

Bug Fixes & Improvements:

Coqui Engine: Enhanced text normalization to improve handling of special characters (leading non-alphanumeric, smart quotes, em-dashes, etc.) and various whitespace types, leading to more reliable synthesis.
Stream Status: Fixed logic in TextToAudioStream for more accurate reporting of the stream's active state.
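
The kind of text normalization described for the Coqui engine can be sketched as follows. This is an assumption-laden illustration, not the engine's actual code: the function name normalize_text, the replacement table, and the order of the steps are all hypothetical.

```python
import re

# Hypothetical replacement table: typographic characters -> plain equivalents.
_REPLACEMENTS = {
    "\u2018": "'",    # left single smart quote
    "\u2019": "'",    # right single smart quote
    "\u201c": '"',    # left double smart quote
    "\u201d": '"',    # right double smart quote
    "\u2014": " - ",  # em-dash becomes a spaced hyphen
}


def normalize_text(text: str) -> str:
    """Normalize text before synthesis (illustrative sketch only)."""
    for src, dst in _REPLACEMENTS.items():
        text = text.replace(src, dst)
    # Collapse every run of whitespace (tabs, newlines, non-breaking spaces)
    # into a single ordinary space.
    text = re.sub(r"\s+", " ", text)
    # Drop leading characters that are neither letters nor digits.
    text = re.sub(r"^[\W_]+", "", text)
    return text.strip()
```

For example, normalize_text("\u2014 It\u2019s\u00a0\t fine\u2014really") returns "It's fine - really".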

Other Changes:

Orpheus Engine: Relaxed strict validation for voice names (validation code commented out).
Dependencies: Upgraded openai (to 1.77.0) and edge-tts (to 7.0.2).

19 Apr 18:47

KoljaB

v0.5.3

ac24b71

v0.5.3

RealtimeTTS v0.5.3 Release Notes

Enhancements & Changes:

Silence Control: Moved silence insertion out of CoquiEngine into TextToAudioStream; introduced configurable comma, sentence, and default silence durations in the play and play_async methods.
KokoroEngine: Uses KokoroVoice now. #303 (thank you)
OrpheusEngine: Now lets you pick the model and has improved stop-on-demand checks.
More Thread-Safe Pipes (?): Added SafePipe for what I hope is more reliable inter-process communication in CoquiEngine (needs to be tested more).
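
To make the silence control idea concrete, here is a minimal sketch of how a pause length could be chosen from the punctuation that ends a text fragment, and how that pause could be rendered as PCM silence. The function names, parameter names, and default values are hypothetical and chosen for illustration; they are not taken from the TextToAudioStream implementation.

```python
def silence_after(fragment: str,
                  comma_silence: float = 0.1,
                  sentence_silence: float = 0.3,
                  default_silence: float = 0.0) -> float:
    """Return the pause in seconds to insert after a fragment (hypothetical helper)."""
    stripped = fragment.rstrip()
    if not stripped:
        return default_silence
    last = stripped[-1]
    if last in ".!?":
        return sentence_silence
    if last == ",":
        return comma_silence
    return default_silence


def silence_bytes(seconds: float, sample_rate: int = 24000,
                  sample_width: int = 2, channels: int = 1) -> bytes:
    """Render a pause as zero-valued PCM frames."""
    frames = int(seconds * sample_rate)
    return b"\x00" * (frames * sample_width * channels)
```

Handling the pause at the stream level, rather than inside one engine, means the same rule can apply to audio from any engine.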

11 Apr 11:08

KoljaB

v0.5.1

7e38250

v0.5.1

RealtimeTTS v0.5.1 Release Notes

Enhancements & Changes:

Audio Trimming & Fading: Trim leading/trailing silence and apply fades to generated audio (configurable in Kokoro & StyleTTS engines).
Stop Event Handling: Implemented a fast, reliable stop of ongoing synthesis, especially for the Coqui engine.
Dynamic Coqui Settings: Change language and stream chunk size for the Coqui engine on the fly using engine.set_language() and engine.set_stream_chunk_size().
Dependency Updates: Upgraded libraries (openai, kokoro, elevenlabs, and others) to newer versions.
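
Trimming and fading can be sketched on a plain list of float samples in the range -1.0 to 1.0. This is only an illustration of the general technique under assumed names and thresholds (trim_silence, apply_fades, a 0.005 amplitude threshold, linear fades); the Kokoro and StyleTTS engines may implement it differently.

```python
def trim_silence(samples, threshold=0.005):
    """Remove leading and trailing samples whose magnitude is at or below threshold."""
    start = 0
    end = len(samples)
    while start < end and abs(samples[start]) <= threshold:
        start += 1
    while end > start and abs(samples[end - 1]) <= threshold:
        end -= 1
    return samples[start:end]


def apply_fades(samples, fade_len):
    """Apply a linear fade-in and fade-out of up to fade_len samples each."""
    out = list(samples)
    # Never let the two fades overlap on very short clips.
    n = min(fade_len, len(out) // 2)
    for i in range(n):
        gain = i / n
        out[i] *= gain        # fade in from the start
        out[-1 - i] *= gain   # fade out toward the end
    return out
```

Silence inside the clip is left untouched; only the edges are trimmed, and the short fades avoid audible clicks at the new boundaries.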

28 Mar 15:19

KoljaB

v0.5.0

e262c9c

v0.5.0

RealtimeTTS v0.5.0 Release Notes

Enhancements & Changes:

Refactored module initialization for on-demand engine loading.
Reduced initial import time with lazy loading.
Clearer error messages with installation hints for missing dependencies.
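
On-demand loading with installation hints can be sketched with importlib. The registry contents, the function name lazy_get, and the package names below are hypothetical placeholders, and the actual RealtimeTTS module initialization may be organized differently.

```python
import importlib

# Hypothetical registry: public name -> (module path, attribute, install hint).
_LAZY = {
    "JSONDecoder": ("json", "JSONDecoder", "part of the standard library"),
    "MissingEngine": ("hypothetical_tts_backend_xyz", "Engine",
                      "pip install hypothetical-tts-backend-xyz"),
}


def lazy_get(name: str):
    """Import the backing module only when the name is first requested."""
    try:
        module_path, attr, hint = _LAZY[name]
    except KeyError:
        raise AttributeError(f"unknown name {name!r}") from None
    try:
        module = importlib.import_module(module_path)
    except ImportError as exc:
        # Re-raise with a hint so the user knows what to install.
        raise ImportError(
            f"{name} requires {module_path!r}; try: {hint}"
        ) from exc
    return getattr(module, attr)
```

In a package's __init__.py, a module-level `__getattr__(name)` function (PEP 562) that delegates to a helper like this defers each heavy import until the corresponding engine is actually used.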

24 Mar 15:21

KoljaB

v0.4.55

67f5a24

v0.4.55

RealtimeTTS v0.4.55 Release Notes

Enhanced OpenAI Engine Initialization:
- Added optional parameters: instructions, debug, speed, response_format, and timeout.
- Updated available voices to include "ash", "coral", and "sage".
- Note: The speed and timeout parameters currently have no effect. The cause is unclear, since both are being submitted to the API.
Text-to-Stream Improvements:
- Introduced an error_flag to track errors during playback.
- Set the synthesis worker thread as daemon to ensure proper thread termination.
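
The two text-to-stream changes above follow a common threading pattern, sketched here under assumptions: the class name Worker and its methods are hypothetical, and only the error_flag attribute name comes from the release note.

```python
import threading


class Worker:
    """Run a task on a daemon thread and record whether it failed (sketch)."""

    def __init__(self, task):
        self.error_flag = False
        self._task = task
        # daemon=True: the thread will not keep the process alive at exit.
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        try:
            self._task()
        except Exception:
            # Record the failure so the caller can check it after joining.
            self.error_flag = True

    def start(self):
        self._thread.start()

    def join(self, timeout=None):
        self._thread.join(timeout)
```

The caller starts the worker, joins it, and then inspects error_flag instead of relying on an exception that would otherwise be lost inside the thread.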
Dependency Update:
- Upgraded the OpenAI package from version 1.66.3 to 1.68.2.

22 Mar 20:10

KoljaB

v0.4.54

9ed1090

v0.4.54

RealtimeTTS v0.4.54 Release Notes

New Engine:

✨ Added OrpheusEngine - Real-time TTS for Orpheus-3B model with:

- multiple voice presets (zac, zoe, tara, etc.)
- support for emotive speech tags (<laugh>, <gasp>, etc.)
- low-latency streaming (<100 ms time to first audio token)
- use of an external server, so speech can be generated on another machine on the network

Installation:

pip install realtimetts[orpheus]

Requires: LM Studio (local) or compatible API server running Orpheus-3B-0.1-ft-Q8_0-GGUF. Load model in LM Studio before use.

Example code:

Here is a code example showcasing how you can use the OrpheusEngine.

19 Mar 22:11

KoljaB

v0.4.52

56b7380

v0.4.52

RealtimeTTS v0.4.52 Release Notes

Bug fix: asterisks (*) in the input text no longer break synthesis with KokoroEngine (#278).

Releases: KoljaB/RealtimeTTS
