Releases: KoljaB/RealtimeTTS

v0.6.0

28 Mar 21:14

RealtimeTTS v0.6.0

Features

  • New TTS Engines

    • Faster Qwen3 TTS: Added FasterQwenEngine. See tests/faster_qwen_emotions.py and tests/faster_qwen_test.py for implementation details. Demo video: https://www.youtube.com/watch?v=ZOKcUpJlrXQ
    • Cartesia: Added engine for Cartesia speech synthesis via WebSocket API. ([#348])
    • MiniMax Cloud TTS: Added MiniMaxEngine for MiniMax T2A v2 API with 2 models (speech-2.8-hd, speech-2.8-turbo), 12 voice presets, and runtime parameter control. ([#369])
    • ModelsLab TTS: Added ModelsLabEngine supporting 30+ voices in 9 languages, speed and emotion control, and lazy loading. ([#368])
    • CAMB AI MARS TTS: Added CambEngine and CambVoice for CAMB AI's MARS models, supporting 140+ languages and streaming output. ([#367])
    • NeuTTS: Added NeuTTSEngine for on-device TTS with voice cloning (3s reference audio, CPU/CUDA/MPS support). ([#359])
    • PocketTTS: Added PocketTTSEngine for CPU-optimized TTS (Kyutai Labs, 8 voices, voice cloning, low latency). ([#358])
  • WebSocket Streaming

    • Real-time TTS streaming via WebSocket endpoint with multi-user support, bidirectional audio, and enhanced web UI. Includes Python demo client. ([#356])

Improvements

  • Engine Usability

    • Allow OpenAIEngine to accept API key as parameter (fallback to env var). ([#361])
    • Add language parameter to ZipvoiceEngine for output and prompt speech. ([#362])
    • Add mpv_audio_device option to AudioConfiguration/TextToAudioStream for MPV playback device selection. ([#327])
    • Add adjustable volume parameter (0.0–1.0) to TextToAudioStream. ([#335])
    • PiperEngine: Streamline synthesis and add samplerate detection from model config for better quality with larger models. ([#346], [#347])
    • Conditional logging: Only print "SYNTHESIS FINISHED" if logging is enabled. ([#332])
  • General

    • Install portaudio for macOS. ([#328])
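
The adjustable volume parameter listed above takes a value between 0.0 and 1.0. As a rough illustration of what such a setting usually means for signed 16-bit PCM audio, the sketch below scales every sample by the volume factor. The function name apply_volume is hypothetical, and this is not necessarily how TextToAudioStream implements it internally.

```python
import array

def apply_volume(pcm_bytes: bytes, volume: float) -> bytes:
    """Scale signed 16-bit PCM samples by a volume factor in [0.0, 1.0]."""
    if not 0.0 <= volume <= 1.0:
        raise ValueError("volume must be between 0.0 and 1.0")
    samples = array.array("h")          # "h" = signed 16-bit integers
    samples.frombytes(pcm_bytes)
    # int() truncates toward zero, so results always stay in the 16-bit range.
    scaled = array.array("h", (int(s * volume) for s in samples))
    return scaled.tobytes()

chunk = array.array("h", [0, 1000, -1000, 32767, -32768]).tobytes()
quiet = array.array("h")
quiet.frombytes(apply_volume(chunk, 0.5))
print(list(quiet))  # [0, 500, -500, 16383, -16384]
```

Because the factor never exceeds 1.0, no clipping step is needed in this sketch; a gain above 1.0 would require clamping to the 16-bit range.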

Fixes

  • Correct typo in requirements.txt for pypinyin version. ([#355])
  • Fix missing comma in __all__ that affected engine exports. ([#367])
  • Add audio format detection/conversion for non-Kokoro TTS engines. ([#356])
  • Fix voice retrieval errors and improve engine initialization logic. ([#356])
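
The missing-comma fix in __all__ deserves a short illustration, because Python concatenates adjacent string literals without any warning. The names below are placeholders, not the actual contents of the package's __all__.

```python
# Adjacent string literals are concatenated, so a missing comma
# merges two export names into one name that does not exist.
broken = [
    "EngineA",
    "EngineB"   # <- missing comma
    "EngineC",
]
fixed = [
    "EngineA",
    "EngineB",
    "EngineC",
]
print(broken)                   # ['EngineA', 'EngineBEngineC']
print(len(broken), len(fixed))  # 2 3
```

With the broken list, a wildcard import would look for a name such as EngineBEngineC and fail, while the two intended names would not be exported at all.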

Other

  • Updated documentation for new engines, playback options, and WebSocket usage.
  • Added/updated test files and demo scripts for new engines.
  • No breaking changes; all updates are backward compatible.

PRs:
[#369], [#368], [#367], [#362], [#361], [#359], [#358], [#356], [#355], [#348], [#347], [#346], [#335], [#332], [#328], [#327]

v0.5.7

21 Jul 18:10

RealtimeTTS v0.5.7

New Engine:

✨ Added ZipVoiceEngine - ZipVoice is small, fast and delivers high-quality output with voice cloning.

Bug Fixes & Improvements:

v0.5.6

27 Jun 08:03

RealtimeTTS v0.5.6

  • Coqui Engine: More robust error handling

v0.5.5

03 May 21:38

RealtimeTTS v0.5.5

Bug Fixes & Improvements:

  • Coqui Engine: Enhanced text normalization to improve handling of special characters (leading non-alphanumeric, smart quotes, em-dashes, etc.) and various whitespace types, leading to more reliable synthesis.
  • Stream Status: Fixed logic in TextToAudioStream for more accurate reporting of the stream's active state.

Other Changes:

  • Orpheus Engine: Relaxed strict validation for voice names (validation code commented out).
  • Dependencies: Upgraded openai (to 1.77.0) and edge-tts (to 7.0.2).
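
The kind of text normalization described for the Coqui engine in this release can be sketched as follows. This is an assumption about the general approach, not the engine's actual code; the function name normalize_text and the replacement table are hypothetical.

```python
import re

# Map typographic characters to plain equivalents (illustrative table only).
_REPLACEMENTS = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2014": ", ",                 # em dash becomes a short pause
    "\u2013": "-",                  # en dash
}

def normalize_text(text: str) -> str:
    for src, dst in _REPLACEMENTS.items():
        text = text.replace(src, dst)
    # For str patterns, \s also matches Unicode whitespace such as
    # the non-breaking space, so all whitespace runs collapse to one space.
    text = re.sub(r"\s+", " ", text).strip()
    # Drop leading characters that are not letters or digits.
    text = re.sub(r"^[\W_]+", "", text)
    return text

print(normalize_text("\u2026 it\u2019s  a\u00a0test\u2014really"))
# it's a test, really
```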

v0.5.3

19 Apr 18:47

RealtimeTTS v0.5.3 Release Notes

Enhancements & Changes:

  • Silence Control: Moved silence insertion out of CoquiEngine into TextToAudioStream; introduced configurable comma, sentence, and default silence durations in the play() and play_async() methods.
  • KokoroEngine: Uses KokoroVoice now. #303 (thank you)
  • OrpheusEngine: Lets you pick the model now and has improved stop-on-demand checks.
  • More Thread-Safe Pipes (?): Added SafePipe for what I hope is more reliable inter-process communication in CoquiEngine (needs more testing).
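
To make the silence control idea concrete, here is a minimal sketch of choosing a pause length from the last character of a synthesized fragment and building the matching block of PCM silence. All function names, parameter names, and default values here are assumptions for illustration; the real parameter names and defaults in TextToAudioStream may differ.

```python
def silence_after(fragment: str,
                  comma_silence: float = 0.1,
                  sentence_silence: float = 0.3,
                  default_silence: float = 0.0) -> float:
    """Return the seconds of silence to append after a fragment."""
    stripped = fragment.rstrip()
    if not stripped:
        return default_silence
    last = stripped[-1]
    if last in ".!?":
        return sentence_silence
    if last == ",":
        return comma_silence
    return default_silence

def silence_bytes(seconds: float, sample_rate: int = 24000,
                  channels: int = 1, sample_width: int = 2) -> bytes:
    """Build PCM silence (all-zero samples) of the requested length."""
    frames = round(seconds * sample_rate)
    return b"\x00" * (frames * channels * sample_width)

for piece in ["Well,", "that works.", "and then"]:
    pause = silence_after(piece)
    print(repr(piece), pause, len(silence_bytes(pause)))
```

Handling this at the stream level rather than inside one engine means the same pause logic can apply to every engine's output.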

v0.5.1

11 Apr 11:08

RealtimeTTS v0.5.1 Release Notes

Enhancements & Changes:

  • Audio Trimming & Fading: Trim leading/trailing silence and apply fades to generated audio (configurable in Kokoro & StyleTTS engines).
  • Stop Event Handling: Implemented a fast, reliable stop of ongoing synthesis, especially for the Coqui engine.
  • Dynamic Coqui Settings: Change language and stream chunk size for the Coqui engine on the fly using engine.set_language() and engine.set_stream_chunk_size().
  • Dependency Updates: Upgraded libraries (e.g., openai, kokoro, elevenlabs) to newer versions.
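
Audio trimming and fading can be illustrated with a small pure-Python sketch. It assumes samples are floats in the range -1.0 to 1.0 and uses a simple amplitude threshold and linear fades; the function names are hypothetical, and the engines themselves may use a different method (for example array-based processing).

```python
def trim_silence(samples, threshold=0.01):
    """Drop leading and trailing samples whose magnitude is below threshold."""
    start = 0
    end = len(samples)
    while start < end and abs(samples[start]) < threshold:
        start += 1
    while end > start and abs(samples[end - 1]) < threshold:
        end -= 1
    return samples[start:end]

def apply_fades(samples, fade_len):
    """Apply a linear fade-in and fade-out of up to fade_len samples each."""
    out = list(samples)
    n = min(fade_len, len(out) // 2)   # never let the two fades overlap
    for i in range(n):
        gain = i / n
        out[i] *= gain                 # fade in from the start
        out[-1 - i] *= gain            # fade out toward the end
    return out

audio = [0.0, 0.0, 0.5, 0.5, 0.5, 0.5, 0.0]
trimmed = trim_silence(audio)
print(trimmed)                  # [0.5, 0.5, 0.5, 0.5]
print(apply_fades(trimmed, 2))  # [0.0, 0.25, 0.25, 0.0]
```

Fading after trimming avoids the click that an abrupt cut at a non-zero sample would otherwise produce.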

v0.5.0

28 Mar 15:19

RealtimeTTS v0.5.0 Release Notes

Enhancements & Changes:

  • Refactored module initialization for on-demand engine loading.
  • Reduced initial import time with lazy loading.
  • Clearer error messages with installation hints for missing dependencies.

v0.4.55

24 Mar 15:21

RealtimeTTS v0.4.55 Release Notes

  • Enhanced OpenAI Engine Initialization:

    • Added optional parameters: instructions, debug, speed, response_format, and timeout.
    • Updated available voices to include "ash", "coral", and "sage".
    • Note: The speed and timeout parameters currently have no effect. The cause is unclear, since both are being submitted to the API.
  • Text-to-Stream Improvements:

    • Introduced an error_flag to track errors during playback.
    • Set the synthesis worker thread as a daemon thread to ensure proper thread termination.
  • Dependency Update:

    • Upgraded the OpenAI package from version 1.66.3 to 1.68.2.
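
The two text-to-stream changes above, an error flag and a daemon worker thread, follow a common pattern that can be sketched as below. The Worker class and its method names are hypothetical and only illustrate the idea; they are not the TextToAudioStream implementation.

```python
import threading

class Worker:
    def __init__(self):
        self.error_flag = False  # set when the worker hits an exception
        self._thread = None

    def _run(self, job):
        try:
            job()
        except Exception:
            # Record the failure so the caller can check it after the run.
            self.error_flag = True

    def start(self, job):
        # daemon=True: the thread will not keep the interpreter alive at exit.
        self._thread = threading.Thread(target=self._run, args=(job,), daemon=True)
        self._thread.start()

    def join(self, timeout=None):
        self._thread.join(timeout)

def failing_job():
    raise RuntimeError("synthesis failed")

w = Worker()
w.start(failing_job)
w.join(timeout=5)
print(w.error_flag)  # True
```

A daemon thread is abandoned when the main program exits, so a stuck synthesis call cannot prevent the process from terminating.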

v0.4.54

22 Mar 20:10

RealtimeTTS v0.4.54 Release Notes

New Engine:

✨ Added OrpheusEngine - Real-time TTS for Orpheus-3B model with:

  • multiple voice presets (zac, zoe, tara, etc.)
  • emotive speech tags support (<laugh>, <gasp>, etc.)
  • low-latency streaming (<100ms time to first audio token)
  • uses an external server, so TTS can be generated on another machine on the network

Installation:

pip install realtimetts[orpheus]

Requires: LM Studio (local) or compatible API server running Orpheus-3B-0.1-ft-Q8_0-GGUF. Load model in LM Studio before use.

Example code:

Here is a code example showcasing how you can use the OrpheusEngine.

v0.4.52

19 Mar 22:11

RealtimeTTS v0.4.52 Release Notes

  • Bugfix: asterisks (*) in the input text no longer break synthesis with KokoroEngine (#278)