Desktop text-to-speech app using Qwen3-TTS for 100% local voice design, cloning, and generation.
Built with Electrobun.
Platform: macOS (Apple Silicon)
- Text-to-Speech - Generate audio from text, with support for multiple languages
- Voice Cloning - Clone any voice from a short audio sample
- Voice Design - Create new voices from text descriptions (e.g. "deep male voice, British accent")
- Built-in Instruct Voices - Predefined speakers with instruction control ("speak warmly", "sound excited")
- Batch Generation - Generate multiple audio files from a script
- Tiny App - Core app is only ~16 MB thanks to Electrobun
- Auto-updating - Built-in update mechanism for new releases
All models are downloaded automatically on first use. Select a model size from the dropdown and the app handles the rest; no manual setup is required.
| Slot | Purpose | When to load |
|---|---|---|
| Base Text to Speech | Audio generation and voice cloning | Always - required for all generation |
| Create Voice Design/Clone | Create voices from text descriptions | Only when designing a new voice; can be unloaded afterward |
| Built-in Instruct Voices | Predefined speakers + instruction control | Only when using built-in speakers |
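
The loading rules in the table can be read as a small lookup from task to required slots. The sketch below is illustrative only: the `Slot` and `Task` names and both helper functions are hypothetical and are not taken from the app's source. It assumes the Base slot is needed for every task, as the table states, and that the other two slots are only needed for their own task.

```typescript
// Hypothetical identifiers for the three model slots in the table.
type Slot = "base" | "voiceDesign" | "instructVoices";

// Hypothetical task names covering the features listed in this README.
type Task = "generate" | "cloneVoice" | "designVoice" | "useBuiltInSpeaker";

// Return the slots a task needs, following the "When to load" column.
function slotsFor(task: Task): Slot[] {
  switch (task) {
    case "designVoice":
      // Base is always required; the design slot is only needed here.
      return ["base", "voiceDesign"];
    case "useBuiltInSpeaker":
      return ["base", "instructVoices"];
    default:
      // Plain generation and voice cloning run on the Base slot alone.
      return ["base"];
  }
}

// Given the currently loaded slots, list the ones the next task does not need
// and which could therefore be unloaded to free memory.
function slotsToUnload(loaded: Slot[], task: Task): Slot[] {
  const needed = new Set<Slot>(slotsFor(task));
  return loaded.filter((slot) => !needed.has(slot));
}
```

For example, after designing a voice, `slotsToUnload(["base", "voiceDesign"], "generate")` reports that only the design slot can be released, which matches the "can be unloaded afterward" note in the table.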
