This project provides easy-to-use Google Colab notebooks for running cutting-edge Text-to-Speech (TTS) models, all powered by free GPUs from Google Colab.
Whether you're experimenting, researching, or just playing around with voice synthesis, these notebooks make it simple to try out top TTS models without worrying about setup or hardware.
- Added at 2025-04-25
- GitHub Link
- Capabilities: Text-to-speech, Predefined Voices
- Note: Not an open-source model
- Added at 2025-05-19
- GitHub Link (Original Coqui TTS is no longer maintained as Coqui shut down in 2023.)
- Model Link
- Capabilities: Text-to-speech, Predefined Voices, Multi-lingual, Voice Cloning from Audio
- Languages supported: English (en), Spanish (es), French (fr), German (de), Italian (it), Portuguese (pt), Polish (pl), Turkish (tr), Russian (ru), Dutch (nl), Czech (cs), Arabic (ar), Chinese (zh-cn), Japanese (ja), Hungarian (hu), Korean (ko), Hindi (hi)
- Reason for recommendation: High-quality generation with multi-lingual support and voice cloning from short audio clips.
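As a sketch of how the voice cloning described above is typically invoked through the Coqui TTS Python API (an assumption: the maintained fork is installed via `pip install coqui-tts`; the model name and method signatures follow Coqui's documented interface, so verify against the repo before use):

```python
# Hedged sketch: XTTS v2 voice cloning via the Coqui TTS API.
# The model identifier and tts_to_file() signature are taken from
# Coqui's docs; the model is downloaded on first use (several GB).

SUPPORTED_LANGUAGES = {
    "en", "es", "fr", "de", "it", "pt", "pl", "tr", "ru",
    "nl", "cs", "ar", "zh-cn", "ja", "hu", "ko", "hi",
}

def synthesize(text: str, speaker_wav: str, language: str = "en",
               out_path: str = "output.wav") -> str:
    """Generate speech in `language`, cloning the voice in `speaker_wav`."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"XTTS v2 does not support language {language!r}")
    from TTS.api import TTS  # heavy import: triggers model download on first run
    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
    tts.tts_to_file(text=text, speaker_wav=speaker_wav,
                    language=language, file_path=out_path)
    return out_path
```

A short reference clip (a few seconds of clean speech) is usually enough for `speaker_wav`.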
- Added at 2025-05-19
- GitHub Link: myshell-ai/OpenVoice (used for voice conversion from a reference voice), coqui-tts (used as the base TTS model)
- Model Link
- Capabilities: Text-to-speech, Multi-lingual, Voice Cloning from Audio
- Languages supported: English (en), Spanish (es), French (fr), Chinese (zh-cn), Japanese (ja), Korean (ko)
- Added at 2025-05-19
- GitHub Link
- Model Link
- Capabilities: Text-to-speech, Multi-lingual, Predefined Voices, Guided generation
- Languages supported: English, French, Spanish, Portuguese, Polish, German, Italian and Dutch
- Added at 2025-05-19
- GitHub Link
- Model Link
- Capabilities: Text-to-speech, Multi-lingual, Predefined Voices
- Languages supported: American English (a), British English (b), Spanish (es), French (fr-fr), Hindi (hi), Italian (it), Japanese (ja), Brazilian Portuguese (pt-br), Mandarin Chinese (zh)
- Reason for recommendation: Very high-quality generation with multi-lingual support.
- Added at 2025-05-19
- GitHub Link
- Model Link
- Capabilities: Text-to-speech, Conversational, Non-verbal sounds, Voice Cloning from Audio
- Reason for recommendation: High-quality conversational generation with support for non-verbal sounds.
- Added at 2025-05-20
- GitHub Link
- Model Link
- Capabilities: Text-to-speech, Predefined Voices, Multi-lingual, Voice Cloning
- Languages supported: English (en), Spanish (es), French (fr), German (de), Italian (it), Portuguese (pt), Polish (pl), Turkish (tr), Russian (ru), Dutch (nl), Czech (cs), Arabic (ar), Chinese (zh-cn), Japanese (ja), Hungarian (hu), Korean (ko), Hindi (hi)
- Added at 2025-06-06
- GitHub Link
- Model Link
- Capabilities: Text-to-speech, Emotion Exaggeration Control, Voice Cloning, Watermarked Outputs
- Added at 2025-08-07
- GitHub Link
- Model Link
- Capabilities: Text-to-speech, Multi-language support (20+ languages), Multiple voices, Customizable voices (training support)
- Added at 2025-08-07
- GitHub Link
- Model Link (Nano Preview)
- Capabilities: Text-to-speech, Multiple Expressive Voices, CPU-compatible, Ultra-small (25MB, 15M params)
- Reason for recommendation: Extremely lightweight and fast TTS model suitable for edge devices and real-time applications. Open-source and easy to run locally.
- Added at 2025-08-26
- GitHub Link
- Model Link
- Capabilities: Context-Aware Expression, Multi-lingual conversation, Podcast with Background Music, Long Conversational Speech
- Languages supported: English, Chinese
- Reason for recommendation: Generates long-form (up to 90 minutes), multi-speaker (up to 4 speakers), expressive conversational audio.
- Added at 2025-09-17
- GitHub Link
- Model Link
- Capabilities: Emotion-Controlled Speech, Duration-Specific Generation, Zero-Shot Timbre Cloning, Multi-Modal Emotion Guidance, High-Stability Emotional Speech
- Reason for recommendation: Great voice-cloning capability with emotional steering.
- Added at 2025-10-21
- GitHub Link
- Model Link
- Capabilities: Real-Time On-Device Speech, Ultra-Realistic Human-Like Voices, Instant Voice Cloning (3s sample), Embedded-Optimized GGUF Format, Secure & Watermarked Output
- Reason for recommendation: Super-realistic, on-device TTS language model with instant voice cloning.
- Added at 2025-12-02
- GitHub Link
- Model Link
- Capabilities: Text-to-speech, Human-Like Expressive Speech, Zero-Shot Voice Cloning, Guided Emotion & Intonation Tags, Low-Latency Streaming
- Languages supported: English
- Voices: Multiple preset speaker options (tara, leah, jess, leo, dan, mia, zac, zoe)
- Reason for recommendation: Small LLM-based model delivering highly expressive, human-like voice generation with zero-shot voice cloning.
- Added at 2025-12-05
- GitHub Link
- Model Link
- Capabilities: Text-to-speech, Predefined Voices, Extreme-Speed Inference, Lightweight Deployment, Natural Text Handling, Fully Local Processing
- Reason for recommendation: Ultra-lightweight (66M parameters), lightning-fast even on CPU with decent quality, privacy-safe on-device processing.
- Added at 2025-12-15
- GitHub Link
- Model Link
- Capabilities: Text-to-speech, Zero-Shot Voice Cloning, RL-Enhanced Emotion Control, Streaming Real-Time Synthesis, Phoneme-Level Control
- Languages supported: English, Chinese, Mixed Language (En/Zh)
- Reason for recommendation: LLM-powered TTS with zero-shot voice cloning, RL-tuned emotion control, and streaming capabilities for interactive applications.
- Added at 2025-12-30
- GitHub Link
- Model Link
- Capabilities: Text-to-speech, Ultra-Fast Real-Time TTS, 32 kHz High-Fidelity Audio, Streaming Inference, Lightweight Deployment, Open-Source
- Languages supported: English
- Reason for recommendation: Ultra-lightweight (80M parameters), extremely fast (~2000× RTF) with streaming synthesis and sub-frame latency for real-time applications.
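For readers unfamiliar with the RTF figure quoted above: real-time factor is read here as seconds of audio produced per second of compute, so higher is faster (note that some papers define RTF inversely, as compute time divided by audio time). A minimal illustration:

```python
# Real-time factor (RTF), interpreted as audio seconds generated per
# second of compute. An RTF of 2000 means 10 s of speech is synthesized
# in 5 ms of wall-clock time.

def real_time_factor(audio_seconds: float, synthesis_seconds: float) -> float:
    """Return how many seconds of audio are generated per second of compute."""
    return audio_seconds / synthesis_seconds

rtf = real_time_factor(10.0, 0.005)
print(rtf)  # 2000.0
```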
- Added at 2025-12-30
- GitHub Link
- Model Link
- Capabilities: CPU-Based Speech Generation, Voice Cloning, Instant Audio Streaming, Low Latency (~200ms), Faster Than Real-Time (~6x), Python API and CLI, Handles Long Text Inputs
- Languages supported: English
- Reason for recommendation: Ultra-lightweight (100M parameters), CPU-optimized with ultra-low latency (~200ms) for real-time applications on resource-constrained devices.
- Added at 2025-12-30
- Official Blog
- Model Link (0.6B)
- Model Link (1.7B)
- Capabilities: Multilingual TTS, Ultra-Low-Latency Streaming (~97ms), Instruction-Based Voice Control, Rapid 3s Voice Cloning, High-Fidelity Speech Reconstruction
- Languages supported: Chinese, English, Japanese, Korean, German, French, Russian, Portuguese, Spanish, Italian
- Reason for recommendation: End-to-end discrete LM architecture with extreme low-latency generation, instruction-aware speech synthesis, and strong robustness to noisy or complex text inputs.
Curious how different TTS models stack up before picking which one to run?
Check out these Hugging Face Spaces with live performance leaderboards:
- TTS Arena V2 by TTS-AGI
- TTS Arena by TTS-AGI (Replaced by TTS Arena V2)
- TTS Spaces Arena by Pendrokar
Have a favorite TTS model you'd like to see added to this project?
Open an issue or start a discussion to request it!
If you're interested in running Large Language Models (LLMs) on consumer-level local machines, check out this related project:
- Local-LLM-Comparison-Colab-UI:
A collection of Google Colab notebooks for comparing and running various LLMs easily, designed for use on local hardware.
Perfect for exploring and benchmarking LLMs without needing powerful cloud resources!
Contributions to this project are welcome and appreciated! Here's how you can contribute:
- Create a Google Colab notebook for a TTS model following the format of existing notebooks
- Test your notebook thoroughly to ensure it works properly with Google Colab's free GPU
- Fork this repository and add your notebook to the project
- Update the README.md to include information about the model following the existing format:
- Add a section with the model name
- Include the Colab badge linking to your notebook
- Add GitHub and model links
- List capabilities and supported languages (if multi-lingual)
- Open a Pull Request with your changes
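For the badge step above, README entries in projects like this commonly embed the standard Colab badge as follows (the `<user>/<repo>` and notebook path below are placeholders you replace with your own):

```markdown
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/<user>/<repo>/blob/main/your_notebook.ipynb)
```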
By contributing, you help make advanced TTS technology more accessible to everyone!
This project is for educational and research purposes. Always verify licenses and model usage terms when using TTS models in production.