PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean, and your own languages.
Updated Aug 25, 2021 - Python
This project explores zero-shot emotional speech synthesis using EMOD, a novel approach that combines emotion and content embeddings for multilingual and cross-lingual emotion transfer. Built on a VITS-based TTS model, it preserves speaker identity while enhancing expressiveness, enabling efficient emotion transfer across languages and genders.
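The fusion of emotion and content embeddings described above can be sketched roughly as follows. This is a minimal, hypothetical PyTorch illustration of conditioning content (phoneme-level) features on an utterance-level emotion embedding; the module name, dimensions, and projection layer are assumptions, not the actual EMOD implementation.

```python
import torch
import torch.nn as nn

class EmotionConditioner(nn.Module):
    """Hypothetical sketch: broadcast an utterance-level emotion embedding
    over the content (phoneme) sequence and project back to the content
    dimension, so a downstream TTS decoder sees emotion-conditioned features."""

    def __init__(self, content_dim: int = 256, emotion_dim: int = 64):
        super().__init__()
        self.proj = nn.Linear(content_dim + emotion_dim, content_dim)

    def forward(self, content: torch.Tensor, emotion: torch.Tensor) -> torch.Tensor:
        # content: (batch, time, content_dim); emotion: (batch, emotion_dim)
        emotion = emotion.unsqueeze(1).expand(-1, content.size(1), -1)
        return self.proj(torch.cat([content, emotion], dim=-1))

cond = EmotionConditioner()
out = cond(torch.randn(2, 50, 256), torch.randn(2, 64))
print(tuple(out.shape))  # (2, 50, 256)
```

Concatenation followed by a linear projection is only one possible fusion strategy; additive conditioning or FiLM-style modulation are common alternatives in expressive TTS.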
Artistic research deconstructing the performative excess of motivational western subcultures
An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
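A defining component of FastSpeech 2's non-autoregressive design is the length regulator, which expands phoneme-level features to mel-frame length using predicted durations. A minimal sketch (the function name and tensor shapes here are illustrative, not taken from any specific repository):

```python
import torch

def length_regulate(phoneme_feats: torch.Tensor, durations: torch.Tensor) -> torch.Tensor:
    """Sketch of FastSpeech 2's length regulator: repeat each phoneme's
    feature vector by its predicted integer duration so the output
    sequence length matches the number of mel-spectrogram frames."""
    # phoneme_feats: (time, dim); durations: (time,) integer frame counts
    return torch.repeat_interleave(phoneme_feats, durations, dim=0)

feats = torch.randn(3, 4)            # 3 phonemes, 4-dim features
dur = torch.tensor([2, 1, 3])        # predicted frames per phoneme
out = length_regulate(feats, dur)
print(tuple(out.shape))  # (6, 4) -- 2 + 1 + 3 frames
```

Because durations are predicted in parallel rather than derived from attention at inference time, the whole mel sequence can be generated in one pass, which is the source of FastSpeech 2's speed advantage over autoregressive models.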