A user-friendly Streamlit web interface for generating high-quality speech in various Indian languages using the Indic Parler TTS model by AI4Bharat.
- Multilingual Support: Generate speech in over 15 Indian languages including Hindi, English, Bengali, Tamil, Telugu, Marathi, and more.
- Natural Language Control: Control speaker style, pitch, speed, and gender using descriptive text prompts (e.g., "A female speaker with a high pitch speaking fast").
- Interactive GUI: Easy-to-use web interface built with Streamlit.
- Guidance & Presets: Built-in guidance for available speakers per language and one-click style presets.
- Optimized Performance: Leverages torch.compile and CUDA (if available) for faster inference.
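As a rough illustration of the performance feature, the sketch below picks a device and wraps a model's forward pass with torch.compile. It is an assumption about how such a setup might look, not the app's actual code; `pick_device` and `prepare_model` are hypothetical names.

```python
def pick_device(cuda_available: bool) -> str:
    """Return the device string to run inference on."""
    return "cuda:0" if cuda_available else "cpu"


def prepare_model(model):
    """Move a model to the best available device and try to compile it."""
    # Imported lazily so pick_device() works even without PyTorch installed.
    import torch

    device = pick_device(torch.cuda.is_available())
    model = model.to(device)
    try:
        # torch.compile exists in PyTorch 2.0+; keep eager mode if it is
        # missing or unsupported on this platform.
        model.forward = torch.compile(model.forward)
    except (AttributeError, RuntimeError):
        pass
    return model, device
```

Compilation happens on the first call to the compiled function, so the first generation is typically slower than later ones.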
- Python 3.8+
- CUDA-enabled GPU (Recommended for faster generation)
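To check these prerequisites before installing anything else, a small script along these lines can help. It is a sketch only; `check_environment` is a hypothetical helper and is not shipped with the project.

```python
import sys


def check_environment(min_python=(3, 8)):
    """Report whether the Python version and optional GPU stack look usable."""
    report = {
        "python_ok": sys.version_info >= min_python,
        "torch_installed": False,
        "cuda_available": False,
    }
    try:
        import torch  # optional at this stage; may not be installed yet
    except (ImportError, OSError):
        return report
    report["torch_installed"] = True
    report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {ok}")
```

A `cuda_available` value of `False` does not block usage; generation simply runs on the CPU and takes longer.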
- Clone the repository:
git clone https://github.com/yourusername/IndicTTS.git
cd IndicTTS
- Install dependencies:
On Linux/macOS, a setup script simplifies installation:
chmod +x setup.sh
./setup.sh
On Windows, install the dependencies manually using PowerShell or Command Prompt:
- Install PyTorch:
Visit pytorch.org to get the command for your specific CUDA version. For example (CUDA 12.4):
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
- Install Common Libraries:
pip install streamlit transformers soundfile numpy scipy
- Install Parler TTS:
(Requires Git to be installed and available in your PATH)
pip install git+https://github.com/huggingface/parler-tts.git
Note: The Linux/macOS setup script attempts to install a PyTorch nightly build with CUDA 13.0 support. Windows users should stick to the stable releases unless they specifically need nightly features.
Run the Streamlit application:
streamlit run app.py
The interface will automatically open in your default browser at http://localhost:8501.
- Select a Language: Check the sidebar for recommended speakers for your desired language.
- Enter Text: Type the text you want to convert to speech.
- Describe Voice: Use the "Description" box to define the speaker's voice (e.g., "Amit - Slow, Deep voice") or select a Preset from the dropdown.
- Generate: Click "Generate Audio" and listen to or download the result.
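Because the voice description is free text, it can also be assembled from a few attributes. The helper below is a minimal sketch; `build_description` and its parameters are hypothetical, and the exact phrasing the model responds to best may differ.

```python
def build_description(pitch, pace, gender=None, speaker=None):
    """Compose a voice description string from simple attributes."""
    if speaker:
        # A named speaker, e.g. one recommended in the sidebar.
        return f"{speaker} speaks {pace} with a {pitch} pitch"
    who = f"A {gender} speaker" if gender else "A speaker"
    return f"{who} with a {pitch} pitch speaking {pace}"


print(build_description("high", "fast", gender="female"))
print(build_description("low", "slowly", speaker="Amit"))
```

The resulting string can be pasted into the "Description" box as-is.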
Assamese, Bengali, Bodo, Chhattisgarhi, Dogri, English, Gujarati, Hindi, Kannada, Malayalam, Manipuri, Marathi, Nepali, Odia, Punjabi, Sanskrit, Tamil, Telugu.
This project uses the Indic Parler TTS model developed by AI4Bharat.
- Model: AI4Bharat
- Library: Parler TTS
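For readers who want to call the model outside the web interface, the sketch below follows the usual Parler TTS inference pattern. The checkpoint name `ai4bharat/indic-parler-tts`, the `synthesize` helper, and the default output path are assumptions for illustration; this is not necessarily how app.py is implemented.

```python
MODEL_ID = "ai4bharat/indic-parler-tts"  # assumed Hugging Face checkpoint name


def synthesize(text, description, out_path="output.wav", model_id=MODEL_ID):
    """Generate speech for `text` in the voice described by `description`."""
    # Heavy dependencies are imported lazily so importing this module stays cheap.
    import soundfile as sf
    import torch
    from parler_tts import ParlerTTSForConditionalGeneration
    from transformers import AutoTokenizer

    device = "cuda:0" if torch.cuda.is_available() else "cpu"
    model = ParlerTTSForConditionalGeneration.from_pretrained(model_id).to(device)

    # The text prompt and the voice description use separate tokenizers.
    prompt_tokenizer = AutoTokenizer.from_pretrained(model_id)
    description_tokenizer = AutoTokenizer.from_pretrained(
        model.config.text_encoder._name_or_path
    )

    desc = description_tokenizer(description, return_tensors="pt").to(device)
    prompt = prompt_tokenizer(text, return_tensors="pt").to(device)

    with torch.no_grad():
        audio = model.generate(
            input_ids=desc.input_ids,
            attention_mask=desc.attention_mask,
            prompt_input_ids=prompt.input_ids,
            prompt_attention_mask=prompt.attention_mask,
        )

    # Write the waveform at the model's native sampling rate.
    sf.write(out_path, audio.cpu().numpy().squeeze(), model.config.sampling_rate)
    return out_path
```

Calling `synthesize("नमस्ते", "A female speaker with a high pitch speaking fast")` downloads the checkpoint on first use, so the first call needs network access and can take several minutes.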