Add IndicF5 Support for Indian Languages #339

@hariOneb

Summary

Would love to see IndicF5 added as a TTS backend in Voicebox! It would bring support for 11 Indian languages: Tamil, Hindi, Bengali, Telugu, Malayalam, Kannada, Gujarati, Marathi, Punjabi, Odia, and Assamese. Together these languages serve a country of over 1.4 billion people, and as far as I can tell none of them is supported by any existing Voicebox backend.

What is IndicF5?

IndicF5 is a near-human quality, open-source Text-to-Speech model by AI4Bharat, trained on 1417 hours of high-quality Indian language speech data.

Why it fits Voicebox perfectly

  • ✅ Pure Python library — integrates cleanly with the existing FastAPI backend
  • ✅ Follows the same reference-audio voice cloning approach as existing backends
  • ✅ Works with PyTorch + CUDA — no new dependencies outside what Voicebox already uses
  • ✅ Installable with a single pip command
  • ✅ Voicebox's modular backends/ architecture in v0.3.0 makes this straightforward to add

How it works

IndicF5 takes 3 inputs, the same ones other Voicebox backends use:

  1. Text to synthesize
  2. A reference audio clip (for voice cloning)
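  3. Transcript of the reference audio

A minimal call through Hugging Face transformers:

from transformers import AutoModel

# trust_remote_code=True executes the model code shipped in the ai4bharat/IndicF5 repo
model = AutoModel.from_pretrained("ai4bharat/IndicF5", trust_remote_code=True)
text = "Text to synthesize"  # placeholder; use text in one of the supported languages
audio = model(text, ref_audio_path="sample.wav", ref_text="transcript")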
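To make the integration idea more concrete, here is a rough sketch of how the three inputs above could be wrapped for a backend. I have not checked Voicebox's actual backend interface, so the class name `IndicF5Backend`, the `generate` method, and the injectable `loader` argument are all hypothetical. The only real API calls are `AutoModel.from_pretrained` and the model call already shown above.

```python
from typing import Any, Callable, Optional


def load_indicf5() -> Any:
    """Load IndicF5 via transformers (downloads weights on first use)."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModel

    return AutoModel.from_pretrained("ai4bharat/IndicF5", trust_remote_code=True)


class IndicF5Backend:
    """Hypothetical wrapper; Voicebox's real backend interface may differ."""

    def __init__(self, loader: Callable[[], Any] = load_indicf5) -> None:
        self._loader = loader
        self._model: Optional[Any] = None  # loaded on first generate() call

    def generate(self, text: str, ref_audio_path: str, ref_text: str) -> Any:
        """Synthesize `text` in the voice of the reference clip."""
        if not text.strip():
            raise ValueError("text must not be empty")
        if not ref_text.strip():
            raise ValueError("ref_text (transcript of the reference clip) is required")
        if self._model is None:
            self._model = self._loader()
        # Returned unchanged: this sketch makes no assumption about the audio format.
        return self._model(text, ref_audio_path=ref_audio_path, ref_text=ref_text)
```

Loading lazily keeps app startup fast when the backend is not selected, and passing a `loader` makes the wrapper testable with a stub instead of the real weights.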

Impact

As far as I know, there is currently no good local, open-source voice cloning tool for Indian languages. Adding IndicF5 could make Voicebox the go-to tool for Indian language TTS and open the app to a very large new user base.

Happy to help!

I'm willing to contribute a PR if that helps move this forward. Would love to know if this is something you'd consider including!
