homingos/flamai-tts-wrapper

TTS & Voice Cloning Python Wrapper

A class-based Python wrapper that simplifies working with the MiniMax Text-to-Speech (TTS) and Quick Voice Cloning APIs. It handles API authentication, file uploads, voice cloning, and TTS synthesis, and it decodes the hexadecimal audio data returned by the API before saving it, which prevents static and noise in the output.

Features

  • Object-Oriented Wrapper: Encapsulates all API logic within a clean MiniMaxTTS class.
  • Full Voice Cloning Workflow: Handles the multi-step process of uploading an audio file and creating a new voice clone with a single method call.
  • Direct TTS Synthesis: Easily generate speech using either a newly created voice or any existing voice_id.
  • Robust Error Handling: Includes checks for file-not-found, network issues, and API-specific errors.
  • Clear Separation of Concerns: The reusable wrapper (minimax_tts_wrapper.py) is separate from the interactive usage demo (example.py).
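
The hexadecimal decoding step mentioned above can be sketched as follows. This is a minimal illustration, not the wrapper's actual code: it assumes the API returns the audio as a hex string, and the helper names decode_hex_audio and save_audio are hypothetical.

```python
def decode_hex_audio(hex_audio: str) -> bytes:
    """Convert a hex-encoded audio payload into raw bytes."""
    # bytes.fromhex turns each pair of hex digits into one byte.
    return bytes.fromhex(hex_audio.strip())


def save_audio(hex_audio: str, output_filename: str) -> int:
    """Decode the payload and write it in binary mode; returns bytes written."""
    audio_bytes = decode_hex_audio(hex_audio)
    # Writing the hex text itself (or opening the file in text mode) would
    # produce a file that players render as static or refuse to open.
    with open(output_filename, "wb") as f:
        f.write(audio_bytes)
    return len(audio_bytes)


# "494433" is the hex encoding of the ASCII bytes "ID3",
# the tag that commonly opens an MP3 file.
print(decode_hex_audio("494433"))  # b'ID3'
```

The key point is that the payload must be converted to bytes and written with "wb"; saving the hex string as-is yields a file twice the size that contains no valid audio.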

Setup and Configuration

1. Prerequisites

  • Python 3.8 or newer
  • An active MiniMax API Key and Group ID

2. Installation

Clone the repository and install the required dependencies:

git clone <repository-url>
cd flamai-tts-wrapper
pip install -r requirements.txt

3. Configuration

Open example.py and replace the placeholder credentials with your actual MiniMax API Key and Group ID.

# example.py

# ...
# IMPORTANT: Fill in your credentials here.
MY_API_KEY = "YOUR_API_KEY_HERE"
MY_GROUP_ID = "YOUR_GROUP_ID_HERE"
# ...
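
If you would rather not keep credentials in a source file, one possible alternative is to read them from environment variables. The sketch below is an assumption, not part of the repository: the variable names MINIMAX_API_KEY and MINIMAX_GROUP_ID and the helper load_credentials are hypothetical.

```python
import os


def load_credentials(env=None):
    """Return (api_key, group_id) read from environment variables.

    The variable names are illustrative only; the wrapper does not define them.
    """
    env = os.environ if env is None else env
    api_key = env.get("MINIMAX_API_KEY", "").strip()
    group_id = env.get("MINIMAX_GROUP_ID", "").strip()

    # Collect the names of any variables that are unset or empty.
    missing = [
        name
        for name, value in (
            ("MINIMAX_API_KEY", api_key),
            ("MINIMAX_GROUP_ID", group_id),
        )
        if not value
    ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return api_key, group_id
```

The returned values could then be assigned to MY_API_KEY and MY_GROUP_ID in example.py in place of the placeholder strings.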

How to Use

There are two primary ways to use this tool: as a library in your own projects or via the interactive demo script.

1. Integration into Your Project (Recommended)

Import the MiniMaxTTS class from the wrapper file and use its methods directly.

from minimax_tts_wrapper import MiniMaxTTS

# --- Configuration ---
API_KEY = "YOUR_API_KEY_HERE"
GROUP_ID = "YOUR_GROUP_ID_HERE"
TEXT_TO_SPEAK = "This is a test of the MiniMax TTS wrapper integration."

# 1. Initialize the client
tts_client = MiniMaxTTS(api_key=API_KEY, group_id=GROUP_ID)

# --- Scenario A: Generate speech with an existing voice ---
print("--- Testing with an existing voice ---")
existing_voice = "CustomVoice1757415581"
output_file_A = "existing_voice_test.mp3"
tts_client.generate_speech(TEXT_TO_SPEAK, existing_voice, output_file_A)


# --- Scenario B: Clone a new voice and then generate speech ---
print("\n--- Testing the full cloning workflow ---")
audio_file_for_cloning = "path/to/your/voice.mp3"
new_voice_id = "MyNewClonedVoice01"
output_file_B = "new_clone_test.mp3"
tts_client.clone_and_generate_speech(
    text=TEXT_TO_SPEAK,
    audio_clone_path=audio_file_for_cloning,
    new_voice_id=new_voice_id,
    output_filename=output_file_B
)

2. Running the Interactive Demo

For quick tests, run the example.py script directly from your terminal. Before you start, place an input.txt file containing the text to speak, along with any audio files you plan to clone from, in the same directory as the script.

python example.py

The script will guide you through the process of selecting a text file, choosing a voice (existing or new clone), and generating the final audio.

Available Sample Voice IDs

You can use these pre-existing voice IDs for testing:

  • CustomVoice1757415581
  • SafeTest1757346484
  • SafeTest1757346345
  • SafeTest1757345815
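
To compare the sample voices side by side, you could generate one file per voice ID. The sketch below assumes only the generate_speech(text, voice_id, output_filename) signature documented in this README; the helper generate_for_voices and the output naming scheme are hypothetical.

```python
import os

SAMPLE_VOICE_IDS = [
    "CustomVoice1757415581",
    "SafeTest1757346484",
    "SafeTest1757346345",
    "SafeTest1757345815",
]


def generate_for_voices(client, text, voice_ids, output_dir="."):
    """Call client.generate_speech once per voice ID.

    Returns a dict mapping each voice ID to the output path that was requested.
    """
    os.makedirs(output_dir, exist_ok=True)
    outputs = {}
    for voice_id in voice_ids:
        # One file per voice, named after the voice ID.
        output_path = os.path.join(output_dir, f"{voice_id}.mp3")
        client.generate_speech(text, voice_id, output_path)
        outputs[voice_id] = output_path
    return outputs


# Usage, with tts_client created as shown under "How to Use":
# generate_for_voices(tts_client, "Hello from every voice.", SAMPLE_VOICE_IDS)
```

The helper does not inspect what generate_speech returns, since this README does not document a return value.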

Wrapper Methods

  • tts_client.generate_speech(text, voice_id, output_filename): Generates audio using a known voice_id.
  • tts_client.clone_and_generate_speech(text, audio_clone_path, new_voice_id, output_filename): A high-level method that handles the entire clone-and-speak workflow.
