From b2a56527131f662333d6998b4f24f74d556f33d7 Mon Sep 17 00:00:00 2001
From: martin <martin@techainer.com>
Date: Sat, 24 Jan 2026 22:30:22 +0700
Subject: [PATCH] update voxcpm docs

---
 README.md   |  5 +++--
 docs/api.md | 33 ++++++++++++++++++++++++++-------
 2 files changed, 29 insertions(+), 9 deletions(-)
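
Note (not part of the commit): the new VoxCPM section passes `engine_params` as a JSON string in the query string. Below is a minimal sketch of how a JavaScript client might build such a URL so that the braces, quotes, and commas in the JSON are percent-encoded. `buildSynthesizeUrl` is a hypothetical helper, not something in this repository; the host, path, and parameter values are the ones used in the examples in this patch.

```javascript
// Hypothetical helper (an assumption for illustration, not project code):
// builds a batch-synthesis URL for the voxcpm engine.
function buildSynthesizeUrl(baseUrl, text, engineParams) {
  const params = new URLSearchParams({ engine: 'voxcpm', text });
  if (engineParams) {
    // URLSearchParams percent-encodes the serialized JSON for us,
    // so no raw braces, quotes, or commas end up in the URL.
    params.set('engine_params', JSON.stringify(engineParams));
  }
  return `${baseUrl}/api/v1/tts/synthesize?${params}`;
}

const url = buildSynthesizeUrl('http://localhost:8000', 'Hello world', {
  prompt_wav_path: '/path/to/reference.wav',
  prompt_text: 'This is the reference transcript',
});
console.log(url);
```

The resulting string can be passed to `fetch(url, { method: 'POST' })` or pasted into a curl command as-is.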

diff --git a/README.md b/README.md
index 70af96e..2f91a14 100644
--- a/README.md
+++ b/README.md
@@ -146,13 +146,13 @@ curl -N -X POST "http://localhost:8000/api/v1/stt/transcribe/stream?engine=whisp
 **Batch Synthesis**
 
 ```bash
-curl -X POST "http://localhost:8000/api/v1/tts/synthesize?engine=coqui&text=Hello%20world&voice=en-US-1&speed=1.0"
+curl -X POST "http://localhost:8000/api/v1/tts/synthesize?engine=voxcpm&text=Hello%20world"
 ```
 
 **Streaming Synthesis**
 
 ```bash
-curl -N -X POST "http://localhost:8000/api/v1/tts/synthesize/stream?engine=coqui&text=Hello%20world"
+curl -N -X POST "http://localhost:8000/api/v1/tts/synthesize/stream?engine=voxcpm&text=Hello%20world"
 ```
 
 ---
@@ -182,6 +182,7 @@ Detailed documentation is available in the `docs/` directory:
 
 | Engine | Backend | Status | Features |
 | :--- | :--- | :---: | :--- |
+| **VoxCPM** | `voxcpm` | ✅ Ready | Zero-shot voice cloning, streaming, 24 kHz |
 | **Coqui TTS** | `TTS` | 🚧 Planned | High-quality open source voices |
 | **OpenAI TTS** | OpenAI API | 🚧 Planned | Natural sounding commercial voices |
 
diff --git a/docs/api.md b/docs/api.md
index 31cd72e..28fd72e 100644
--- a/docs/api.md
+++ b/docs/api.md
@@ -248,7 +248,7 @@ Batch synthesis - convert text to speech audio.
 
 | Parameter | Type | Required | Description |
 |-----------|------|----------|-------------|
-| `engine` | string | Yes | Engine name (e.g., "coqui") |
+| `engine` | string | Yes | Engine name (e.g., "voxcpm") |
 | `text` | string | Yes | Text to synthesize |
 | `voice` | string | No | Voice name/ID to use |
 | `speed` | float | No | Speech speed multiplier (0 < speed <= 3.0, default: 1.0) |
@@ -257,7 +257,7 @@ Batch synthesis - convert text to speech audio.
 **Example:**
 
 ```bash
-curl -X POST "http://localhost:8000/api/v1/tts/synthesize?engine=coqui&text=Hello%20world&voice=en-US-1&speed=1.0"
+curl -X POST "http://localhost:8000/api/v1/tts/synthesize?engine=voxcpm&text=Hello%20world"
 ```
 
 **Response:**
@@ -307,17 +307,15 @@ data: {"audio_data": "<base64-full-audio>", "sample_rate": 22050, "duration_seco
 **Example:**
 
 ```bash
-curl -N -X POST "http://localhost:8000/api/v1/tts/synthesize/stream?engine=coqui&text=Hello%20world"
+curl -N -X POST "http://localhost:8000/api/v1/tts/synthesize/stream?engine=voxcpm&text=Hello%20world"
 ```
 
 **JavaScript Client Example:**
 
 ```javascript
 const params = new URLSearchParams({
-  engine: 'coqui',
-  text: 'Hello, how are you today?',
-  voice: 'en-US-1',
-  speed: '1.0'
+  engine: 'voxcpm',
+  text: 'Hello, how are you today?'
 });
 
 const eventSource = new EventSource(
@@ -495,6 +493,27 @@ curl -X POST "http://localhost:8000/api/v1/stt/transcribe?engine=whisper&engine_
   -F "file=@audio.wav"
 ```
 
+### VoxCPM (Text-to-Speech)
+
+Pass these via the `engine_params` query parameter as a URL-encoded JSON string.
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `prompt_wav_path` | string | null | Path to reference audio for zero-shot voice cloning |
+| `prompt_text` | string | null | Transcript of the reference audio (required with `prompt_wav_path`) |
+
+**Basic Example:**
+
+```bash
+curl -X POST "http://localhost:8000/api/v1/tts/synthesize?engine=voxcpm&text=Hello%20world"
+```
+
+**Voice Cloning Example:**
+
+```bash
+curl -X POST "http://localhost:8000/api/v1/tts/synthesize?engine=voxcpm&text=Hello%20world&engine_params=%7B%22prompt_wav_path%22%3A%22%2Fpath%2Fto%2Freference.wav%22%2C%22prompt_text%22%3A%22This%20is%20the%20reference%20transcript%22%7D"
+```
+
 ---
 
 ## Rate Limiting