feat: add VoxCPM TTS engine and update router by phonk2682 · Pull Request #13 · minhsaco99/VoiceCore

phonk2682 · 2026-01-23T19:24:00Z

PR Description

This PR implements the VoxCPM TTS (text-to-speech) engine, which provides high-quality, tokenizer-free speech synthesis with voice cloning.

Key changes include:

New Engine Implementation: Added VoxCPMEngine in app/engines/tts/voxcpm/, implementing the BaseTTSEngine interface. It supports both batch synthesis and low-latency streaming.
Router Upgrade: Updated app/api/routers/tts.py from a placeholder stub to a fully functional router that supports:
- POST /synthesize: Returns base64 encoded WAV audio.
- POST /synthesize/stream: Server-Sent Events (SSE) streaming of audio chunks.
- Request validation and error handling.
Configuration: Registered voxcpm in engines.yaml and added necessary dependencies (voxcpm, torchcodec for cloning) to pyproject.toml.
Testing:
- Added unit tests for VoxCPMEngine covering lifecycle, synthesis, streaming, and error edge cases (100% coverage of app/engines/tts/voxcpm/engine.py).
- Updated test_tts_router.py to test the actual router logic instead of expecting 501 errors, bringing router coverage to 100%.
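
For readers consuming the two endpoints above, a minimal client-side sketch may help. It only illustrates the general pattern (base64-encoded WAV in a JSON body, and JSON payloads on SSE `data:` lines). The field names `audio_data` and `chunk`, the helper names, and the payload shapes are assumptions made for illustration and are not confirmed by this PR.

```python
import base64
import io
import json
import wave


def make_wav_bytes(pcm: bytes, sample_rate: int = 16000) -> bytes:
    """Wrap raw 16-bit mono PCM in a WAV container (stand-in for engine output)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wf.setframerate(sample_rate)
        wf.writeframes(pcm)
    return buf.getvalue()


def decode_batch_response(body: str) -> bytes:
    """Decode the base64 WAV from a JSON body ('audio_data' is an assumed field name)."""
    return base64.b64decode(json.loads(body)["audio_data"])


def parse_sse_events(stream_text: str) -> list:
    """Parse SSE text into a list of JSON payloads, one per 'data:' line."""
    events = []
    for block in stream_text.split("\n\n"):  # a blank line terminates each event
        for line in block.splitlines():
            if line.startswith("data:"):
                events.append(json.loads(line[len("data:"):].strip()))
    return events


# Simulated batch response: 160 silent samples, base64-encoded into JSON.
wav = make_wav_bytes(b"\x00\x00" * 160)
batch_body = json.dumps({"audio_data": base64.b64encode(wav).decode("ascii")})
decoded = decode_batch_response(batch_body)

# Simulated SSE stream: each event carries one base64-encoded audio chunk.
raw_chunks = [b"\x01\x02", b"\x03\x04"]
sse_text = ""
for raw in raw_chunks:
    payload = json.dumps({"chunk": base64.b64encode(raw).decode("ascii")})
    sse_text += "data: " + payload + "\n\n"
chunks = [base64.b64decode(event["chunk"]) for event in parse_sse_events(sse_text)]
```

Base64 inflates the payload by roughly a third, which is the usual trade-off for carrying binary audio inside JSON or SSE text frames.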

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
New Engine (STT/TTS provider)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Refactor (non-breaking code cleanup or optimization)
Documentation update
Performance improvement

Checklist

I have read the CONTRIBUTING guide
My code follows the project's code style (make format)
Linting passes (make lint)
Tests pass (make test)
Documentation updated (if needed)
No sensitive information (API keys, secrets) included

Related Issues

Closes #

Testing & Verification

Automated Tests

Unit tests added/updated
All existing tests pass

Manual Verification (if applicable)

Verified that unit tests cover all critical paths including:

Standard text synthesis.
Streaming synthesis (chunk generation).
Mocked voice cloning (prompt file handling).
Error handling for initialization and runtime failures.

API Endpoints Tested (if applicable)

Batch endpoint (POST /api/v1/stt/transcribe or /tts/synthesize)
SSE streaming (POST .../stream)
WebSocket (WS .../ws)

Engine-Specific Tests (if applicable)

Engine type: TTS
Provider: VoxCPM (OpenBMB)
Model: openbmb/VoxCPM-0.5B

Security Impact

No security implications
Security impact (please describe below)

- Add VoxCPM engine implementation (app/engines/tts/voxcpm/)
- Update TTS router to support synthesis and streaming (app/api/routers/tts.py)
- Register VoxCPM in engines.yaml
- Add new dev dependencies in pyproject.toml
- Add unit tests for VoxCPM engine (100% coverage)
- Update TTS router unit tests

phonk2682 requested a review from minhsaco99 January 23, 2026 19:27

minhsaco99 requested changes Jan 24, 2026

app/api/routers/tts.py (outdated, resolved)

app/api/routers/tts.py (outdated, resolved)

app/engines/tts/voxcpm/engine.py (outdated, resolved)

pyproject.toml (resolved)

refactor(api): simplify TTS response and add explicit serialization

667490d

Removed redundant TTSResponseModel and used direct TTSResponse return. Added @field_serializer to TTSResponse for explicit bytes-to-base64 conversion.
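
As a rough illustration of the serialization approach this commit describes, the sketch below uses Pydantic v2's `field_serializer` to emit base64 text for a `bytes` field, then serializes with `model_dump_json()`. The field names (`audio_data`, `sample_rate`) and the method name are hypothetical; the actual `TTSResponse` in this repository may be defined differently.

```python
import base64
import json

from pydantic import BaseModel, field_serializer


class TTSResponse(BaseModel):
    # Hypothetical fields for illustration; not taken from the PR's code.
    audio_data: bytes
    sample_rate: int

    @field_serializer("audio_data")
    def encode_audio(self, value: bytes) -> str:
        # Emit base64 text so arbitrary binary audio survives JSON encoding.
        return base64.b64encode(value).decode("ascii")


# Bytes that are not valid UTF-8, as real audio data typically is not.
raw_audio = b"RIFF\x00\xff"
response = TTSResponse(audio_data=raw_audio, sample_rate=16000)

# model_dump_json() applies the field serializer before producing JSON text.
payload = json.loads(response.model_dump_json())
```

An explicit serializer matters here because Pydantic's default JSON handling of `bytes` decodes them as UTF-8, which fails on binary audio.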

phonk2682 requested a review from minhsaco99 January 24, 2026 11:13

Refactor: Move numpy_to_wav_bytes to AudioProcessor

7701661

minhsaco99 requested changes Jan 24, 2026

app/api/routers/tts.py (outdated, resolved)

refactor(tts): Return engine result directly

358b392

minhsaco99 requested changes Jan 24, 2026

app/api/routers/tts.py (resolved)

app/api/routers/tts.py (resolved)

Refactor: Use model_dump_json for TTS serialization

a118448

phonk2682 requested a review from minhsaco99 January 24, 2026 14:28

minhsaco99 approved these changes Jan 24, 2026

minhsaco99 merged commit 895e897 into main Jan 24, 2026
4 checks passed

minhsaco99 deleted the feature/add_voxcpm_engine branch January 24, 2026 15:14

minhsaco99 mentioned this pull request Jan 24, 2026

update voxcpm docs #14

Merged

20 tasks

minhsaco99 merged 5 commits into main from feature/add_voxcpm_engine