feat: add VoxCPM TTS engine and update router #13

Merged
minhsaco99 merged 5 commits into main from feature/add_voxcpm_engine on Jan 24, 2026

@phonk2682
Collaborator

PR Description

This PR implements the VoxCPM TTS (Text-to-Speech) engine, which provides high-quality, tokenizer-free speech synthesis with voice-cloning capabilities.

Key changes include:

  • New Engine Implementation: Added VoxCPMEngine in app/engines/tts/voxcpm/, implementing the BaseTTSEngine interface. It supports both batch synthesis and low-latency streaming.
  • Router Upgrade: Updated app/api/routers/tts.py from a placeholder stub to a fully functional router that supports:
    • POST /synthesize: Returns base64 encoded WAV audio.
    • POST /synthesize/stream: Server-Sent Events (SSE) streaming of audio chunks.
    • Request validation and error handling.
  • Configuration: Registered voxcpm in engines.yaml and added necessary dependencies (voxcpm, torchcodec for cloning) to pyproject.toml.
  • Testing:
    • Added comprehensive unit tests for VoxCPMEngine covering lifecycle, synthesis, streaming, and error edge cases (100% coverage of app/engines/tts/voxcpm/engine.py).
    • Updated test_tts_router.py to test the actual router logic instead of expecting 501 errors, bringing test_tts_router.py coverage to 100%.
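
For reviewers unfamiliar with the two response shapes described above, here is a minimal standard-library sketch of a base64-encoded WAV body and an SSE chunk stream. It is not the code in app/api/routers/tts.py: the function names, the JSON field names (audio_base64, sample_rate, index), the event names (chunk, done), and the 16 kHz mono 16-bit format are all assumptions made for illustration.

```python
import base64
import io
import json
import wave
from typing import Iterable, Iterator


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 16000) -> bytes:
    """Wrap raw 16-bit mono PCM in a WAV container (format is an assumption)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)       # mono
        wav.setsampwidth(2)       # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


def batch_body(pcm: bytes, sample_rate: int = 16000) -> dict:
    """Hypothetical JSON body for the batch endpoint: WAV bytes as base64 text."""
    wav_bytes = pcm_to_wav_bytes(pcm, sample_rate)
    return {
        "audio_base64": base64.b64encode(wav_bytes).decode("ascii"),
        "sample_rate": sample_rate,
    }


def sse_events(chunks: Iterable[bytes]) -> Iterator[str]:
    """Frame each audio chunk as one SSE event; a blank line ends each event."""
    for index, chunk in enumerate(chunks):
        data = json.dumps(
            {"index": index, "audio_base64": base64.b64encode(chunk).decode("ascii")}
        )
        yield f"event: chunk\ndata: {data}\n\n"
    # Hypothetical terminal event so clients know the stream is complete.
    yield "event: done\ndata: {}\n\n"


body = batch_body(b"\x00\x00" * 160)                 # 160 silent samples
events = list(sse_events([b"\x01\x02", b"\x03\x04"]))
```

A client would base64-decode audio_base64 to recover the WAV file in the batch case, and decode each chunk event in arrival order in the streaming case.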

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • New Engine (STT/TTS provider)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Refactor (non-breaking code cleanup or optimization)
  • Documentation update
  • Performance improvement

Checklist

  • I have read the CONTRIBUTING guide
  • My code follows the project's code style (make format)
  • Linting passes (make lint)
  • Tests pass (make test)
  • Documentation updated (if needed)
  • No sensitive information (API keys, secrets) included

Related Issues

Closes #

Testing & Verification

Automated Tests

  • Unit tests added/updated
  • All existing tests pass

Manual Verification (if applicable)

Verified that unit tests cover all critical paths including:

  • Standard text synthesis.
  • Streaming synthesis (chunk generation).
  • Mocked voice cloning (prompt file handling).
  • Error handling for initialization and runtime failures.
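
As a rough illustration of how streaming and failure paths like those listed above can be exercised without loading a real model, the sketch below drives a stand-in engine with a mocked model. SketchEngine, stream_chunks, and collect are hypothetical names invented for this example; they are not the PR's VoxCPMEngine or the voxcpm package API.

```python
import asyncio
from unittest.mock import MagicMock


class SketchEngine:
    """Hypothetical stand-in for a streaming TTS engine (not the PR's class)."""

    def __init__(self, model):
        self._model = model

    async def synthesize_stream(self, text: str):
        if not text.strip():
            raise ValueError("text must not be empty")
        # stream_chunks is an assumed method name on the mocked model.
        for chunk in self._model.stream_chunks(text):
            yield chunk


async def collect(engine: SketchEngine, text: str) -> list:
    """Drain the async generator into a list for easy assertions."""
    return [chunk async for chunk in engine.synthesize_stream(text)]


# Happy path: the mocked model yields two chunks, and both reach the caller.
model = MagicMock()
model.stream_chunks.return_value = iter([b"chunk-1", b"chunk-2"])
chunks = asyncio.run(collect(SketchEngine(model), "hello"))

# Runtime failure path: an error raised by the model propagates to the caller.
failing_model = MagicMock()
failing_model.stream_chunks.side_effect = RuntimeError("model failure")
try:
    asyncio.run(collect(SketchEngine(failing_model), "hello"))
    runtime_error_seen = False
except RuntimeError:
    runtime_error_seen = True
```

Mocking the model keeps such tests fast and deterministic, which is one plausible way to reach full coverage of an engine module without GPU or model downloads.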

API Endpoints Tested (if applicable)

  • Batch endpoint (POST /api/v1/stt/transcribe or /tts/synthesize)
  • SSE streaming (POST .../stream)
  • WebSocket (WS .../ws)

Engine-Specific Tests (if applicable)

  • Engine type: TTS
  • Provider: VoxCPM (OpenBMB)
  • Model: openbmb/VoxCPM-0.5B

Security Impact

  • No security implications
  • Security impact (please describe below)

- Add VoxCPM engine implementation (app/engines/tts/voxcpm/)
- Update TTS router to support synthesis and streaming (app/api/routers/tts.py)
- Register VoxCPM in engines.yaml
- Add new dev dependencies in pyproject.toml
- Add unit tests for VoxCPM engine (100% coverage)
- Update TTS router unit tests
@phonk2682 phonk2682 requested a review from minhsaco99 January 23, 2026 19:27
Removed redundant TTSResponseModel and used direct TTSResponse return. Added @field_serializer to TTSResponse for explicit bytes-to-base64 conversion.
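
A minimal sketch of the serializer approach this commit describes, assuming Pydantic v2. The field names audio_data and sample_rate are hypothetical; the actual TTSResponse in the repository may define different fields.

```python
import base64

from pydantic import BaseModel, field_serializer


class TTSResponse(BaseModel):
    # Field names are assumptions for illustration only.
    audio_data: bytes
    sample_rate: int

    @field_serializer("audio_data")
    def _audio_to_base64(self, value: bytes) -> str:
        # Explicit bytes-to-base64 conversion applied whenever the model is dumped.
        return base64.b64encode(value).decode("ascii")


response = TTSResponse(audio_data=b"RIFF", sample_rate=16000)
payload = response.model_dump()  # audio_data is now a base64 string
```

With the serializer on the model itself, a route can return the model directly and a separate response wrapper for encoding becomes unnecessary.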
@phonk2682 phonk2682 requested a review from minhsaco99 January 24, 2026 11:13
@phonk2682 phonk2682 requested a review from minhsaco99 January 24, 2026 14:28
@minhsaco99 minhsaco99 merged commit 895e897 into main Jan 24, 2026
4 checks passed
@minhsaco99 minhsaco99 deleted the feature/add_voxcpm_engine branch January 24, 2026 15:14
@minhsaco99 minhsaco99 mentioned this pull request Jan 24, 2026
20 tasks

2 participants