Project-specific agent instructions for the TTS Adapter codebase.
Build & Run:
- Install: make install
- Run server: make up (Docker) or uv run tts-server (local)
- Test: make test or make test batch
- Health check: make health
Architecture:
- Overview: docs/architecture.md
- API Reference: docs/api-reference.md
- Web UI (i18n, settings, LAN): docs/web-ui.md
- Qwen3 Engine: docs/engines/qwen3/README.md
Check universal rules: See ~/.claude/CLAUDE.md for global standards
Qwen3-TTS requires CUDA GPU. No CPU fallback available.
- RTX 4070 (12GB): Use 1.7B model with bf16
- Lower VRAM: Use 0.6B model
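The VRAM guidance above can be sketched as a small helper. This is a minimal sketch: `pick_model_size` is a hypothetical name, and treating 12 GB as a hard threshold (including for cards with more memory) is an assumption drawn from the RTX 4070 example, not a documented rule.

```python
def pick_model_size(vram_gb: float) -> str:
    """Return the model size label suggested by the VRAM guidance.

    Hypothetical helper: the 12 GB threshold is an assumption based on
    the RTX 4070 (12GB) example.
    """
    if vram_gb <= 0:
        raise ValueError("vram_gb must be positive")
    # 12 GB or more: the 1.7B model (with bf16); anything lower: the 0.6B model.
    return "1.7B" if vram_gb >= 12 else "0.6B"
```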
All engines must implement TTSEngine protocol:
- warmup() - Load model
- synthesize(text, language, speaker, instruct) - Single text
- synthesize_batch(texts, language, speaker, instruct) - Multiple texts
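One way the three methods might look as a `typing.Protocol`. The method names and parameters come from the list above; the type hints and the `bytes` return type are assumptions, and `SilentEngine` is a hypothetical stand-in used only to show structural conformance. The real definition lives in tts_adapter/engine.py and may differ.

```python
from typing import Protocol, Sequence, runtime_checkable


@runtime_checkable
class TTSEngine(Protocol):
    """Sketch of the engine protocol; return types are assumed."""

    def warmup(self) -> None: ...

    def synthesize(
        self, text: str, language: str, speaker: str, instruct: str
    ) -> bytes: ...

    def synthesize_batch(
        self, texts: Sequence[str], language: str, speaker: str, instruct: str
    ) -> list[bytes]: ...


class SilentEngine:
    """Hypothetical engine that returns empty audio, for illustration only."""

    def __init__(self) -> None:
        self.ready = False

    def warmup(self) -> None:
        # A real engine would load model weights here.
        self.ready = True

    def synthesize(self, text, language, speaker, instruct) -> bytes:
        return b""

    def synthesize_batch(self, texts, language, speaker, instruct) -> list[bytes]:
        # Simplest possible batch: one synthesize call per text.
        return [self.synthesize(t, language, speaker, instruct) for t in texts]
```

Note that `isinstance(engine, TTSEngine)` on a runtime-checkable protocol only verifies that the methods exist, not their signatures.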
- Constructor args (programmatic)
- Environment variables (.env file)
- Engine defaults (hardcoded)
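A minimal sketch of how these three sources could be resolved, assuming the list is ordered from highest to lowest precedence. `resolve_setting` and `ENGINE_DEFAULTS` are hypothetical names; the project itself uses pydantic-settings, which this stdlib-only example does not reproduce.

```python
import os
from typing import Mapping, Optional

# Two defaults taken from this document's variable table, for illustration.
ENGINE_DEFAULTS = {"TTS_HOST": "0.0.0.0", "TTS_PORT": "9880"}


def resolve_setting(
    name: str,
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Resolve one setting: constructor arg, then environment, then default."""
    if explicit is not None:
        return explicit  # 1. constructor argument (programmatic)
    env = os.environ if env is None else env
    if name in env:
        return env[name]  # 2. environment variable (.env file)
    return ENGINE_DEFAULTS[name]  # 3. hardcoded engine default
```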
tts-adapter/
├── tts_adapter/
│ ├── __init__.py
│ ├── api/ # FastAPI routes
│ │ ├── __init__.py
│ │ └── routes.py
│ ├── engines/ # TTS engines
│ │ ├── __init__.py # Engine factory
│ │ └── qwen3.py # Qwen3-TTS implementation
│ ├── cli.py # CLI entry point
│ ├── config.py # Global settings
│ ├── contract.py # Request/response models
│ └── engine.py # TTSEngine protocol
├── scripts/
│ └── tts_batch.py # Batch CLI tool
├── docs/ # Documentation
├── data/ # Local data (git-ignored)
│ └── cache/ # HuggingFace model cache
├── Dockerfile # Multi-stage GPU build
├── compose.yml # Docker Compose
├── Makefile # Build/test automation
├── pyproject.toml # uv config
└── uv.lock # Locked dependencies
- TTS: Qwen3-TTS (qwen-tts package)
- API: FastAPI, uvicorn
- Config: pydantic-settings
- Audio: soundfile
# Install
make install
source .venv/bin/activate
# Copy and edit config
cp .env.example .env
# Run server
uv run tts-server
# Test
make test

# Build and start
make build
make up
# Check logs
make logs
# Test
make health
make test

# Single generation
curl -X POST http://localhost:9880/tts \
-H 'content-type: application/json' \
-d '{"text":"Привет","language":"Russian","speaker":"Ryan"}' \
--output out.wav
# Batch generation
curl -X POST http://localhost:9880/tts/batch \
-H 'content-type: application/json' \
-d '{"items":[{"id":"001","text":"Первая"},{"id":"002","text":"Вторая"}]}' \
--output batch.zip

| Variable | Default | Description |
|---|---|---|
| TTS_ENGINE | qwen3 | Engine name |
| TTS_DEFAULT_SPEAKER | - | Default speaker |
| TTS_DEFAULT_LANGUAGE | - | Default language |
| TTS_HOST | 0.0.0.0 | Server host |
| TTS_PORT | 9880 | Server port |
| TTS_QWEN3_MODEL_ID | Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice | Model |
| TTS_QWEN3_DEVICE | cuda:0 | CUDA device |
| TTS_QWEN3_DTYPE | bfloat16 | Data type |
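The curl calls above could be mirrored from Python roughly as follows, using only the standard library. The endpoints, JSON fields and the default port 9880 come from this document; the helper names (`build_tts_request`, `build_batch_request`, `save_response`) are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9880"  # default TTS_PORT from the table


def _post_json(url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_tts_request(text, language, speaker, base_url=BASE_URL):
    """Single generation: POST /tts, response body is a WAV file."""
    payload = {"text": text, "language": language, "speaker": speaker}
    return _post_json(f"{base_url}/tts", payload)


def build_batch_request(items, base_url=BASE_URL):
    """Batch generation: POST /tts/batch, response body is a ZIP archive."""
    return _post_json(f"{base_url}/tts/batch", {"items": list(items)})


def save_response(request: urllib.request.Request, path: str) -> None:
    """Send the request and write the raw response bytes to path."""
    with urllib.request.urlopen(request) as resp, open(path, "wb") as fh:
        fh.write(resp.read())
```

With the server running, `save_response(build_tts_request("Привет", "Russian", "Ryan"), "out.wav")` would correspond to the first curl command.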
- Create tts_adapter/engines/new_engine.py
- Implement TTSEngine protocol
- Add engine-specific Settings class with pydantic-settings
- Register in engines/__init__.py (_ENGINES dict)
- Document in docs/engines/new_engine.md
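The registration step might look roughly like this. Only the `_ENGINES` name comes from this document; the dict's value type, the `create_engine` function and the `NewEngine` class are assumptions for illustration, and the actual factory in engines/__init__.py may be shaped differently.

```python
from typing import Callable, Dict


class NewEngine:
    """Hypothetical engine; a real one would implement the full TTSEngine protocol."""

    def warmup(self) -> None:
        pass


# Assumed shape: engine name (as used in TTS_ENGINE) mapped to a constructor.
_ENGINES: Dict[str, Callable[[], object]] = {
    "new_engine": NewEngine,
}


def create_engine(name: str) -> object:
    """Look up an engine by name and instantiate it."""
    try:
        factory = _ENGINES[name]
    except KeyError:
        known = ", ".join(sorted(_ENGINES))
        raise ValueError(f"Unknown engine {name!r}; known engines: {known}") from None
    return factory()
```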
- Branches: Feature branches from develop
- PRs: Target develop branch
- See: ~/.claude/CLAUDE.md for universal git rules
- Universal rules: ~/.claude/CLAUDE.md
- Architecture: docs/architecture.md
- API Reference: docs/api-reference.md