Voicebox — Docker Edition

A containerized fork of jamiepine/voicebox for NVIDIA GPU users on Windows.
It runs the full app (backend + web UI) in a single Docker container with CUDA acceleration.


⚠️ This is not the original project.
This fork adds Docker support for NVIDIA GPU users. If you want the native desktop app for macOS or Windows, go to jamiepine/voicebox.


What this fork adds

|            | Original                      | This fork                       |
|------------|-------------------------------|---------------------------------|
| Deployment | Tauri desktop app             | Docker container                |
| Platform   | macOS / Windows native        | Windows + Docker Desktop        |
| GPU        | MLX (Apple Silicon) / PyTorch | CUDA via NVIDIA GPU             |
| Web UI     | Embedded in desktop app       | Served by FastAPI at port 17493 |
| Setup      | Install app + Python          | docker-compose up               |

Changes made to the upstream source:

  • Added Dockerfile — multi-stage build (Bun frontend → Python runtime)
  • Added docker-compose.yml — base CPU service definition
  • Added docker-compose.cuda.yml — NVIDIA GPU overlay
  • Modified backend/main.py — serves the React web UI from the same FastAPI port
  • Updated .gitignore — excludes model weights, local data, and generated audio

Requirements

  • Docker Desktop with WSL2 backend enabled
  • Windows 10/11
  • NVIDIA GPU (tested on RTX 5070 Ti, 12 GB VRAM)
  • GPU support enabled in Docker Desktop (Settings → Resources → Enable GPU)

CPU-only mode works, but TTS generation will be very slow.


Quick start

git clone https://github.com/sergio-caracas/voicebox-docker.git

cd voicebox-docker

# Start with NVIDIA GPU acceleration (recommended)
docker-compose -f docker-compose.yml -f docker-compose.cuda.yml up -d

Open http://localhost:17493 in your browser.

First run: The Qwen3-TTS model (~4 GB) downloads automatically on your first generation request.
It is cached in a Docker volume and will not re-download on subsequent starts.


Access points

| URL                           | Description                  |
|-------------------------------|------------------------------|
| http://localhost:17493        | Web UI                       |
| http://localhost:17493/docs   | FastAPI interactive API docs |
| http://localhost:17493/health | Health check                 |
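
The health check URL above can be polled from a script to wait for the container to finish starting. The following is a minimal sketch using only the Python standard library. It assumes the endpoint answers with HTTP 200 once the backend is ready; the README does not document the response, so treat that as an assumption. The helper names `http_status` and `wait_until_healthy` are hypothetical and not part of this repository.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

# URL taken from the access points table in this README.
HEALTH_URL = "http://localhost:17493/health"


def http_status(url: str, timeout: float = 3.0) -> int:
    """Return the HTTP status code for url, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return 0


def wait_until_healthy(
    url: str = HEALTH_URL,
    attempts: int = 30,
    delay: float = 2.0,
    probe: Callable[[str], int] = http_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll url until it answers 200 (assumed to mean ready) or attempts run out."""
    for attempt in range(attempts):
        if probe(url) == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next try, but not after the last one
    return False
```

Calling `wait_until_healthy()` after starting the container returns `True` as soon as the endpoint responds with 200, or `False` after roughly a minute with the default settings. The `probe` and `sleep` parameters exist only so the loop can be exercised without a running container.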

Stopping and restarting

# Stop (data is preserved in Docker volumes)
docker-compose down

# Restart with GPU
docker-compose -f docker-compose.yml -f docker-compose.cuda.yml up -d

# Rebuild after code changes
docker-compose -f docker-compose.yml -f docker-compose.cuda.yml up -d --build
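
The commands above differ only in which compose files are layered and whether `--build` is appended. As an illustration, the sketch below assembles the same argument lists in Python so they can be reused from a script. The function names `compose_command` and `run` are hypothetical, and the CPU-only variant assumes that passing only the base file is equivalent to a plain `docker-compose up -d`.

```python
import subprocess
from typing import List

# File names as listed under "Changes made to the upstream source".
BASE_FILE = "docker-compose.yml"
CUDA_FILE = "docker-compose.cuda.yml"


def compose_command(action: str, gpu: bool = True, rebuild: bool = False) -> List[str]:
    """Build the docker-compose argument list for 'up' or 'down'."""
    if action == "down":
        # Mirrors the README: stopping does not need the overlay files.
        return ["docker-compose", "down"]
    if action != "up":
        raise ValueError(f"unsupported action: {action!r}")

    cmd = ["docker-compose", "-f", BASE_FILE]
    if gpu:
        # Layer the NVIDIA GPU overlay on top of the base CPU service.
        cmd += ["-f", CUDA_FILE]
    cmd += ["up", "-d"]
    if rebuild:
        cmd.append("--build")  # rebuild the image after code changes
    return cmd


def run(cmd: List[str]) -> None:
    """Execute the command and raise CalledProcessError on a non-zero exit."""
    subprocess.run(cmd, check=True)
```

For example, `run(compose_command("up", rebuild=True))` corresponds to the "Rebuild after code changes" command, and `run(compose_command("down"))` stops the container while leaving the Docker volumes in place.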

Full documentation

See README_DOCKER.md for complete instructions:

  • Volume management and data persistence
  • Environment variables
  • GPU verification
  • Troubleshooting

Credits

All credit for the original application goes to Jamie Pine and contributors.
This fork only adds containerization. The core app, AI models, and all features are from the upstream project.

