This guide will help you configure your Windows system with WSL2 to use NVIDIA GPUs with Docker for accelerated speech transcription and translation.
Open Command Prompt or PowerShell and run:
nvidia-smi
You should see output similar to:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.82.10 Driver Version: 581.29 CUDA Version: 13.0 |
+-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| 0 NVIDIA GeForce RTX 5070 Ti On | 00000000:01:00.0 On | N/A |
+-----------------------------------------+------------------------+----------------------+
If nvidia-smi is not found:
- Download and install the latest NVIDIA drivers from NVIDIA's website
- Reboot your system after installation
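The check above can be scripted. A minimal sketch (the `--query-gpu` flags are standard nvidia-smi options; the fallback message is our own wording):

```shell
# Sketch: report the driver version if nvidia-smi is available, else hint at the fix.
check_driver() {
  if command -v "$1" >/dev/null 2>&1; then
    "$1" --query-gpu=driver_version --format=csv,noheader
  else
    echo "driver tooling '$1' not found - install the NVIDIA driver and reboot"
    return 1
  fi
}
check_driver nvidia-smi || true
```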
- Windows 10/11 (64-bit) version 21H2 or higher
- WSL2 installed and configured
- Docker Desktop for Windows with WSL2 backend
- NVIDIA GPU with driver version 470.76 or higher
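To check the 470.76 minimum programmatically, version strings can be compared with GNU `sort -V` (a sketch; the helper name is our own):

```shell
# Sketch: true if the given driver version meets the 470.76 minimum (GNU sort -V).
driver_meets_min() {
  printf '%s\n%s\n' "470.76" "$1" | sort -V -C
}
driver_meets_min 581.29 && echo "driver new enough"
driver_meets_min 466.11 || echo "driver too old - update it"
```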
Open PowerShell as Administrator and run:
# Install WSL2 with Ubuntu
wsl --install -d Ubuntu-22.04
# Set WSL2 as default
wsl --set-default-version 2
# Verify installation
wsl --list --verbose
Expected output:
NAME STATE VERSION
* Ubuntu-22.04 Running 2
Important: WSL2 uses the Windows NVIDIA driver - do NOT install NVIDIA drivers inside WSL!
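The version check can also be done on captured `wsl --list --verbose` text. A parsing sketch (assumes the output has already been captured as plain text; the function name is our own):

```shell
# Sketch: pull the VERSION column for the default (*) distro out of `wsl --list --verbose` text.
default_wsl_version() {
  awk '/^\*/ { print $NF }'
}
sample='  NAME            STATE    VERSION
* Ubuntu-22.04    Running  2'
printf '%s\n' "$sample" | default_wsl_version   # prints 2
```

A result of 2 confirms the distro runs under WSL2 rather than WSL1.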
- Download Docker Desktop from docker.com/products/docker-desktop
- Run the installer
- During installation, ensure "Use WSL 2 instead of Hyper-V" is selected
- Restart your computer
- Open Docker Desktop
- Go to Settings → General
- Ensure "Use the WSL 2 based engine" is checked
- Go to Settings → Resources → WSL Integration
- Enable integration for your Ubuntu distribution
- Click "Apply & Restart"
Open your WSL terminal (Ubuntu) and run:
# Add NVIDIA Container Toolkit repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
# Install NVIDIA Container Toolkit
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
# Configure Docker to use NVIDIA runtime
sudo nvidia-ctk runtime configure --runtime=docker
- Right-click the Docker Desktop icon in the system tray
- Select "Restart Docker Desktop"
- Wait for Docker to fully restart
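If `apt-get update` later fails on this repository, the sed rewrite from the repository step can be sanity-checked on a sample `.list` line (the sample line is illustrative):

```shell
# Sketch: apply the signed-by rewrite from the repository step to one sample .list line.
line='deb https://nvidia.github.io/libnvidia-container/stable/deb/$(ARCH) /'
printf '%s\n' "$line" | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g'
```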
Open PowerShell as Administrator and run:
wsl --shutdown
Then reopen your WSL terminal. Docker Desktop will auto-start.
# Test if Docker can see your GPU
docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi
Expected output: You should see your GPU listed (same as when you run nvidia-smi in PowerShell)
If successful, you should see something like:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.82.10 Driver Version: 581.29 CUDA Version: 13.0 |
+-----------------------------------------+------------------------+----------------------+
| 0 NVIDIA GeForce RTX 5070 Ti On | 00000000:01:00.0 On | N/A |
+-----------------------------------------+------------------------+----------------------+
Navigate to your project directory in WSL:
cd /mnt/d/Team/Transcription-Translation1/backend/infra
docker compose up --build
STT Worker logs:
✓ Device: cuda
✓ Loading Faster-Whisper model: large-v3
✓ Model loaded successfully on GPU
Translation Worker logs:
✓ Device: cuda (FORCE_CPU=False)
✓ Loading NLLB model: facebook/nllb-200-distilled-600M
✓ Model loaded successfully on GPU
Cause: Docker doesn't have NVIDIA runtime configured.
Solution:
- Check if /etc/docker/daemon.json exists in WSL:
cat /etc/docker/daemon.json
- If missing or incorrect, create/edit it:
sudo nano /etc/docker/daemon.json
Add this content:
{
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
- Restart Docker Desktop or WSL
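The runtime entry can be checked without opening an editor. A grep-based sketch (not a full JSON validation; the file path under `/tmp` is only for the demonstration):

```shell
# Sketch: grep for the nvidia runtime entry in a daemon.json (not a full JSON parse).
has_nvidia_runtime() {
  grep -q '"nvidia"' "$1" && grep -q 'nvidia-container-runtime' "$1"
}
cat > /tmp/daemon.json <<'EOF'
{ "runtimes": { "nvidia": { "path": "nvidia-container-runtime", "runtimeArgs": [] } } }
EOF
has_nvidia_runtime /tmp/daemon.json && echo "nvidia runtime configured"
```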
Cause: Windows NVIDIA driver and WSL kernel mismatch.
Solution:
- Update Windows NVIDIA drivers from NVIDIA's website
- Update WSL kernel:
# In PowerShell as Administrator
wsl --update
wsl --shutdown
- Reboot Windows
Cause: Docker image uses newer CUDA than your driver supports.
Solution:
Option A: Update NVIDIA Drivers (recommended)
Download and install the latest drivers from NVIDIA's website.
Option B: Use CPU Mode Temporarily
Edit backend/infra/.env:
# STT Worker
DEVICE=cpu
# Translation Worker
FORCE_CPU=true
Restart services:
docker compose down
docker compose up --build
Solution:
- Ensure Docker Desktop is fully updated
- Go to Settings → Resources → WSL Integration
- Toggle off and on your Ubuntu distribution
- Click "Apply & Restart"
Cause: Drive not mounted in WSL.
Solution:
- Check mounted drives:
ls /mnt/
- If d is missing, restart WSL:
# In PowerShell
wsl --shutdown
- Reopen WSL terminal
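The mount check above can be wrapped in a tiny helper. A sketch (the second parameter exists only so the check can be exercised against a stand-in directory; in real use the default `/mnt` applies):

```shell
# Sketch: check whether a Windows drive letter is mounted (base dir parameterized for testing).
drive_mounted() {
  [ -d "${2:-/mnt}/$1" ]
}
mkdir -p /tmp/fake-mnt/d          # stand-in for /mnt with drive d mounted
drive_mounted d /tmp/fake-mnt && echo "drive d mounted"
drive_mounted q /tmp/fake-mnt || echo "drive q missing - run 'wsl --shutdown' and reopen"
```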
| Model | Device | Speed (RTF*) | Latency | Memory |
|---|---|---|---|---|
| Whisper large-v3 | CPU (16 cores) | 1.5x | ~3000ms | 4GB RAM |
| Whisper large-v3 | RTX 3060 | 0.2x | ~400ms | 6GB VRAM |
| Whisper large-v3 | RTX 5070 Ti | 0.1x | ~200ms | 6GB VRAM |
| NLLB-600M | CPU (16 cores) | - | ~150ms | 2GB RAM |
| NLLB-600M | GPU | - | ~50ms | 2GB VRAM |
*RTF = Real-Time Factor (lower is better; 1.0 = real-time)
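The RTF figures in the table come from a simple ratio. A sketch of the arithmetic (the example durations are illustrative, chosen to reproduce two table rows):

```shell
# Sketch: Real-Time Factor = processing seconds / audio seconds (lower is better).
rtf() {
  awk -v p="$1" -v a="$2" 'BEGIN { printf "%.1f\n", p / a }'
}
rtf 6 60     # 60 s of audio processed in 6 s  -> 0.1 (the RTX 5070 Ti row)
rtf 90 60    # 60 s of audio processed in 90 s -> 1.5 (slower than real time)
```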
GPU acceleration provides:
- ⚡ 10-15x faster transcription
- ⚡ 3x faster translation
- ⚡ Sub-200ms end-to-end latency
Edit backend/infra/.env:
# === Enable GPU (default) ===
DEVICE=cuda
FORCE_CPU=false
# === Disable GPU (fallback to CPU) ===
DEVICE=cpu
FORCE_CPU=true
# For 4GB VRAM
MODEL_SIZE=small
# For 8GB VRAM
MODEL_SIZE=medium
# For 12GB+ VRAM
MODEL_SIZE=large-v3
If you have multiple GPUs, assign different workers to different GPUs:
Edit backend/infra/docker-compose.yml:
services:
  stt_worker:
    environment:
      - CUDA_VISIBLE_DEVICES=0  # Use first GPU
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0']
              capabilities: [gpu]

  translation_worker:
    environment:
      - CUDA_VISIBLE_DEVICES=1  # Use second GPU
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['1']
              capabilities: [gpu]

# Check GPU on Windows
nvidia-smi
# Test Docker GPU access
docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi
# Navigate to project (adjust drive letter if needed)
cd /mnt/d/Team/Transcription-Translation1/backend/infra
# Start services
docker compose up --build
# Monitor GPU usage (in Windows PowerShell)
nvidia-smi -l 1 # Updates every 1 second
# Check service logs
docker compose logs -f stt_worker
docker compose logs -f translation_worker
# Restart services
docker compose down
docker compose up --build
- NVIDIA CUDA on WSL2 User Guide
- Docker Desktop WSL2 Backend
- NVIDIA Container Toolkit Documentation
- WSL Documentation
Need Help? Open an issue on GitHub with:
- Your nvidia-smi output (from PowerShell)
- Your wsl --list --verbose output
- Your docker compose logs output
- Docker Desktop version and settings screenshot