diff --git a/CLAUDE.md b/CLAUDE.md
index 2e28dfdc..d912c9bb 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -152,7 +152,7 @@ For production deployments, migrations will be handled differently.
 - Environment config: `.env` (never overwrite without confirmation)
 - Database init: `database/init_db.sql`
 - Docker config: `docker-compose.yml` (development only)
-- Production config: Generated by `setup-opentranscribe.sh` 
+- Production config: Generated by `setup-opentranscribe.sh`
 - Frontend build: `frontend/vite.config.ts`
 
 ## AI Processing Workflow
@@ -196,7 +196,7 @@ The application now includes optional AI-powered features using Large Language M
 
 **Deployment Options:**
 - **Cloud-Only**: Use `.env` configuration with external providers (OpenAI, Claude, etc.)
-- **Local vLLM**: Run `docker compose -f docker-compose.yml -f docker-compose.vllm.yml up` 
+- **Local vLLM**: Run `docker compose -f docker-compose.yml -f docker-compose.vllm.yml up`
 - **Local Ollama**: Uncomment ollama service in `docker-compose.vllm.yml` and use same command
 - **No LLM**: Leave LLM_PROVIDER empty for transcription-only mode
 
@@ -224,8 +224,8 @@ ${MODEL_CACHE_DIR}/
 The system uses simple volume mappings to cache models to their natural locations:
 ```yaml
 volumes:
-  - ${MODEL_CACHE_DIR}/huggingface:/root/.cache/huggingface
-  - ${MODEL_CACHE_DIR}/torch:/root/.cache/torch
+  - ${MODEL_CACHE_DIR}/huggingface:/home/appuser/.cache/huggingface
+  - ${MODEL_CACHE_DIR}/torch:/home/appuser/.cache/torch
 ```
 
 ### Key Benefits
@@ -234,6 +234,36 @@ volumes:
 - **User configurable**: Simple `.env` variable controls cache location
 - **No re-downloads**: Models cached after first download (2.5GB total)
 
+## Security Features
+
+### Non-Root Container User
+
+OpenTranscribe backend containers run as a non-root user (`appuser`, UID 1000) following Docker security best practices.
+
+**Benefits:**
+- Follows principle of least privilege
+- Reduces security risk from container escape vulnerabilities
+- Compliant with security scanning tools (Trivy, Snyk, etc.)
+- Prevents host root compromise in case of container breach
+
+**Migration for Existing Deployments:**
+
+If you have an existing installation with model cache owned by root, run the permission fix script:
+
+```bash
+# Fix permissions on existing model cache
+./scripts/fix-model-permissions.sh
+```
+
+This script will change ownership of your model cache to UID:GID 1000:1000, making it accessible to the non-root container user.
+
+**Technical Details:**
+- Container user: `appuser` (UID 1000, GID 1000)
+- User groups: `appuser`, `video` (for GPU access)
+- Cache directories: `/home/appuser/.cache/huggingface`, `/home/appuser/.cache/torch`
+- Multi-stage build for minimal attack surface
+- Health checks for container orchestration
+
 ## Common Tasks
 
 ### Adding New API Endpoints
@@ -254,4 +284,4 @@ volumes:
 1. Modify `database/init_db.sql`
 2. Update SQLAlchemy models
 3. Update Pydantic schemas
-4. Reset dev environment: `./opentr.sh reset dev`
\ No newline at end of file
+4. Reset dev environment: `./opentr.sh reset dev`
diff --git a/backend/DOCKER_STRATEGY.md b/backend/DOCKER_STRATEGY.md
deleted file mode 100644
index bf201d63..00000000
--- a/backend/DOCKER_STRATEGY.md
+++ /dev/null
@@ -1,188 +0,0 @@
-# Docker Build Strategy - OpenTranscribe Backend
-
-## Overview
-
-The OpenTranscribe backend uses two Docker build strategies optimized for different use cases:
-
-1. **Dockerfile.prod** - Standard production build (currently in use)
-2. **Dockerfile.prod.optimized** - Multi-stage build for enhanced security (future use)
-
-## Current Configuration
-
-### Active Dockerfile: `Dockerfile.prod`
-
-**Base Image:** `python:3.12-slim-bookworm` (Debian 12)
-
-**Key Features:**
-- ✅ Single-stage build for faster iteration
-- ✅ CUDA 12.8 & cuDNN 9 compatibility
-- ✅ Security updates (CVE-2025-32434 fixed)
-- ✅ Root user (required for GPU access in development)
-
-**Used By:**
-- `backend` service (docker-compose.yml:80)
-- `celery-worker` service (docker-compose.yml:152)
-- `flower` service (docker-compose.yml:254)
-
-### ML/AI Stack (All cuDNN 9 Compatible)
-
-| Package | Version | Notes |
-|---------|---------|-------|
-| PyTorch | 2.8.0+cu128 | CVE-2025-32434 fixed, CUDA 12.8 |
-| CTranslate2 | ≥4.6.0 | cuDNN 9 support |
-| WhisperX | 3.7.0 | Latest with ctranslate2 4.5+ support |
-| PyAnnote Audio | ≥3.3.2 | PyTorch 2.6+ compatible |
-| NumPy | ≥1.25.2 | 2.x compatible, no CVEs |
-
-### Critical Configuration
-
-**LD_LIBRARY_PATH** (Line 28):
-```dockerfile
-ENV LD_LIBRARY_PATH=/usr/local/lib/python3.12/site-packages/nvidia/cudnn/lib:/usr/local/lib/python3.12/site-packages/nvidia/cuda_runtime/lib
-```
-
-**Why This Matters:**
-- PyAnnote diarization requires cuDNN 9 libraries
-- Libraries are in Python package directory, not system path
-- Without this, you get: `Unable to load libcudnn_cnn.so.9` → SIGABRT crash
-- Must be set at Dockerfile level (persistent, can't be overridden)
-
-## Future Strategy: Optimized Build
-
-### Dockerfile.prod.optimized (Not Yet Active)
-
-**When to Use:**
-- Production deployments requiring maximum security
-- Environments that support non-root containers
-- CI/CD pipelines with security scanning
-
-**Key Improvements:**
-
-1. **Multi-Stage Build**
-   - Stage 1 (builder): Compiles dependencies with build tools
-   - Stage 2 (runtime): Minimal image, only runtime dependencies
-   - Result: ~40% smaller image size
-
-2. **Non-Root User**
-   - Runs as `appuser` (UID 1000)
-   - Follows principle of least privilege
-   - Better for production security posture
-
-3. **Security Enhancements**
-   - No build tools in final image
-   - No curl/git (attack surface reduction)
-   - OCI-compliant labels for tracking
-   - Built-in health checks
-
-4. **Library Paths** (Adjusted for non-root)
-   ```dockerfile
-   ENV LD_LIBRARY_PATH=/home/appuser/.local/lib/python3.12/site-packages/nvidia/cudnn/lib:/home/appuser/.local/lib/python3.12/site-packages/nvidia/cuda_runtime/lib
-   ```
-
-### Migration Path
-
-**Phase 1: Current** ✅
-- Using `Dockerfile.prod` (root user)
-- Verified working with GPU/CUDA
-- All services stable
-
-**Phase 2: Testing** (Next Step)
-1. Test `Dockerfile.prod.optimized` with same workload
-2. Verify GPU access works with non-root user
-3. Confirm cuDNN libraries load correctly
-4. Run full transcription pipeline test
-
-**Phase 3: Migration**
-1. Update docker-compose.yml to use `Dockerfile.prod.optimized`
-2. Update GPU device permissions if needed
-3. Deploy to staging environment
-4. Monitor for 48 hours
-5. Production rollout
-
-## Troubleshooting
-
-### Common Issues
-
-**Problem:** `Unable to load libcudnn_cnn.so.9`
-- **Cause:** LD_LIBRARY_PATH not set
-- **Fix:** Ensure LD_LIBRARY_PATH in Dockerfile (not docker-compose)
-
-**Problem:** `Worker exited with SIGABRT`
-- **Cause:** cuDNN library version mismatch
-- **Fix:** Verify PyTorch 2.8.0+cu128 → cuDNN 9.10.2
-
-**Problem:** GPU not accessible in optimized build
-- **Cause:** Non-root user lacks GPU permissions
-- **Fix:** Add user to `video` group or use `--privileged`
-
-## Development Workflow
-
-### Local Development (with venv)
-```bash
-cd backend/
-source venv/bin/activate
-pip install -r requirements-dev.txt  # Includes testing tools
-```
-
-### Container Testing
-```bash
-# Current production build
-./opentr.sh start prod
-
-# Test optimized build (after migration)
-docker compose -f docker-compose.yml -f docker-compose.optimized.yml up
-```
-
-### Building Images
-```bash
-# Standard build
-docker compose build backend celery-worker flower
-
-# Optimized build (future)
-docker compose build -f Dockerfile.prod.optimized backend
-```
-
-## Security Considerations
-
-### Current (Dockerfile.prod)
-- ✅ Updated base image (Debian 12 Bookworm)
-- ✅ CVE-2025-32434 fixed (PyTorch 2.8.0)
-- ✅ Minimal package installation
-- ⚠️ Runs as root (required for current GPU setup)
-
-### Future (Dockerfile.prod.optimized)
-- ✅ All above, plus:
-- ✅ Non-root user execution
-- ✅ Multi-stage build (no build tools in runtime)
-- ✅ Explicit OCI labels for compliance
-- ✅ Health check integration
-
-## File Structure
-
-```
-backend/
-├── Dockerfile.prod              # Current production (in use)
-├── Dockerfile.prod.optimized    # Future optimized build
-├── requirements.txt             # Production dependencies
-├── requirements-dev.txt         # Development tools
-├── DOCKER_STRATEGY.md           # This file
-└── .dockerignore                # Excludes venv, etc.
-```
-
-## Key Takeaways
-
-1. **Always use Dockerfile.prod for now** - verified working
-2. **LD_LIBRARY_PATH is critical** - must be in Dockerfile
-3. **cuDNN 9 compatibility** - all packages updated
-4. **Optimized build is ready** - awaiting GPU permission testing
-5. **No downgrade needed** - NumPy 2.x works perfectly
-
-## Change History
-
-- **2025-10-11**: Initial strategy with cuDNN 9 migration
-  - Updated PyTorch 2.2.2 → 2.8.0+cu128
-  - Updated CTranslate2 4.4.0 → 4.6.0
-  - Updated WhisperX 3.4.3 → 3.7.0
-  - Fixed LD_LIBRARY_PATH for cuDNN libraries
-  - Removed obsolete Dockerfile.dev variants
-  - Created Dockerfile.prod.optimized for future use
diff --git a/backend/Dockerfile.prod b/backend/Dockerfile.prod
index 5766cc79..36bb99af 100644
--- a/backend/Dockerfile.prod
+++ b/backend/Dockerfile.prod
@@ -1,17 +1,24 @@
-FROM python:3.12-slim-bookworm
+# =============================================================================
+# OpenTranscribe Backend - Production Dockerfile
+# Multi-stage build optimized for security with non-root user
+# Updated with cuDNN 9 compatibility for PyTorch 2.8.0+cu128
+# =============================================================================
 
-WORKDIR /app
+# -----------------------------------------------------------------------------
+# Stage 1: Build Stage - Install Python dependencies with compilation
+# -----------------------------------------------------------------------------
+FROM python:3.12-slim-bookworm AS builder
+
+WORKDIR /build
 
-# Install system dependencies
-RUN apt-get update && apt-get install -y \
+# Install build dependencies (only in this stage)
+RUN apt-get update && apt-get install -y --no-install-recommends \
     build-essential \
-    curl \
-    ffmpeg \
-    libsndfile1 \
-    libimage-exiftool-perl \
+    gcc \
+    g++ \
     && rm -rf /var/lib/apt/lists/*
 
-# Copy requirements file
+# Copy only requirements first for better layer caching
 COPY requirements.txt .
 
 # Install Python dependencies
@@ -20,20 +27,69 @@ COPY requirements.txt .
 # CTranslate2 4.6.0+ - cuDNN 9 support
 # WhisperX 3.7.0 - latest version with ctranslate2 4.5+ compatibility
 # NumPy 2.x - fully compatible with all packages, no security issues
-RUN pip install --no-cache-dir -r requirements.txt
+# Use --user to install to /root/.local which we'll copy to final stage
+RUN pip install --user --no-cache-dir --no-warn-script-location -r requirements.txt
+
+# -----------------------------------------------------------------------------
+# Stage 2: Runtime Stage - Minimal production image with non-root user
+# -----------------------------------------------------------------------------
+FROM python:3.12-slim-bookworm
+
+# OCI annotations for container metadata and compliance
+LABEL org.opencontainers.image.title="OpenTranscribe Backend" \
+      org.opencontainers.image.description="AI-powered transcription backend with WhisperX and PyAnnote" \
+      org.opencontainers.image.vendor="OpenTranscribe" \
+      org.opencontainers.image.authors="OpenTranscribe Contributors" \
+      org.opencontainers.image.licenses="MIT" \
+      org.opencontainers.image.source="https://github.com/davidamacey/OpenTranscribe" \
+      org.opencontainers.image.documentation="https://github.com/davidamacey/OpenTranscribe/blob/master/README.md"
+
+# Install only runtime dependencies (no build tools)
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    curl \
+    ffmpeg \
+    libsndfile1 \
+    libimage-exiftool-perl \
+    libgomp1 \
+    && rm -rf /var/lib/apt/lists/* \
+    && apt-get clean
+
+# Create non-root user for security with explicit UID/GID 1000 so host cache
+# chowned to 1000:1000 matches; add to video group for GPU access
+RUN groupadd -g 1000 appuser && \
+    useradd -u 1000 -g appuser -G video -m -s /bin/bash appuser && \
+    mkdir -p /app /app/models /app/temp && \
+    chown -R appuser:appuser /app
+
+# Set working directory
+WORKDIR /app
+
+# Copy Python packages from builder stage
+COPY --from=builder --chown=appuser:appuser /root/.local /home/appuser/.local
+
+# Ensure scripts in .local are usable by adding to PATH
 # Set LD_LIBRARY_PATH for cuDNN libraries from PyTorch package
 # This ensures PyAnnote and other tools can find cuDNN 9 libraries
-# Must be set at build time to persist in the container
-ENV LD_LIBRARY_PATH=/usr/local/lib/python3.12/site-packages/nvidia/cudnn/lib:/usr/local/lib/python3.12/site-packages/nvidia/cuda_runtime/lib
-
-# Create directories for models and temporary files
-RUN mkdir -p /app/models /app/temp
+# Set cache directories to user home
+ENV PATH=/home/appuser/.local/bin:$PATH \
+    PYTHONUNBUFFERED=1 \
+    PYTHONDONTWRITEBYTECODE=1 \
+    LD_LIBRARY_PATH=/home/appuser/.local/lib/python3.12/site-packages/nvidia/cudnn/lib:/home/appuser/.local/lib/python3.12/site-packages/nvidia/cuda_runtime/lib \
+    HF_HOME=/home/appuser/.cache/huggingface \
+    TRANSFORMERS_CACHE=/home/appuser/.cache/huggingface/transformers \
+    TORCH_HOME=/home/appuser/.cache/torch
 
 # Copy application code
-COPY . .
+COPY --chown=appuser:appuser . .
+
+# Switch to non-root user
+USER appuser
+
+# Health check for container orchestration
+HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
+    CMD curl -f http://localhost:8080/health || exit 1
 
-# Expose port
+# Expose application port
 EXPOSE 8080
 
 # Command to run the application in production (no reload)
diff --git a/backend/Dockerfile.prod.optimized b/backend/Dockerfile.prod.optimized
deleted file mode 100644
index 3c2f9889..00000000
--- a/backend/Dockerfile.prod.optimized
+++ /dev/null
@@ -1,90 +0,0 @@
-# =============================================================================
-# OpenTranscribe Backend - Production Dockerfile (Optimized)
-# Multi-stage build optimized for security and minimal image size
-# Updated with cuDNN 9 compatibility for PyTorch 2.8.0+cu128
-# =============================================================================
-
-# -----------------------------------------------------------------------------
-# Stage 1: Build Stage - Install Python dependencies with compilation
-# -----------------------------------------------------------------------------
-FROM python:3.12-slim-bookworm AS builder
-
-WORKDIR /build
-
-# Install build dependencies (only in this stage)
-RUN apt-get update && apt-get install -y --no-install-recommends \
-    build-essential \
-    gcc \
-    g++ \
-    && rm -rf /var/lib/apt/lists/*
-
-# Copy only requirements first for better layer caching
-COPY requirements.txt .
-
-# Install Python dependencies
-# All packages now use cuDNN 9 for CUDA 12.8 compatibility
-# PyTorch 2.8.0+cu128 - includes CVE-2025-32434 security fix
-# CTranslate2 4.6.0+ - cuDNN 9 support
-# WhisperX 3.7.0 - latest version with ctranslate2 4.5+ compatibility
-# NumPy 2.x - fully compatible with all packages, no security issues
-# Use --user to install to /root/.local which we'll copy to final stage
-RUN pip install --user --no-cache-dir --no-warn-script-location -r requirements.txt
-
-# -----------------------------------------------------------------------------
-# Stage 2: Runtime Stage - Minimal production image
-# -----------------------------------------------------------------------------
-FROM python:3.12-slim-bookworm
-
-# OCI annotations for metadata
-LABEL org.opencontainers.image.title="OpenTranscribe Backend" \
-      org.opencontainers.image.description="AI-powered transcription backend with WhisperX and PyAnnote" \
-      org.opencontainers.image.vendor="OpenTranscribe" \
-      org.opencontainers.image.authors="OpenTranscribe Contributors" \
-      org.opencontainers.image.licenses="MIT" \
-      org.opencontainers.image.source="https://github.com/yourusername/transcribe-app" \
-      org.opencontainers.image.documentation="https://github.com/yourusername/transcribe-app/blob/main/README.md"
-
-# Install only runtime dependencies (no build tools, no git, no curl)
-RUN apt-get update && apt-get install -y --no-install-recommends \
-    ffmpeg \
-    libsndfile1 \
-    libimage-exiftool-perl \
-    libgomp1 \
-    && rm -rf /var/lib/apt/lists/* \
-    && apt-get clean
-
-# Create non-root user for security
-RUN groupadd -r appuser && \
-    useradd -r -g appuser -u 1000 -m -s /bin/bash appuser && \
-    mkdir -p /app /app/models /app/temp && \
-    chown -R appuser:appuser /app
-
-# Set working directory
-WORKDIR /app
-
-# Copy Python packages from builder stage
-COPY --from=builder --chown=appuser:appuser /root/.local /home/appuser/.local
-
-# Ensure scripts in .local are usable by adding to PATH
-# Set LD_LIBRARY_PATH for cuDNN libraries from PyTorch package
-# This ensures PyAnnote and other tools can find cuDNN 9 libraries
-ENV PATH=/home/appuser/.local/bin:$PATH \
-    PYTHONUNBUFFERED=1 \
-    PYTHONDONTWRITEBYTECODE=1 \
-    LD_LIBRARY_PATH=/home/appuser/.local/lib/python3.12/site-packages/nvidia/cudnn/lib:/home/appuser/.local/lib/python3.12/site-packages/nvidia/cuda_runtime/lib
-
-# Copy application code
-COPY --chown=appuser:appuser . .
-
-# Switch to non-root user
-USER appuser
-
-# Health check for container orchestration
-HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
-    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8080/health').read()" || exit 1
-
-# Expose application port
-EXPOSE 8080
-
-# Run application with auto-scaling workers (Uvicorn detects CPU cores)
-CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8080"]
diff --git a/backend/README.md b/backend/README.md
index 8e39c02a..0ba1b33d 100644
--- a/backend/README.md
+++ b/backend/README.md
@@ -1,6 +1,6 @@
-
+
# Backend
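The migration step documented in CLAUDE.md above runs `./scripts/fix-model-permissions.sh` to chown the model cache to `1000:1000`. A minimal sketch of that flow, assuming the `MODEL_CACHE_DIR` layout described in the docs; the repo's actual script is authoritative, and the default path and dry-run behavior here are assumptions:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of what scripts/fix-model-permissions.sh does.
# MODEL_CACHE_DIR and the huggingface/torch subdirectory layout come from
# the docs above; everything else is illustrative.
set -euo pipefail

MODEL_CACHE_DIR="${MODEL_CACHE_DIR:-./models}"
TARGET="1000:1000"  # UID:GID of the container's non-root appuser

for sub in huggingface torch; do
  dir="${MODEL_CACHE_DIR}/${sub}"
  mkdir -p "$dir"  # ensure the expected cache layout exists
  # Print the chown command instead of running it, since changing
  # ownership requires root; the real script applies it directly.
  echo "chown -R ${TARGET} ${dir}"
done
```

Printing the commands keeps the sketch safe to run without root; piping the output to `sudo sh` (or running the real script) applies the ownership change.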