@aaronsb aaronsb commented Dec 20, 2025

Summary

Adds AMD GPU (ROCm) support for GPU-accelerated local embeddings, complementing existing NVIDIA CUDA and Apple MPS support.

  • Two AMD GPU modes:
    • amd: Uses PyTorch ROCm wheels (for ROCm 6.x systems)
    • amd-host: Uses official ROCm Docker image (for ROCm 7.x systems)
  • The amd-host mode uses rocm/pytorch:rocm7.1 base image with:
    • ROCm 7.1 with full HIP/HSA support
    • PyTorch 2.9.1 with ROCm backend
    • Python 3.13
  • Key changes:
    • New docker-compose.gpu-amd.yml for ROCm wheels mode
    • New docker-compose.gpu-amd-host.yml for host ROCm mode
    • New api/Dockerfile.rocm-host using official ROCm image
    • Updated api/Dockerfile with PYTORCH_VARIANT build arg
    • Updated guided-init.sh with AMD GPU options (3 and 4)
    • Updated common.sh and start-app.sh for new GPU modes
    • Added ROCm documentation to .env.example
  • Bug fix: An empty HSA_OVERRIDE_GFX_VERSION env var breaks ROCm initialization. The variable was removed from the compose files, with a documentation note that users can set it in .env if needed.
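
The HSA_OVERRIDE_GFX_VERSION fix can be illustrated with a minimal Python sketch. It is not code from this PR: `drop_empty_hsa_override` is a hypothetical helper that treats an empty or blank value as "unset", which is the behavior the compose files now get by not declaring the variable at all.

```python
import os
from typing import Mapping

# Name of the ROCm override variable discussed in this PR.
HSA_VAR = "HSA_OVERRIDE_GFX_VERSION"


def drop_empty_hsa_override(env: Mapping[str, str]) -> dict[str, str]:
    """Return a copy of env without HSA_VAR when its value is empty or blank.

    Hypothetical helper for illustration; not part of this PR.
    """
    cleaned = dict(env)
    if HSA_VAR in cleaned and not cleaned[HSA_VAR].strip():
        del cleaned[HSA_VAR]
    return cleaned


if __name__ == "__main__":
    # Show whether the current environment carries a usable override.
    cleaned = drop_empty_hsa_override(os.environ)
    print(HSA_VAR, "=", cleaned.get(HSA_VAR, "<unset>"))
```

A non-empty value set in .env passes through unchanged, so users who need the override keep it.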

Test Plan

  • Tested on AMD Radeon RX 7900 XTX with ROCm 7.1.1 host driver
  • Verified GPU detection: 🎮 CUDA GPU detected - AMD Radeon RX 7900 XTX (24.0GB VRAM)
  • Verified search queries use GPU embeddings
  • Verified ingestion uses GPU embeddings (11 embeddings generated)
  • Verified concept matching works via GPU similarity
  • Full end-to-end test: wrote article from graph concepts, ingested it, searched and found it
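
The detection line above says "CUDA GPU" for an AMD card because ROCm builds of PyTorch expose the device through the `torch.cuda` API. A minimal sketch of an equivalent check, assuming PyTorch is installed in the container; `describe_gpu` is a hypothetical name and its output format is not the project's actual log line.

```python
def describe_gpu() -> str:
    """Summarize the GPU PyTorch sees, or explain why none is visible."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"

    if not torch.cuda.is_available():
        return "No GPU visible to PyTorch"

    # torch.version.hip is a version string on ROCm builds and None otherwise.
    backend = "ROCm/HIP" if getattr(torch.version, "hip", None) else "CUDA"
    name = torch.cuda.get_device_name(0)
    vram_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    return f"{backend} GPU: {name} ({vram_gb:.1f}GB VRAM)"


if __name__ == "__main__":
    print(describe_gpu())
```

Running this inside the amd or amd-host container is a quick way to confirm that the ROCm backend is active before testing ingestion or search.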

@aaronsb aaronsb merged commit 63f7528 into main Dec 20, 2025
6 checks passed
@aaronsb aaronsb deleted the feature/amd-gpu-support branch December 20, 2025 03:10