
feat: Intel Arc (XPU) GPU support #320

Merged
jamiepine merged 2 commits into main from feat/intel-xpu-support
Mar 21, 2026

Conversation

@jamiepine (Owner) commented Mar 18, 2026

Summary

Adds Intel Arc GPU (XPU) support so that users with Intel Arc A-series GPUs (e.g. A770, A750) can use hardware acceleration instead of falling back to CPU.

Prompted by user feedback: an Intel Arc A770 16GB owner installed intel-extension-for-pytorch system-wide, but voicebox's venv didn't include it, and most backends didn't check for XPU anyway.

Changes

  • justfile — Windows setup-python now auto-detects Intel Arc GPUs (via Win32_VideoController) and installs torch from the XPU wheel index + intel-extension-for-pytorch, mirroring the existing NVIDIA/CUDA auto-detection
  • backend/backends/base.py — Added two shared helpers:
    • empty_device_cache(device) — clears VRAM on CUDA or XPU after model unload
    • manual_seed(seed, device) — sets reproducible seeds on CPU + CUDA or XPU
  • All TTS backends now pass allow_xpu=True to get_torch_device():
    • chatterbox_backend.py
    • chatterbox_turbo_backend.py
    • hume_backend.py (also enables bf16 on XPU — Intel Arc supports it natively)
    • luxtts_backend.py
    • pytorch_backend.py (already had allow_xpu=True, now uses shared helpers)
  • Replaced all inline torch.cuda.empty_cache() / torch.cuda.manual_seed() calls with the shared helpers so XPU gets the same treatment
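Pieced together from the description above and the review diffs further down, the two shared helpers in backend/backends/base.py come out to roughly this shape (a sketch: the docstrings and the torch import placement are assumptions, and the is_available() guards on the XPU branches follow the reviewer's suggested form rather than necessarily the merged code):

```python
def empty_device_cache(device: str) -> None:
    """Clear the accelerator's cached VRAM after a model unload; no-op on CPU."""
    import torch  # imported lazily, matching the backends' deferred-import style

    if device == "cuda" and torch.cuda.is_available():
        torch.cuda.empty_cache()
    elif device == "xpu" and hasattr(torch, "xpu") and torch.xpu.is_available():
        torch.xpu.empty_cache()


def manual_seed(seed: int, device: str) -> None:
    """Seed the CPU RNG, plus the accelerator RNG for the active device."""
    import torch

    torch.manual_seed(seed)
    if device == "cuda" and torch.cuda.is_available():
        torch.cuda.manual_seed(seed)
    elif device == "xpu" and hasattr(torch, "xpu") and torch.xpu.is_available():
        torch.xpu.manual_seed(seed)
```

Because both helpers no-op when the named device isn't present, backends can call them unconditionally instead of sprinkling per-device checks at every call site.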

Device priority

The existing priority in get_torch_device() is unchanged: CUDA > XPU > DirectML > MPS > CPU. Users with both NVIDIA and Intel GPUs will still use CUDA.
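As a rough mental model, that priority is a first-match scan over the available devices. This is a hypothetical simplification, not the real get_torch_device(): the actual function probes runtime availability and takes capability flags, and the device strings here (notably "dml" for DirectML) are illustrative.

```python
def pick_device(available: set, allow_xpu: bool = True) -> str:
    """Return the highest-priority available device; CPU is the final fallback."""
    priority = ["cuda", "xpu", "dml", "mps"]  # CUDA > XPU > DirectML > MPS > CPU
    for dev in priority:
        if dev == "xpu" and not allow_xpu:
            continue  # backends that haven't opted in skip XPU entirely
        if dev in available:
            return dev
    return "cpu"
```

This also shows why a machine with both NVIDIA and Intel GPUs keeps using CUDA: the scan returns on the first hit.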

Testing notes

  • Requires an Intel Arc GPU + oneAPI toolchain to test XPU path end-to-end
  • CPU and CUDA paths are unchanged (the helpers are no-ops for non-matching devices)
  • The setup-python detection uses an Intel.*Arc regex, so Intel integrated GPUs (UHD, Iris) won't trigger the XPU install
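The filter can be sanity-checked against typical Win32_VideoController name strings (the names below are illustrative; note that a follow-up commit in this PR later simplified the pattern to just 'Arc'):

```python
import re

ARC_PATTERN = re.compile(r"Intel.*Arc")

gpu_names = [
    "Intel(R) Arc(TM) A770 Graphics",  # discrete Arc -> triggers the XPU install
    "Intel(R) UHD Graphics 770",       # integrated, no "Arc" -> no match
    "Intel(R) Iris(R) Xe Graphics",    # integrated, no "Arc" -> no match
    "NVIDIA GeForce RTX 4070",         # handled by the CUDA branch instead
]
matches = [name for name in gpu_names if ARC_PATTERN.search(name)]
# -> ['Intel(R) Arc(TM) A770 Graphics']
```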

Summary by CodeRabbit

  • New Features

    • Added Intel Arc (XPU) detection and optional installation path; UI/backend now report XPU devices where available.
    • Health/status reporting now accounts for XPU memory usage when present.
  • Chores

    • Unified device-aware cache clearing and seed initialization across backends for consistent behavior on CPU/CUDA/XPU.
    • Refined device selection logic and unload handling for more robust resource management.

Auto-detect Intel Arc GPUs during Windows setup and install PyTorch
with XPU support + intel-extension-for-pytorch. Enable allow_xpu=True
on all TTS backends (Chatterbox, Chatterbox Turbo, Hume TADA, LuxTTS)
that previously only supported CUDA. Add shared empty_device_cache()
and manual_seed() helpers in base.py to handle XPU memory management
and reproducible seeding alongside CUDA.
@coderabbitai bot (Contributor) commented Mar 18, 2026

📝 Walkthrough

Adds two device-agnostic utilities (empty_device_cache, manual_seed), applies them across multiple backends to replace direct CUDA/XPU calls, expands XPU (Intel Arc) support and detection in runtime and install scripts, and updates health/status reporting to account for XPU devices.

Changes

Cohort / File(s) — Summary

  • Device utilities (backend/backends/base.py) — Added empty_device_cache(device: str) and manual_seed(seed: int, device: str) to centralize cache eviction and RNG seeding across CPU/CUDA/XPU.
  • Backends, general updates (backend/backends/chatterbox_backend.py, backend/backends/chatterbox_turbo_backend.py, backend/backends/pytorch_backend.py, backend/backends/luxtts_backend.py) — Imported and used the new utilities; replaced direct torch.cuda cache/seeding calls with empty_device_cache/manual_seed; enabled XPU in device selection (allow_xpu=True); minor refactors (sample-rate retrieval, device state handling).
  • Hume backend, more changes (backend/backends/hume_backend.py) — Enabled XPU selection and bf16 usage on XPU, switched cache/seeding to the new utilities, patched tokenizer path handling, and simplified several initialization/encoding expressions.
  • Runtime GPU detection (backend/app.py) — Added Intel XPU detection using intel_extension_for_pytorch / torch.xpu, returning an "XPU (<device_name>)" status when available, with guarded fallbacks.
  • Health/status reporting (backend/routes/health.py) — Reported VRAM used for XPU with guarded torch.xpu calls and adjusted the default backend_variant logic to prefer xpu when CUDA is absent but XPU is present.
  • Windows install/build script (justfile) — Expanded GPU detection to recognize Intel Arc GPUs and added an installation branch that installs the PyTorch/XPU + IPEX wheels when Intel Arc is present; retains the CUDA and CPU fallbacks.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰
I hopped through code with nimble feet,
Cleared the caches, tuned the beat,
Seeds sown neat on CPU and XPU,
Intel Arc and friends hop too!
A carrot-shaped deploy — woohoo!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 58.06%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)

  • Description Check — ✅ Passed (check skipped: CodeRabbit's high-level summary is enabled).
  • Title Check — ✅ Passed: the title 'feat: Intel Arc (XPU) GPU support' accurately and clearly summarizes the main objective of the changeset, adding support for Intel Arc GPUs through XPU.


@coderabbitai bot (Contributor) left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
backend/backends/chatterbox_backend.py (1)

200-201: ⚠️ Potential issue | 🟡 Minor

Missing XPU-aware seeding in generation.

The generate() method uses torch.manual_seed(seed) directly, which only seeds the CPU. For XPU reproducibility, this should use the shared manual_seed(seed, device) helper like pytorch_backend.py and hume_backend.py do.

🔧 Proposed fix
         def _generate_sync():
             import torch

             if seed is not None:
-                torch.manual_seed(seed)
+                manual_seed(seed, self._device)

Also add the import at the top:

 from .base import (
     is_model_cached,
     get_torch_device,
     empty_device_cache,
+    manual_seed,
     combine_voice_prompts as _combine_voice_prompts,
     model_load_progress,
     patch_chatterbox_f32,
 )
backend/backends/luxtts_backend.py (1)

168-171: ⚠️ Potential issue | 🟡 Minor

Missing XPU-aware seeding in generation.

The seeding logic uses torch.manual_seed() and torch.cuda.manual_seed() directly, missing XPU support. This should use the shared manual_seed(seed, self.device) helper for consistency with the rest of the PR.

🔧 Proposed fix
         def _generate_sync():
             import torch

             if seed is not None:
-                torch.manual_seed(seed)
-                if torch.cuda.is_available():
-                    torch.cuda.manual_seed(seed)
+                manual_seed(seed, self.device)

And add the import:

 from .base import (
     is_model_cached,
     get_torch_device,
     empty_device_cache,
+    manual_seed,
     combine_voice_prompts as _combine_voice_prompts,
     model_load_progress,
 )
backend/backends/chatterbox_turbo_backend.py (1)

181-182: ⚠️ Potential issue | 🟡 Minor

Missing XPU-aware seeding in generation (same as chatterbox_backend.py).

For consistency and XPU reproducibility, use manual_seed(seed, self._device) instead of torch.manual_seed(seed).

🔧 Proposed fix
         def _generate_sync():
             import torch

             if seed is not None:
-                torch.manual_seed(seed)
+                manual_seed(seed, self._device)

And add the import:

 from .base import (
     is_model_cached,
     get_torch_device,
     empty_device_cache,
+    manual_seed,
     combine_voice_prompts as _combine_voice_prompts,
     model_load_progress,
     patch_chatterbox_f32,
 )
🧹 Nitpick comments (2)
backend/backends/base.py (2)

129-141: Consider adding torch.xpu.is_available() guard for XPU cache clearing.

The CUDA branch checks torch.cuda.is_available(), but the XPU branch only checks hasattr(torch, "xpu"). For consistency and robustness, consider also verifying torch.xpu.is_available() before calling empty_cache().

♻️ Suggested improvement
 def empty_device_cache(device: str) -> None:
     ...
     import torch

     if device == "cuda" and torch.cuda.is_available():
         torch.cuda.empty_cache()
-    elif device == "xpu" and hasattr(torch, "xpu"):
+    elif device == "xpu" and hasattr(torch, "xpu") and torch.xpu.is_available():
         torch.xpu.empty_cache()

144-157: Same suggestion for manual_seed XPU branch.

For symmetry with the CUDA check, add torch.xpu.is_available() guard.

♻️ Suggested improvement
 def manual_seed(seed: int, device: str) -> None:
     ...
     torch.manual_seed(seed)
     if device == "cuda" and torch.cuda.is_available():
         torch.cuda.manual_seed(seed)
-    elif device == "xpu" and hasattr(torch, "xpu"):
+    elif device == "xpu" and hasattr(torch, "xpu") and torch.xpu.is_available():
         torch.xpu.manual_seed(seed)

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 70851706-827f-4180-ab03-715d6ed5a660

📥 Commits

Reviewing files that changed from the base of the PR and between ffc1b54 and 83ebaba.

📒 Files selected for processing (7)
  • backend/backends/base.py
  • backend/backends/chatterbox_backend.py
  • backend/backends/chatterbox_turbo_backend.py
  • backend/backends/hume_backend.py
  • backend/backends/luxtts_backend.py
  • backend/backends/pytorch_backend.py
  • justfile

@oburtchen commented:

I checked it out, but unfortunately still no GPU acceleration. One thing I noted: in my system Python installation, the Intel PyTorch extension was installed this way:

python -m pip install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 --index-url https://download.pytorch.org/whl/xpu
python -m pip install intel-extension-for-pytorch==2.8.10+xpu --index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/

So the extension comes from intel servers. Don't know if this makes a difference. Okay, the compile log is attached, just tell me if I can help in any way.
voicebox-compile-log.txt

@jamiepine (Owner, Author) commented:

Thanks for testing this and providing the feedback! We're looking into the Intel Arc acceleration issue now. The compile log you attached will be helpful for debugging. Will update you once we identify what's causing the GPU acceleration to not work despite the code changes.

…porting, and setup detection

Address CodeRabbit review feedback and user-reported GPU acceleration failure:

- Use shared manual_seed() in chatterbox, chatterbox_turbo, and luxtts
  backends so XPU (and future accelerators) get proper device seeding
- Add XPU branch to _get_gpu_status() so startup log reports Intel Arc
  GPUs instead of 'None (CPU only)'
- Add XPU VRAM reporting and correct backend_variant fallback in the
  /health endpoint
- Switch justfile GPU detection from Get-WmiObject to Get-CimInstance,
  simplify the Arc regex to match 'Arc' (not 'Intel.*Arc'), log
  detected GPUs, and print manual install instructions on miss

Resolves the root cause where IPEX was silently not installed due to
WMI detection failure, causing CPU-only fallback on Intel Arc systems.
@coderabbitai bot (Contributor) left a comment

🧹 Nitpick comments (1)
backend/routes/health.py (1)

113-117: Consider using contextlib.suppress for cleaner exception handling.

The try-except-pass pattern works correctly here for handling IPEX version variations, but can be simplified.

♻️ Suggested refactor
+import contextlib
+
 elif has_xpu:
-    try:
-        vram_used = torch.xpu.memory_allocated() / 1024 / 1024
-    except Exception:
-        pass  # memory_allocated() may not be available on all IPEX versions
+    with contextlib.suppress(Exception):
+        vram_used = torch.xpu.memory_allocated() / 1024 / 1024

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d1b75457-e665-4584-963f-e1e4dc1163f8

📥 Commits

Reviewing files that changed from the base of the PR and between 83ebaba and 7070462.

📒 Files selected for processing (6)
  • backend/app.py
  • backend/backends/chatterbox_backend.py
  • backend/backends/chatterbox_turbo_backend.py
  • backend/backends/luxtts_backend.py
  • backend/routes/health.py
  • justfile
🚧 Files skipped from review as they are similar to previous changes (2)
  • justfile
  • backend/backends/chatterbox_turbo_backend.py

@jamiepine merged commit 9a955a7 into main on Mar 21, 2026
1 check passed