Conversation
Auto-detect Intel Arc GPUs during Windows setup and install PyTorch with XPU support + intel-extension-for-pytorch. Enable allow_xpu=True on all TTS backends (Chatterbox, Chatterbox Turbo, Hume TADA, LuxTTS) that previously only supported CUDA. Add shared empty_device_cache() and manual_seed() helpers in base.py to handle XPU memory management and reproducible seeding alongside CUDA.
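For orientation, the selection behavior an opt-in XPU flag enables can be sketched as follows. `pick_device` and its `caps` dict are hypothetical stand-ins for illustration, not the project's actual `get_torch_device()` API; the priority order (CUDA > XPU > DirectML > MPS > CPU) matches the one stated in the PR description below:

```python
def pick_device(caps: dict, allow_xpu: bool = True) -> str:
    """Illustrative priority chain: CUDA > XPU > DirectML > MPS > CPU.

    `caps` maps accelerator names to availability flags; in real code these
    would come from torch.cuda.is_available(), torch.xpu.is_available(), etc.
    """
    if caps.get("cuda"):
        return "cuda"
    if allow_xpu and caps.get("xpu"):
        return "xpu"
    if caps.get("directml"):
        return "dml"  # placeholder label for a DirectML device
    if caps.get("mps"):
        return "mps"
    return "cpu"


# A machine with both an NVIDIA and an Intel Arc GPU still picks CUDA;
# a backend that has not opted in via allow_xpu falls back to CPU.
print(pick_device({"cuda": True, "xpu": True}))
print(pick_device({"xpu": True}, allow_xpu=True))
print(pick_device({"xpu": True}, allow_xpu=False))
```

The per-backend `allow_xpu=True` flag keeps XPU opt-in, so only backends known to work on Intel hardware advertise it.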
📝 Walkthrough
Adds two device-agnostic utilities (`empty_device_cache()` and `manual_seed()`) to `backend/backends/base.py` and enables XPU support across the TTS backends.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
backend/backends/chatterbox_backend.py (1)
200-201: ⚠️ Potential issue | 🟡 Minor
Missing XPU-aware seeding in generation.
The `generate()` method uses `torch.manual_seed(seed)` directly, which only seeds the CPU. For XPU reproducibility, this should use the shared `manual_seed(seed, device)` helper like `pytorch_backend.py` and `hume_backend.py` do.
🔧 Proposed fix
```diff
 def _generate_sync():
     import torch
     if seed is not None:
-        torch.manual_seed(seed)
+        manual_seed(seed, self._device)
```

Also add the import at the top:
```diff
 from .base import (
     is_model_cached,
     get_torch_device,
     empty_device_cache,
+    manual_seed,
     combine_voice_prompts as _combine_voice_prompts,
     model_load_progress,
     patch_chatterbox_f32,
 )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/chatterbox_backend.py` around lines 200 - 201, The generate() method currently calls torch.manual_seed(seed) which only seeds the CPU; update it to use the shared helper manual_seed(seed, device) used in pytorch_backend.py and hume_backend.py so XPU devices are seeded correctly for reproducibility. Import the manual_seed helper at top of chatterbox_backend.py and replace torch.manual_seed(seed) inside generate() with manual_seed(seed, device) where device is the model/device used for generation (ensure you reference the same device variable/name used in generate()).

backend/backends/luxtts_backend.py (1)
168-171: ⚠️ Potential issue | 🟡 Minor
Missing XPU-aware seeding in generation.
The seeding logic uses `torch.manual_seed()` and `torch.cuda.manual_seed()` directly, missing XPU support. This should use the shared `manual_seed(seed, self.device)` helper for consistency with the rest of the PR.
🔧 Proposed fix
```diff
 def _generate_sync():
     import torch
     if seed is not None:
-        torch.manual_seed(seed)
-        if torch.cuda.is_available():
-            torch.cuda.manual_seed(seed)
+        manual_seed(seed, self.device)
```

And add the import:
```diff
 from .base import (
     is_model_cached,
     get_torch_device,
     empty_device_cache,
+    manual_seed,
     combine_voice_prompts as _combine_voice_prompts,
     model_load_progress,
 )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/luxtts_backend.py` around lines 168 - 171, Replace the direct torch seeding calls in the generation path with the shared helper so XPU devices are handled consistently: instead of torch.manual_seed(seed) / torch.cuda.manual_seed(seed) use manual_seed(seed, self.device) (the same helper used elsewhere in this PR), and ensure the manual_seed symbol is imported at the top of the module; update the block in LuxttsBackend (the seed handling near generation) to call manual_seed(seed, self.device) when seed is not None.

backend/backends/chatterbox_turbo_backend.py (1)
181-182: ⚠️ Potential issue | 🟡 Minor
Missing XPU-aware seeding in generation (same as chatterbox_backend.py).
For consistency and XPU reproducibility, use `manual_seed(seed, self._device)` instead of `torch.manual_seed(seed)`.
🔧 Proposed fix
```diff
 def _generate_sync():
     import torch
     if seed is not None:
-        torch.manual_seed(seed)
+        manual_seed(seed, self._device)
```

And add the import:
```diff
 from .base import (
     is_model_cached,
     get_torch_device,
     empty_device_cache,
+    manual_seed,
     combine_voice_prompts as _combine_voice_prompts,
     model_load_progress,
     patch_chatterbox_f32,
 )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/chatterbox_turbo_backend.py` around lines 181 - 182, Replace the call to torch.manual_seed(seed) with a device-aware seeding helper by calling manual_seed(seed, self._device) (use the existing self._device in the class/method where the seed is set), and add the required import for manual_seed at the top of the module; locate the occurrence around the code that currently calls torch.manual_seed(seed) and update it to manual_seed(seed, self._device).
🧹 Nitpick comments (2)
backend/backends/base.py (2)
129-141: Consider adding `torch.xpu.is_available()` guard for XPU cache clearing.
The CUDA branch checks `torch.cuda.is_available()`, but the XPU branch only checks `hasattr(torch, "xpu")`. For consistency and robustness, consider also verifying `torch.xpu.is_available()` before calling `empty_cache()`.
♻️ Suggested improvement
```diff
 def empty_device_cache(device: str) -> None:
     ...
     import torch
     if device == "cuda" and torch.cuda.is_available():
         torch.cuda.empty_cache()
-    elif device == "xpu" and hasattr(torch, "xpu"):
+    elif device == "xpu" and hasattr(torch, "xpu") and torch.xpu.is_available():
         torch.xpu.empty_cache()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/base.py` around lines 129 - 141, The XPU branch in empty_device_cache currently only checks hasattr(torch, "xpu") before calling torch.xpu.empty_cache(); update it to also verify torch.xpu.is_available() (similar to the CUDA branch's torch.cuda.is_available()) and only call torch.xpu.empty_cache() when both the attribute exists and is_available() returns True to avoid calling into unavailable XPU APIs.
144-157: Same suggestion for `manual_seed` XPU branch.
For symmetry with the CUDA check, add a `torch.xpu.is_available()` guard.
♻️ Suggested improvement
```diff
 def manual_seed(seed: int, device: str) -> None:
     ...
     torch.manual_seed(seed)
     if device == "cuda" and torch.cuda.is_available():
         torch.cuda.manual_seed(seed)
-    elif device == "xpu" and hasattr(torch, "xpu"):
+    elif device == "xpu" and hasattr(torch, "xpu") and torch.xpu.is_available():
         torch.xpu.manual_seed(seed)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/base.py` around lines 144 - 157, The XPU branch in manual_seed lacks an availability check like the CUDA branch, so update manual_seed to guard the XPU seed call by checking torch.xpu.is_available() (in addition to hasattr(torch, "xpu")) before calling torch.xpu.manual_seed(seed); locate the manual_seed function in backend/backends/base.py and modify the XPU branch to mirror the CUDA guard to avoid calling xpu APIs when no XPU is present.
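Taken together, the two nitpicks suggest base.py helpers shaped like the sketch below. The `torch_mod` parameter is not part of the project's code; it is a test seam added here purely so the guard logic can be exercised with a stub on a machine that has neither CUDA nor XPU:

```python
from types import SimpleNamespace


def empty_device_cache(device: str, torch_mod=None) -> None:
    """Clear the accelerator's memory cache after a model unload.

    `torch_mod` is a test seam; in real use it defaults to the torch module.
    """
    if torch_mod is None:
        import torch as torch_mod  # lazy import keeps module load cheap
    if device == "cuda" and torch_mod.cuda.is_available():
        torch_mod.cuda.empty_cache()
    elif device == "xpu" and hasattr(torch_mod, "xpu") and torch_mod.xpu.is_available():
        torch_mod.xpu.empty_cache()


def manual_seed(seed: int, device: str, torch_mod=None) -> None:
    """Seed the CPU RNG, plus the CUDA or XPU RNG when one is present."""
    if torch_mod is None:
        import torch as torch_mod
    torch_mod.manual_seed(seed)
    if device == "cuda" and torch_mod.cuda.is_available():
        torch_mod.cuda.manual_seed(seed)
    elif device == "xpu" and hasattr(torch_mod, "xpu") and torch_mod.xpu.is_available():
        torch_mod.xpu.manual_seed(seed)


# Exercise the guards with a stub that has CUDA unavailable and no xpu attribute.
calls = []
stub = SimpleNamespace(
    manual_seed=lambda s: calls.append(("cpu", s)),
    cuda=SimpleNamespace(
        is_available=lambda: False,
        manual_seed=lambda s: calls.append(("cuda", s)),
        empty_cache=lambda: calls.append(("cuda", "empty")),
    ),
)
manual_seed(1234, "xpu", torch_mod=stub)   # no xpu attr -> only the CPU is seeded
empty_device_cache("xpu", torch_mod=stub)  # silently a no-op
print(calls)
```

With a stub whose `cuda.is_available()` returns False and which lacks an `xpu` attribute entirely, only the CPU RNG is touched and the cache call is a no-op, which is exactly the fail-safe behavior the suggested `is_available()` guards are meant to guarantee.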
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@backend/backends/chatterbox_backend.py`:
- Around line 200-201: The generate() method currently calls
torch.manual_seed(seed) which only seeds the CPU; update it to use the shared
helper manual_seed(seed, device) used in pytorch_backend.py and hume_backend.py
so XPU devices are seeded correctly for reproducibility. Import the manual_seed
helper at top of chatterbox_backend.py and replace torch.manual_seed(seed)
inside generate() with manual_seed(seed, device) where device is the
model/device used for generation (ensure you reference the same device
variable/name used in generate()).
In `@backend/backends/chatterbox_turbo_backend.py`:
- Around line 181-182: Replace the call to torch.manual_seed(seed) with a
device-aware seeding helper by calling manual_seed(seed, self._device) (use the
existing self._device in the class/method where the seed is set), and add the
required import for manual_seed at the top of the module; locate the occurrence
around the code that currently calls torch.manual_seed(seed) and update it to
manual_seed(seed, self._device).
In `@backend/backends/luxtts_backend.py`:
- Around line 168-171: Replace the direct torch seeding calls in the generation
path with the shared helper so XPU devices are handled consistently: instead of
torch.manual_seed(seed) / torch.cuda.manual_seed(seed) use manual_seed(seed,
self.device) (the same helper used elsewhere in this PR), and ensure the
manual_seed symbol is imported at the top of the module; update the block in
LuxttsBackend (the seed handling near generation) to call manual_seed(seed,
self.device) when seed is not None.
---
Nitpick comments:
In `@backend/backends/base.py`:
- Around line 129-141: The XPU branch in empty_device_cache currently only
checks hasattr(torch, "xpu") before calling torch.xpu.empty_cache(); update it
to also verify torch.xpu.is_available() (similar to the CUDA branch's
torch.cuda.is_available()) and only call torch.xpu.empty_cache() when both the
attribute exists and is_available() returns True to avoid calling into
unavailable XPU APIs.
- Around line 144-157: The XPU branch in manual_seed lacks an availability check
like the CUDA branch, so update manual_seed to guard the XPU seed call by
checking torch.xpu.is_available() (in addition to hasattr(torch, "xpu")) before
calling torch.xpu.manual_seed(seed); locate the manual_seed function in
backend/backends/base.py and modify the XPU branch to mirror the CUDA guard to
avoid calling xpu APIs when no XPU is present.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 70851706-827f-4180-ab03-715d6ed5a660
📒 Files selected for processing (7)
- backend/backends/base.py
- backend/backends/chatterbox_backend.py
- backend/backends/chatterbox_turbo_backend.py
- backend/backends/hume_backend.py
- backend/backends/luxtts_backend.py
- backend/backends/pytorch_backend.py
- justfile
I checked it out, but unfortunately still no GPU acceleration. One thing I noted: in my system Python installation, the Intel PyTorch extension was installed this way: So the extension comes from Intel's servers. I don't know if this makes a difference. Okay, the compile log is attached; just tell me if I can help in any way.
Thanks for testing this and providing the feedback! We're looking into the Intel Arc acceleration issue now. The compile log you attached will be helpful for debugging. Will update you once we identify what's causing the GPU acceleration to not work despite the code changes. |
…porting, and setup detection

Address CodeRabbit review feedback and user-reported GPU acceleration failure:

- Use shared manual_seed() in chatterbox, chatterbox_turbo, and luxtts backends so XPU (and future accelerators) get proper device seeding
- Add XPU branch to _get_gpu_status() so the startup log reports Intel Arc GPUs instead of 'None (CPU only)'
- Add XPU VRAM reporting and correct backend_variant fallback in the /health endpoint
- Switch justfile GPU detection from Get-WmiObject to Get-CimInstance, simplify the Arc regex to match 'Arc' (not 'Intel.*Arc'), log detected GPUs, and print manual install instructions on a miss

Resolves the root cause where IPEX was silently not installed due to WMI detection failure, causing CPU-only fallback on Intel Arc systems.
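The regex change in this commit is easy to sanity-check in isolation. The adapter names below are hypothetical examples of what `Win32_VideoController` might report; exact strings vary by driver version:

```python
import re

# Hypothetical adapter names; real WMI output differs per driver version.
names = [
    "Intel(R) Arc(TM) A770 Graphics",
    "Intel(R) UHD Graphics 770",
    "Intel(R) Iris(R) Xe Graphics",
    "NVIDIA GeForce RTX 4090",
]

def is_arc(name: str) -> bool:
    # Simplified check from the fix: any 'Arc' substring qualifies.
    return re.search(r"Arc", name) is not None

print([n for n in names if is_arc(n)])
```

Matching bare `Arc` still leaves integrated UHD/Iris parts out, while being less sensitive to whatever text sits between the vendor and product names than a stricter pattern.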
🧹 Nitpick comments (1)
backend/routes/health.py (1)
113-117: Consider using `contextlib.suppress` for cleaner exception handling.
The try-except-pass pattern works correctly here for handling IPEX version variations, but can be simplified.
♻️ Suggested refactor
```diff
+import contextlib
+
 elif has_xpu:
-    try:
-        vram_used = torch.xpu.memory_allocated() / 1024 / 1024
-    except Exception:
-        pass  # memory_allocated() may not be available on all IPEX versions
+    with contextlib.suppress(Exception):
+        vram_used = torch.xpu.memory_allocated() / 1024 / 1024
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/routes/health.py` around lines 113 - 117, Replace the try/except/pass around the XPU VRAM query with contextlib.suppress to make the exception handling clearer: import contextlib at top, then in the branch that checks has_xpu (the block calling torch.xpu.memory_allocated()) wrap the memory query in a with contextlib.suppress(Exception): block and assign vram_used inside it (keeping the same division by 1024 twice). Ensure vram_used remains defined/unchanged if the call is suppressed.
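For reference, `contextlib.suppress` behaves exactly like the try/except/pass it replaces. This can be shown with a stand-in object (hypothetical, mimicking an IPEX build that raises when `memory_allocated()` is queried):

```python
import contextlib

vram_used = 0.0

class FakeXpu:
    # Stand-in for torch.xpu on an IPEX build lacking memory stats.
    def memory_allocated(self):
        raise AttributeError("memory_allocated not available in this IPEX version")

xpu = FakeXpu()

# Equivalent to try/except/pass, but the intent is explicit:
with contextlib.suppress(Exception):
    vram_used = xpu.memory_allocated() / 1024 / 1024

print(vram_used)  # stays 0.0 because the call raised before assignment
```

Because the exception is raised before the assignment, `vram_used` keeps its prior value, matching the reviewer's note that it must remain defined when the call is suppressed.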
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@backend/routes/health.py`:
- Around line 113-117: Replace the try/except/pass around the XPU VRAM query
with contextlib.suppress to make the exception handling clearer: import
contextlib at top, then in the branch that checks has_xpu (the block calling
torch.xpu.memory_allocated()) wrap the memory query in a with
contextlib.suppress(Exception): block and assign vram_used inside it (keeping
the same division by 1024 twice). Ensure vram_used remains defined/unchanged if
the call is suppressed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d1b75457-e665-4584-963f-e1e4dc1163f8
📒 Files selected for processing (6)
- backend/app.py
- backend/backends/chatterbox_backend.py
- backend/backends/chatterbox_turbo_backend.py
- backend/backends/luxtts_backend.py
- backend/routes/health.py
- justfile
🚧 Files skipped from review as they are similar to previous changes (2)
- justfile
- backend/backends/chatterbox_turbo_backend.py
Summary
Adds Intel Arc GPU (XPU) support so that users with Intel Arc A-series GPUs (e.g. A770, A750) can use hardware acceleration instead of falling back to CPU.
Prompted by user feedback: an Intel Arc A770 16GB owner installed `intel-extension-for-pytorch` system-wide, but voicebox's venv didn't include it, and most backends didn't check for XPU anyway.

Changes
- `justfile` — Windows `setup-python` now auto-detects Intel Arc GPUs (via `Win32_VideoController`) and installs `torch` from the XPU wheel index + `intel-extension-for-pytorch`, mirroring the existing NVIDIA/CUDA auto-detection
- `backend/backends/base.py` — Added two shared helpers:
  - `empty_device_cache(device)` — clears VRAM on CUDA or XPU after model unload
  - `manual_seed(seed, device)` — sets reproducible seeds on CPU + CUDA or XPU
- Passed `allow_xpu=True` to `get_torch_device()` in:
  - `chatterbox_backend.py`
  - `chatterbox_turbo_backend.py`
  - `hume_backend.py` (also enables bf16 on XPU — Intel Arc supports it natively)
  - `luxtts_backend.py`
  - `pytorch_backend.py` (already had `allow_xpu=True`, now uses shared helpers)
- Replaced direct `torch.cuda.empty_cache()` / `torch.cuda.manual_seed()` calls with the shared helpers so XPU gets the same treatment

Device priority
The existing priority in `get_torch_device()` is unchanged: CUDA > XPU > DirectML > MPS > CPU. Users with both NVIDIA and Intel GPUs will still use CUDA.

Testing notes
`setup-python` detection uses an `Intel.*Arc` regex, so Intel integrated GPUs (UHD, Iris) won't trigger the XPU install
New Features
Chores