
feat: split CUDA backend into independently versioned server + libs archives#298

Merged
jamiepine merged 8 commits into main from feat/cuda-libs-addon on Mar 17, 2026

Conversation

jamiepine (Owner) commented Mar 17, 2026

Summary

Splits the CUDA backend from a single ~2.4GB --onefile binary into two independently versioned archives, eliminating full redownloads on every app version bump.

  • Switch CUDA PyInstaller build from --onefile to --onedir, then package into two .tar.gz archives: server core (~200-400MB, versioned with app) and CUDA libs (~2GB, versioned independently as cu126-v1)
  • Rewrite cuda.py download service for dual-archive extraction with independent version checks — only downloads what actually changed
  • Update main.rs to launch CUDA backend from backends/cuda/ directory with .current_dir() so PyInstaller finds NVIDIA DLLs relative to the exe

Download impact

| Scenario | Before | After |
| --- | --- | --- |
| First CUDA setup | ~2.4 GB | ~2.4 GB (same) |
| App version bump | ~2.4 GB full redownload | ~200-400 MB (server core only) |
| CUDA toolkit bump | ~2.4 GB full redownload | ~2 GB (libs only) |

Files changed

  • backend/build_binary.py — CUDA builds use --onedir instead of --onefile
  • scripts/package_cuda.py — New script to split onedir output into server-core + cuda-libs archives with cuda-libs.json manifest
  • backend/services/cuda.py — Rewritten for dual-archive download, dual version checking, tar.gz extraction
  • tauri/src-tauri/src/main.rs — CUDA binary detection at backends/cuda/, .current_dir() on spawn
  • .github/workflows/release.yml — CI produces two archives instead of split binary parts
  • scripts/split_binary.py — Deprecated (replaced by package_cuda.py)

Closes #297

Summary by CodeRabbit

  • New Features

    • CUDA backend now ships as two archives (server core + CUDA libs) with independent versioning (cu126-v1), selective updates/caching, and Unix executable permissions ensured after extraction.
  • Chores

    • Release workflow now packages and uploads dual-archive assets with checksums and a libs manifest; legacy large-binary splitting removed; build artifacts ignored.
  • Documentation

    • Expanded developer guidance on frozen-build edge cases, shim patterns, and CUDA packaging trade-offs.

Enrich tts-engines.mdx with patterns discovered during TADA integration:
- Phase 0.2: new greps for @torch.jit.script, torchaudio.load, gated repos
- Phase 3.4: model naming inconsistency warning
- Phase 5.2: TADA shim failure added to lessons table
- Phase 6: four new workaround sections (gated repos, torchcodec,
  torch.jit.script, toxic dependency shim pattern)
- Checklist: four new items matching the new scan patterns
- Remove TADA from upcoming engines (now shipped)

Add CUDA_LIBS_ADDON.md exploring --onedir split to avoid 2.4GB
redownloads on every version bump.
…rchives

Switch CUDA builds from PyInstaller --onefile to --onedir and split the
output into two separately versioned archives:

1. Server core (~200-400MB) — versioned with the app, redownloaded on
   every app update
2. CUDA libs (~2GB) — versioned independently (cu126-v1), only
   redownloaded when the CUDA toolkit or torch version changes

This eliminates the ~2.4GB full redownload on every version bump.
After initial setup, most app updates only need ~200-400MB.

Closes #297
coderabbitai bot (Contributor) commented Mar 17, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Converts the CUDA build from a PyInstaller --onefile binary to an --onedir build split into two archives (server core + CUDA libs), updates CI packaging and release uploads, adds packaging and extraction tooling, revamps backend download/extract/manifest logic, and adapts runtime spawn/CWD and build tooling.

Changes

  • CI/CD Workflow (.github/workflows/release.yml): Switch the CUDA build to --onedir, replace scripts/split_binary.py with scripts/package_cuda.py, upload two tar.gz archives plus .sha256 files and cuda-libs.json, and update artifact paths and the release action version.
  • Build System (backend/build_binary.py): Make the PyInstaller pack mode dynamic: --onedir for CUDA builds, --onefile for CPU builds.
  • CUDA Backend Service (backend/services/cuda.py): Rework to a two-archive flow: add CUDA_LIBS_VERSION, helpers for the cuda dir/exe/manifest, per-component needs* checks, generalized async download+extract with optional SHA-256 verification, persist cuda-libs.json, expose the installed libs version, set exec perms, and delete the whole cuda dir on removal.
  • Packaging Tooling (scripts/package_cuda.py): New script that splits a PyInstaller --onedir into voicebox-server-cuda.tar.gz (core) and cuda-libs-<version>.tar.gz (NVIDIA libs), writes .sha256 files and cuda-libs.json; adds is_nvidia_file, sha256_file, package, and main.
  • Removed Tool (scripts/split_binary.py): Remove the old chunk-splitting utility that produced 1.9GB parts, a manifest, and checksum files.
  • Tauri Integration (tauri/src-tauri/src/main.rs): Treat the CUDA backend as a onedir: existence/version checks run in the onedir (cwd), the exe path is returned from within the onedir, and spawn sets the working directory to the onedir.
  • Docs & Planning (docs/content/docs/developer/tts-engines.mdx, docs/plans/CUDA_LIBS_ADDON.md, docs/.../building.mdx): Add a design doc and developer guidance for the two-archive approach, PyInstaller/frozen-build edge cases, shims, and gated repos; update build instructions targeting cu126.
  • Packaging / Justfile (justfile): The Windows CUDA build step now copies the entire onedir into backends/cuda (remove + copy dir) instead of copying a single exe.
  • Repo housekeeping & versions (.gitignore, package.json variants, .bumpversion.cfg, backend/__init__.py): Add PyInstaller/Windows ignore patterns; bump package/module versions from 0.3.0 → 0.3.1 across manifests and configs.

Sequence Diagram(s)

sequenceDiagram
    participant App as App Client
    participant Service as CUDA Service
    participant Downloader as HTTP Downloader
    participant Extractor as Archive Extractor
    participant FS as File System

    App->>Service: check_and_update_cuda_binary()

    rect rgba(100,150,200,0.5)
    Note over Service: Server component check
    Service->>FS: get_cuda_binary_path() / get installed server version (cwd=onedir)
    Service->>Service: _needs_server_download()?
    end

    alt Server needs update
        Service->>Downloader: download server archive + .sha256
        Downloader->>Downloader: verify SHA-256
        Downloader->>Extractor: extract server archive -> cuda_dir
        Extractor->>FS: write server files
    end

    rect rgba(150,100,200,0.5)
    Note over Service: CUDA libs component check
    Service->>FS: read cuda-libs.json / get_installed_cuda_libs_version()
    Service->>Service: _needs_cuda_libs_download()?
    end

    alt Libs need update
        Service->>Downloader: download cuda-libs archive + .sha256
        Downloader->>Downloader: verify SHA-256
        Downloader->>Extractor: extract libs archive -> cuda_dir
        Extractor->>FS: write NVIDIA libs
    end

    rect rgba(200,150,100,0.5)
    Note over Service: Finalize install
    Service->>FS: set executable permissions (Unix)
    Service->>FS: write/update cuda-libs.json manifest
    end

    Service->>App: return updated status (includes cuda_libs_version)
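The sequence diagram collapses to a small control flow. A synchronous sketch with an injected installer callable (hypothetical names; the real service is async and reports download progress):

```python
from typing import Callable


def check_and_update_cuda_backend(
    needs_server: bool,
    needs_libs: bool,
    install_component: Callable[[str], None],  # downloads, verifies, extracts
) -> dict:
    """Each component is checked and installed independently, so an app
    version bump never forces the ~2GB CUDA-libs download."""
    installed = []
    if needs_server:
        install_component("server-core")
        installed.append("server-core")
    if needs_libs:
        install_component("cuda-libs")
        installed.append("cuda-libs")
    return {"installed": installed}
```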

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 I hopped through folders, split the bulky load,

Packaged core and libs down a lighter road.
Two bundles now, each finds its place,
Fewer hops, quicker pace. ✨

🚥 Pre-merge checks | ✅ 5 passed

  • Description Check: ✅ Passed. Check skipped: CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The PR title clearly and concisely describes the main change: splitting the CUDA backend into independently versioned server and libs archives, which aligns with the primary objective of the changeset.
  • Linked Issues Check: ✅ Passed. Code changes fully address the issue #297 objectives: split CUDA packaging into server-core and libs archives with independent versioning, dual-archive download logic, manifest-based version tracking, and reduced post-install bandwidth from ~2.4GB to ~200-400MB.
  • Out of Scope Changes Check: ✅ Passed. All changes align with the stated objective of splitting the CUDA backend architecture. Version bumps across package.json and version files are standard release housekeeping and within scope for a version-bundled feature release.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 86.36%, above the required 80.00% threshold.


coderabbitai bot (Contributor) left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tauri/src-tauri/src/main.rs (1)

209-242: ⚠️ Potential issue | 🟠 Major

Validate the full CUDA layout before preferring the downloaded backend.

--version only proves the executable can start from cuda_dir; it does not prove the extracted nvidia/ tree or top-level CUDA DLLs are still present. With the new split packaging, a half-extracted install can pass this check and then fail during startup or first inference. Gate Some(exe_path) on an explicit structure check for both the core archive and the CUDA-libs payload.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tauri/src-tauri/src/main.rs` around lines 209 - 242, The version check
currently only verifies the executable starts from cuda_dir but can miss a
half-extracted payload; before returning Some(exe_path) inside the version_ok
branch, add an explicit layout validation that checks for the presence of the
expected extracted files (e.g. the nvidia/ directory tree and the top-level CUDA
DLLs or shared libraries and the core archive + cuda-libs payload) under
cuda_dir. Implement a helper like validate_cuda_layout(cuda_dir) (called right
before Some(exe_path)) that returns false if any required files or directories
are missing, and only return Some(exe_path) when both version_ok and
validate_cuda_layout(cuda_dir) are true. Ensure the check uses the same cuda_dir
and exe_path context so startup/inference won’t fail due to a half-extracted
install.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/release.yml:
- Around line 217-219: The workflow step named "Upload archives to GitHub
Release" currently uses softprops/action-gh-release@v1 which depends on Node 16
and will fail on current runners; update the action reference to
softprops/action-gh-release@v2 (replace softprops/action-gh-release@v1 with
softprops/action-gh-release@v2 in that step) so the step runs on the Node
20-based runtime used by v2.

In `@backend/services/cuda.py`:
- Around line 208-223: The extraction currently unpacks archives directly into
dest_dir (in backend/services/cuda.py) which can leave stale files; instead
extract into a staging directory (e.g., create a unique temp staging dir in the
same parent as dest_dir using tempfile.mkdtemp), update progress using
PROGRESS_KEY and progress_offset while extracting to that staging dir, verify
the extract completed successfully, then atomically swap the staging dir into
place (remove or rename the old dest_dir and os.replace or shutil.move the
staging dir to dest_dir) and only then unlink temp_path and log success; ensure
cleanup on failure (remove staging dir) so partial extracts never overlay the
live CUDA directory.
- Around line 159-168: The current code silently skips checksum verification
when fetching/parsing the .sha256 fails (sha256_url, sha_resp, expected_sha) and
proceeds to extract an unverified archive; change this to fail fast: in the
try/except around client.get(sha256_url) (and parsing the response into
expected_sha) propagate the error and abort the install instead of logging a
warning—e.g., on any non-200 status or exception from client.get or from parsing
sha_resp.text, log an error via logger.error including label and the underlying
exception/message and raise an exception (or return a failure sentinel) so
extraction never proceeds without a verified expected_sha. Ensure all callers of
the function that rely on expected_sha (the code paths that extract the archive)
handle the raised exception/false return.

In `@scripts/package_cuda.py`:
- Around line 102-110: When nvidia_files is empty in scripts/package_cuda.py,
stop the packaging process instead of just printing warnings: replace the
current warning-only branch that prints to sys.stderr with a fail-fast exit
(e.g., call sys.exit(1) or raise an exception) so the script aborts and no
cuda-libs-*.tar.gz is created/uploaded; ensure this change is applied where
nvidia_files is checked so the subsequent archive creation logic is not
executed.
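The fail-fast checksum handling suggested for backend/services/cuda.py can be isolated into two small helpers. A sketch (function names are hypothetical; the service's actual networking uses httpx and is omitted here):

```python
import hashlib


def parse_expected_sha(sha_text: str, label: str = "archive") -> str:
    """Fail fast on a missing or malformed .sha256 body ('<hex>  <filename>')."""
    parts = sha_text.split()
    if not parts:
        raise ValueError(f"{label}: empty .sha256 response")
    expected = parts[0].lower()
    if len(expected) != 64 or any(c not in "0123456789abcdef" for c in expected):
        raise ValueError(f"{label}: malformed digest {expected!r}")
    return expected


def verify_sha256(data: bytes, expected: str, label: str = "archive") -> None:
    """Raise instead of warn, so extraction never proceeds unverified."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected:
        raise ValueError(f"{label}: checksum mismatch (got {actual}, want {expected})")
```

Any exception from fetching or parsing the `.sha256` then propagates to the caller and aborts the install, rather than degrading to a warning.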


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3a3deb32-ad11-4644-b2f6-93f7ad574c26

📥 Commits

Reviewing files that changed from the base of the PR and between e789c93 and 564d787.

📒 Files selected for processing (8)
  • .github/workflows/release.yml
  • backend/build_binary.py
  • backend/services/cuda.py
  • docs/content/docs/developer/tts-engines.mdx
  • docs/plans/CUDA_LIBS_ADDON.md
  • scripts/package_cuda.py
  • scripts/split_binary.py
  • tauri/src-tauri/src/main.rs

Comment on lines +208 to +223
# Extract (use data filter for path traversal protection on Python 3.12+)
progress.update_progress(
PROGRESS_KEY,
current=progress_offset + downloaded,
total=total_size,
filename=f"Extracting {label}...",
status="downloading",
)
with tarfile.open(temp_path, "r:gz") as tar:
if sys.version_info >= (3, 12):
tar.extractall(path=dest_dir, filter="data")
else:
tar.extractall(path=dest_dir)

temp_path.unlink()
logger.info(f"{label}: extracted to {dest_dir}")
coderabbitai bot (Contributor) commented:
⚠️ Potential issue | 🟠 Major

Avoid overlay extraction into the live CUDA directory.

This helper unpacks directly into backends/cuda/. When a new PyInstaller build removes or renames files, or extraction fails midway, stale files from the previous server/libs version survive and create a mixed onedir tree that can still look “installed”. Stage each archive and only swap or prune the component after a successful extract.
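The staged-extract-then-swap pattern the comment asks for might look like this (a sketch, not the project's code; it mirrors the `filter="data"` version guard from the quoted snippet):

```python
import os
import shutil
import sys
import tarfile
import tempfile
from pathlib import Path


def extract_staged(archive: Path, dest_dir: Path) -> None:
    """Extract into a sibling staging dir, then swap into place atomically."""
    # Stage next to dest_dir so os.replace stays on one filesystem.
    staging = Path(tempfile.mkdtemp(prefix=f"{dest_dir.name}.staging.",
                                    dir=dest_dir.parent))
    try:
        with tarfile.open(archive, "r:gz") as tar:
            if sys.version_info >= (3, 12):
                tar.extractall(path=staging, filter="data")  # path-traversal guard
            else:
                tar.extractall(path=staging)
        # Swap only after a fully successful extract; stale files from the
        # previous version cannot survive, because the old tree is replaced.
        if dest_dir.exists():
            shutil.rmtree(dest_dir)
        os.replace(staging, dest_dir)
    except BaseException:
        shutil.rmtree(staging, ignore_errors=True)  # never leave partial trees
        raise
```

For the real two-component layout, the swap would be per component rather than the whole `backends/cuda/` tree, but the staging discipline is the same.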

🧰 Tools
🪛 Ruff (0.15.6)

[warning] 217-217: Version block is outdated for minimum Python version

Remove outdated version block

(UP036)


…lit_binary.py

- Fix is_nvidia_file() to match NVIDIA DLLs in _internal/torch/lib/
  (PyInstaller 6.18 + torch 2.10 no longer uses nvidia/ subdirectories)
- Remove deprecated split_binary.py (both archives are under 2GB)
- Update torch_compat range to >=2.6.0,<2.11.0
- Update build docs for new dual-archive packaging flow
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
scripts/package_cuda.py (1)

166-176: Manifest fields torch_compat, archive, and sha256 are currently unused.

The manifest includes these fields, but backend/services/cuda.py:get_installed_cuda_libs_version() only reads the "version" field. Consider whether these additional fields should be used for validation (e.g., verifying torch compatibility at runtime, or checking archive integrity during updates).

This isn't a bug—the extra fields provide useful metadata for future enhancements or manual debugging—but documenting their intended use would clarify whether they're placeholders or expected to be consumed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/package_cuda.py` around lines 166 - 176, The manifest written in
scripts/package_cuda.py includes fields "torch_compat", "archive", and "sha256"
that are not consumed by
backend/services/cuda.py:get_installed_cuda_libs_version(); either remove or
document them as metadata or update the consumer to validate using them. Fix by
choosing one: (A) if they are intended as metadata, add a brief comment in
scripts/package_cuda.py describing their purpose and intended future use so
reviewers know they are placeholders; or (B) if they should be used, update
get_installed_cuda_libs_version() to read and validate "torch_compat" against
the runtime torch version and verify "archive"/"sha256" (or at least checksum
presence) during install/update; reference the manifest keys "torch_compat",
"archive", "sha256" and the function get_installed_cuda_libs_version to locate
where to apply the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/package_cuda.py`:
- Around line 53-84: The nvidia/ subtree check in is_nvidia_file incorrectly
tests membership against a list slice; replace the faulty condition ("nvidia/"
in rel_lower.split("/", 1)[-1:]) with a proper substring check such as
("nvidia/" in rel_lower.split("/", 1)[-1]) or simply ("nvidia/" in rel_lower)
so nested nvidia/ directories (e.g. _internal/nvidia/...) are detected; update
the branch in is_nvidia_file that also references NVIDIA_KEEP_IN_CORE and
NVIDIA_DLL_PREFIXES accordingly.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 78fbc69c-fc65-4059-bb50-380478dc6126

📥 Commits

Reviewing files that changed from the base of the PR and between 564d787 and 28e91ce.

📒 Files selected for processing (6)
  • .github/workflows/release.yml
  • .gitignore
  • docs/content/docs/developer/building.mdx
  • docs/plans/CUDA_LIBS_ADDON.md
  • scripts/package_cuda.py
  • scripts/split_binary.py
💤 Files with no reviewable changes (1)
  • scripts/split_binary.py

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@justfile`:
- Around line 211-214: The Copy-Item call is creating an extra top-level folder
so the executable ends up at
.../backends/cuda/voicebox-server-cuda/voicebox-server-cuda.exe, but the
launcher in main.rs expects .../backends/cuda/voicebox-server-cuda.exe; to fix,
ensure $dest is created first (e.g., New-Item -ItemType Directory -Force $dest)
and then copy the contents of backend/dist/voicebox-server-cuda into $dest
(e.g., Copy-Item "backend/dist/voicebox-server-cuda/*" $dest -Recurse -Force or
pipe Get-ChildItem "backend/dist/voicebox-server-cuda" | Copy-Item -Destination
$dest -Recurse -Force) so the binary lands directly at the expected path rather
than nested under an extra directory.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 53d0a6f1-21ff-48f6-923f-cffae2bde36d

📥 Commits

Reviewing files that changed from the base of the PR and between 28e91ce and 7d53699.

📒 Files selected for processing (1)
  • justfile

- Upgrade softprops/action-gh-release@v1 to @v2 (Node 16 EOL)
- Fail-fast on checksum fetch failure instead of extracting unverified archives
- Abort packaging if no NVIDIA files found (prevents empty cuda-libs archive)
- Fix nvidia/ path detection bug (list membership vs substring check)
- Fix justfile Copy-Item nesting (copy contents, not the directory itself)
coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
backend/services/cuda.py (1)

208-223: ⚠️ Potential issue | 🟠 Major

Don't extract directly into the live CUDA tree.

Extracting each archive straight into backends/cuda/ means a server/libs update cannot remove files that disappeared from the previous component version, and a mid-extract failure leaves a mixed onedir tree that still looks installed. Please stage the component update and swap it into place only after a successful extract.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/services/cuda.py` around lines 208 - 223, The code currently extracts
archives directly into dest_dir which can leave a partially-updated tree; change
the flow to extract into a temporary staging directory (e.g.,
dest_dir.with_name(f"{dest_dir.name}.staging.{uuid4()}") or similar) instead of
dest_dir, using the same tarfile.open(temp_path, "r:gz") and the existing
progress.update_progress calls, then after successful extract and cleanup of
temp_path, atomically replace the live directory by moving/staging_dir ->
dest_dir with os.replace/pathlib.Path.replace (or rename on same filesystem) so
the swap is atomic; ensure any existing dest_dir is removed or moved to a backup
before the replace, and clean up the staging directory on failure. Use the
existing symbols temp_path, dest_dir, label, PROGRESS_KEY, and
progress.update_progress to locate and update the logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/services/cuda.py`:
- Around line 153-223: The staged temp file (.download-*.tmp) can be left behind
on any exception after creation; wrap the download/verify/extract sequence (the
async with client.stream(...) block, the integrity check using hashlib.sha256(),
and tarfile.open(...).extractall()) in a try/finally so the finally always
checks temp_path.exists() and unlink()s it before re-raising/returning; remove
the existing temp_path.unlink() at the end and instead rely on the finally to
delete the file on both success and failure (or delete conditionally in finally
if you prefer), referencing temp_path, client.stream, hashlib.sha256, and
tarfile.open to locate the code to change.
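The shape of the requested fix is a plain try/finally around the whole download/verify/extract sequence. A sketch with a hypothetical `work` callable standing in for that sequence:

```python
from pathlib import Path
from typing import Callable


def run_with_temp_cleanup(temp_path: Path, work: Callable[[Path], None]) -> None:
    """Run download/verify/extract (`work`), always deleting the staged
    .download-*.tmp file -- on success *and* on any mid-sequence failure."""
    try:
        work(temp_path)
    finally:
        if temp_path.exists():
            temp_path.unlink()
```

The single unconditional cleanup in `finally` replaces the old success-path-only `temp_path.unlink()`, which is what allowed orphaned temp files to accumulate.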


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b5322fa8-3f34-4169-ad5d-96e7bb20b717

📥 Commits

Reviewing files that changed from the base of the PR and between 7d53699 and f96eae2.

📒 Files selected for processing (4)
  • .github/workflows/release.yml
  • backend/services/cuda.py
  • justfile
  • scripts/package_cuda.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • justfile

…r path

- Wrap download/verify/extract in try/finally so .download-*.tmp is
  always deleted, even on mid-download or extraction failures
- Fix justfile build-server-cuda to use sh.voicebox.app (production path)
@jamiepine jamiepine merged commit 0245c31 into main Mar 17, 2026
1 check passed

Closed by this merge: CUDA backend installation issues - fix in progress (#297)
