
@s-bhatia1216

APPLE: Enabling Shap-E to run on Apple Silicon GPUs via Metal Performance Shaders (MPS) Acceleration

Apple-Silicon (MPS) Support for Shap-E

Pull-Request

Status: Proposed – awaiting review
Branch: mps-support
Author: Sonal Bhatia, AI @ System Performance Architecture (HWE) at Apple Inc.

Target: main


What this PR delivers:

| Area | Change |
| --- | --- |
| Core Diffusion Code | Adds a safe gather path in shap_e/diffusion/gaussian_diffusion.py that runs the indexing operation on CPU and then moves the tensor to MPS. |
| Example Notebooks | Automatically select the best device (mps → cuda → cpu) instead of CUDA-only. |
| Performance | For example, on a Mac mini (M4 Pro), default image-to-3D generation time drops from 4 hours to just under 4 minutes when switching from CPU to GPU via MPS. |

Motivation

Shap-E falls back to CPU on Apple M-series machines because certain indexing ops are not yet supported by PyTorch-MPS. This PR removes that blocker, giving native-GPU performance on macOS without sacrificing CUDA/CPU compatibility.
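As a rough illustration of the CPU gather idea described above, the following sketch indexes a per-timestep schedule on CPU and only then moves the result to the target device. It is not the PR's actual diff: the function name `extract_into_tensor_cpu_gather` and the argument names `arr`, `timesteps`, and `broadcast_shape` are assumptions chosen for illustration.

```python
import numpy as np
import torch


def extract_into_tensor_cpu_gather(arr, timesteps, broadcast_shape):
    """Hypothetical sketch: gather schedule values on CPU, then move to the device.

    arr: 1-D numpy array of per-timestep values (assumed layout).
    timesteps: integer tensor of indices, possibly living on an MPS device.
    broadcast_shape: target shape whose leading dimension matches timesteps.
    """
    device = timesteps.device
    # Do the indexing on CPU so no advanced-indexing kernel runs on MPS.
    values = torch.from_numpy(arr)[timesteps.cpu().long()]
    # Cast to float32 before the transfer, then copy to the original device.
    values = values.to(torch.float32).to(device)
    # Append trailing singleton dimensions so the values broadcast cleanly.
    while values.dim() < len(broadcast_shape):
        values = values[..., None]
    return values + torch.zeros(broadcast_shape, device=device)


schedule = np.linspace(0.0, 1.0, 10)
steps = torch.tensor([0, 9, 5])
out = extract_into_tensor_cpu_gather(schedule, steps, (3, 2, 4))
```

Because the device is taken from `timesteps`, the same code path works unchanged when the tensor lives on CUDA or CPU.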


Running an example notebook

jupyter lab examples/sample_text_to_3d.ipynb
# `device` cell should resolve to `mps` on Apple-Silicon,
# CUDA on NVIDIA systems, CPU otherwise.
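A minimal sketch of what such a `device` cell might look like, assuming the mps → cuda → cpu priority stated earlier. The helper names `resolve_device_name` and `pick_device` are hypothetical and are not taken from the notebooks.

```python
def resolve_device_name(mps_available, cuda_available):
    """Return the preferred backend name using the mps -> cuda -> cpu order."""
    if mps_available:
        return "mps"
    if cuda_available:
        return "cuda"
    return "cpu"


def pick_device():
    """Hypothetical helper: probe PyTorch and build the matching torch.device."""
    import torch  # imported here so the priority logic above has no dependencies

    # torch.backends.mps is missing on older PyTorch builds, so guard the lookup.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = mps_backend is not None and mps_backend.is_available()
    return torch.device(resolve_device_name(mps_ok, torch.cuda.is_available()))
```

In a notebook, `device = pick_device()` would then replace a hard-coded CUDA device.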

Validation checklist

  1. Notebook test
    Execute sample_text_to_3d.ipynb and sample_image_to_3d.ipynb end-to-end; generation succeeds and shows the expected geometry.

  2. Performance
    Using Python's 'time' module, measure Shap-E generation time on an M-series chip with the GPU (MPS) and with the CPU, and verify that the GPU run is faster.
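One way the performance check could be scripted is sketched below. The helper `time_call` is hypothetical. GPU backends execute work asynchronously, so the sketch accepts an optional `sync` callable to run before reading the clock; on recent PyTorch versions `torch.mps.synchronize` is a plausible choice for MPS, which is an assumption about the reviewer's environment.

```python
import time


def time_call(fn, *args, sync=None, **kwargs):
    """Hypothetical helper: return (result, elapsed_seconds) for one call to fn.

    sync: optional zero-argument callable that blocks until queued device
    work has finished, so the measurement covers the full computation.
    """
    if sync is not None:
        sync()  # drain any work queued before the measurement starts
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    if sync is not None:
        sync()  # wait for the device to finish before stopping the clock
    elapsed = time.perf_counter() - start
    return result, elapsed


# Stand-in workload; in practice fn would wrap the notebook's sampling call.
total, seconds = time_call(sum, range(1000))
```

Running the same wrapped call once with the device set to `cpu` and once with `mps` gives the two numbers to compare.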


Commit message (for squash)

(mps-support): add PyTorch-MPS support & refresh notebooks

* _extract_into_tensor(): CPU gather → MPS tensor to bypass unsupported
  advanced indexing.
* Notebooks auto-select mps / cuda / cpu.
* Verified on M4 Pro & M4 Max (macOS 15.4).

Reviewer notes

  • CPU→MPS copy adds < 1 ms per call – negligible relative to diffusion loop.
  • Force-push friendly – rebase before merging if your local main has diverged.

License

This contribution is released under Shap-E's original MIT License.
