diff --git a/README.md b/README.md
index 5955753..bc645cb 100644
--- a/README.md
+++ b/README.md
@@ -50,7 +50,7 @@ wget https://ml-site.cdn-apple.com/models/sharp/sharp_2572gikvuh.pt
 To use a manually downloaded checkpoint, specify it with the `-c` flag:
 
 ```
-sharp predict -i /path/to/input/images -o /path/to/output/gaussians -c sharp_2572gikvuh.pt
+sharp predict -i input -o output/gaussians -c sharp_2572gikvuh.pt
 ```
 
 The results will be 3D gaussian splats (3DGS) in the output folder. The 3DGS `.ply` files are compatible to various public 3DGS renderers. We follow the OpenCV coordinate convention (x right, y down, z forward). The 3DGS scene center is roughly at (0, 0, +z). When dealing with 3rdparty renderers, please scale and rotate to re-center the scene accordingly.
@@ -89,6 +89,31 @@ If you find our work useful, please cite the following paper:
 
 Our codebase is built using multiple opensource contributions, please see [ACKNOWLEDGEMENTS](ACKNOWLEDGEMENTS) for more details.
 
+## Spatial Computing & Vision Pro Integration
+
+SHARP's single-image 3D Gaussian output can be integrated into Apple Vision Pro workflows using complementary tools from our [Spatial Computing & 3D Resources](Spatial%20Computing%20%26%203D%20Resources.md) list.
+
+### Asset Pipeline
+
+```
+[Photo] → SHARP → [3DGS .ply] → Convert3D → [USDZ/GLB] → Vision Pro / Safari
+```
+
+### Recommended Tools
+
+| Category          | Tool                                            | Usage                                         |
+| ----------------- | ----------------------------------------------- | --------------------------------------------- |
+| **WebXR Viewer**  | [Google Model Viewer](https://modelviewer.dev/) | Display converted assets in Safari (visionOS) |
+| **Converter**     | [Convert3D](https://convert3d.org/convert/glb)  | `.ply` → `.glb` → `.usdz` conversion          |
+| **AI Generation** | [Zoo Text-to-CAD](https://zoo.dev/text-to-cad)  | Complement SHARP with procedural models       |
+| **Reference**     | [Dimensions.com](https://www.dimensions.com/)   | Real-world dimensions for proper scaling      |
+
+### Notes
+
+- **Video Rendering**: Currently requires CUDA GPU (MPS not supported for `--render`)
+- **Format Conversion**: SHARP outputs `.ply` files; conversion to USDZ is needed for Vision Pro
+- **Coordinate System**: OpenCV convention (x right, y down, z forward) - adjust for third-party renderers
+
 ## License
 
 Please check out the repository [LICENSE](LICENSE) before using the provided code and
diff --git a/Spatial Computing & 3D Resources.md b/Spatial Computing & 3D Resources.md
new file mode 100644
index 0000000..ce04899
--- /dev/null
+++ b/Spatial Computing & 3D Resources.md
@@ -0,0 +1,50 @@
+# Spatial Computing & 3D Resources (Vision Pro Ready)
+
+A curated list of tools, libraries, and platforms for creating 3D assets, implementing WebXR experiences, and automating CAD workflows for Apple Vision Pro and Spatial Computing.
+
+## 🥽 WebXR & Spatial Viewers
+Tools for rendering 3D models directly in the browser (Safari on visionOS) or creating immersive web experiences.
+
+* **Google Model Viewer** - [Documentation](https://modelviewer.dev/) | [Editor](https://modelviewer.dev/editor/) (Essential for AR Quick Look & USDZ/GLB web display)
+* **Vectary** - [3D Design Tool for Web](https://www.vectary.com/3d-design-tool/) (Great for prototyping spatial UIs)
+* **Threekit** - [Visual Configuration Platform](https://www.threekit.com/) (High-fidelity web rendering)
+* **VisionThree** - [Immersive Solutions](https://visionthree.io/)
+* **CAD Exchanger** - [Web Viewer](https://viewer.cadexchanger.com/) (View complex CAD files in browser)
+
+## 🤖 AI-Driven 3D Generation (Rapid Prototyping)
+Generate assets for your spatial environments using AI prompts.
+
+* **Zoo (formerly KittyCAD)** - [Text-to-CAD API](https://zoo.dev/text-to-cad) | [Playground](https://text-to-cad.zoo.dev/) (Generate parametric models via code/API)
+* **Vibe3D** - [AI 3D Generation](https://vibe3d.ai/)
+* **Meta Segment Anything** - [Demo Gallery](https://aidemos.meta.com/segment-anything/gallery/) (Texture/Asset segmentation)
+* **OneClick3D** - [Main Site](https://www.oneclick3d.io/)
+* **3D Agent** - [Main Site](https://3d-agent.com/)
+
+## 🛠️ Programmatic CAD (Code-to-Geometry)
+Create precise, parametric geometry for visionOS apps using code scripts.
+
+* **OpenSCAD** - [Main Site](https://openscad.org/index.html) | [Studio (Web)](https://zacharyfmarion.github.io/openscad-studio/)
+* **CadQuery** - [GitHub Repo](https://github.com/CadQuery/cadquery) (Python-based, powerful for generating industrial assets)
+* **Replicad** - [Docs & Examples](https://replicad.xyz/docs/examples/simple-vase) (JavaScript-based)
+* **Jerm CAD** - [GitHub Repo](https://github.com/jeremyaboyd/jerm-cad)
+* **ImplicitCAD** - [Main Site](https://www.implicitcad.org/)
+
+## 📐 Base Modeling & CAD Tools
+Foundational tools for creating high-quality 3D assets.
+
+* **FreeCAD** - [Main Site](https://www.freecad.org/?lang=de) | [Tutorials](https://alsado.de/freecad-tutorials)
+* **SolidWorks** - [Product Page](https://www.solidworks.com/product/solidworks-3d-cad)
+* **SolveSpace** - [Parametric 2D/3D CAD](https://solvespace.com/index.pl)
+* **QCAD** - [2D CAD](https://www.qcad.org/en/)
+
+## 📂 Converters & Utilities (USDZ/GLB)
+Essential utilities for converting assets to Apple's preferred formats.
+
+* **Convert3D** - [GLB Converter](https://convert3d.org/convert/glb)
+* **GLB to PNG** - [Converter](https://www.glb2png.com/#demo)
+* **Dimensions.com** - [Reference Database](https://www.dimensions.com/) (Real-world object dimensions for scaling)
+
+## 📦 Open Source Repos
+* **Nova** - [GitHub](https://github.com/agg111/nova/tree/main)
+* **Chili3D** - [GitHub](https://github.com/xiangechen/chili3d)
+* **PyOpticL** - [GitHub](https://github.com/UMassIonTrappers/PyOpticL)
\ No newline at end of file
diff --git a/src/sharp/utils/gaussians.py b/src/sharp/utils/gaussians.py
index ed73de8..616e925 100644
--- a/src/sharp/utils/gaussians.py
+++ b/src/sharp/utils/gaussians.py
@@ -348,6 +348,14 @@ def save_ply(
     gaussians: Gaussians3D, f_px: float, image_shape: tuple[int, int], path: Path
 ) -> PlyData:
     """Save a predicted Gaussian3D to a ply file."""
+    #Rotate model 180 degrees
+    LOGGER.info("Applying automatic 180-degree rotation fix to bring model to front.")
+    transform_fix = torch.tensor([
+        [-1.0, 0.0,  0.0, 0.0],
+        [ 0.0, 1.0,  0.0, 0.0],
+        [ 0.0, 0.0, -1.0, 0.0]
+    ], device=gaussians.mean_vectors.device)
+    gaussians = apply_transform(gaussians, transform_fix)
 
     def _inverse_sigmoid(tensor: torch.Tensor) -> torch.Tensor:
         return torch.log(tensor / (1.0 - tensor))
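
Review note on the `transform_fix` matrix in the `save_ply` hunk: the sketch below is a dependency-free check of what that 3x4 matrix does to individual points. It assumes the matrix is read as `[R | t]` (a 3x3 linear part and a translation column). `apply_affine`, `det3`, and `rotation_about_y` are hypothetical helpers written for this note; they are not SHARP's `apply_transform`, which may also need to update Gaussian orientations and covariances.

```python
import math

# The 3x4 matrix from the patch, read as [R | t] with a zero translation column.
TRANSFORM_FIX = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
]


def apply_affine(matrix, point):
    """Apply a 3x4 affine matrix to one (x, y, z) point (hypothetical helper)."""
    x, y, z = point
    return tuple(row[0] * x + row[1] * y + row[2] * z + row[3] for row in matrix)


def det3(matrix):
    """Determinant of the 3x3 linear part (the first three columns)."""
    (a, b, c, _), (d, e, f, _), (g, h, i, _) = matrix
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def rotation_about_y(angle_rad):
    """3x4 matrix for a rotation about the y-axis with zero translation."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s, 0.0], [0.0, 1.0, 0.0, 0.0], [-s, 0.0, c, 0.0]]


if __name__ == "__main__":
    # A point in front of the camera (+z) ends up behind it (-z).
    print(apply_affine(TRANSFORM_FIX, (0.0, 0.0, 2.0)))
    # Determinant +1 means a proper rotation, not a mirror flip.
    print(det3(TRANSFORM_FIX))
```

Under these assumptions the matrix matches a rotation of π about the y-axis: x and z change sign and y is untouched, so y still points down as in the OpenCV convention. The README states that the scene center is roughly at (0, 0, +z); after this transform it would sit at roughly (0, 0, -z). Because the rotation runs unconditionally inside `save_ply`, every saved `.ply` would carry it.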