
PUGS

This repository provides the code for the paper "PUGS: Zero-shot Physical Understanding with Gaussian Splatting".

  • 🎉 Our paper has been accepted by ICRA 2025 🎉
  • Demo video: pugs.mp4
  • Some qualitative results: qualitative_result.mp4

Prerequisites

Installation

We recommend using conda to install the dependencies.

conda env create -f environment.yml

conda activate pugs

Data Preparation

Following NeRF2Physics, we also use the ABO-500 dataset for testing. You can download the dataset here. The data should be organized as follows:

data
└── abo_500
    ├── scenes
    │   ├── scene0000
    │   │   ├── images
    │   │   │   ├── image0000.jpg
    │   │   │   └── ...
    │   │   └── transforms.json
    │   └── ...
    ├── filtered_product_weights.json
    └── splits.json

If you want to use your own data, organize it in the same way, or in another format that can be parsed by scene/dataset_readers.py. A sketch of the transforms.json convention is given below.
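
For reference, here is a minimal sketch of the NeRF-style transforms.json convention (the field names are assumptions based on the common format; the fields PUGS actually expects are defined by scene/dataset_readers.py):

# Sketch of a minimal NeRF-style transforms.json writer; field names follow
# the common convention and are assumptions, not the repository's parser.
import json

meta = {
    "camera_angle_x": 0.6911,  # horizontal field of view in radians (hypothetical value)
    "frames": [
        {
            "file_path": "images/image0000",  # image path relative to the scene folder
            "transform_matrix": [             # 4x4 camera-to-world matrix
                [1.0, 0.0, 0.0, 0.0],
                [0.0, 1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0],
                [0.0, 0.0, 0.0, 1.0],
            ],
        },
    ],
}

with open("transforms.json", "w") as f:
    json.dump(meta, f, indent=2)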

Pretrained Model

In our reconstruction pipeline, we use SAM to obtain regions of the object. By default, we use the public ViT-H model for SAM. You can download the model from here and put it under the ./submodules/segment-anything/sam_ckpt/ directory.
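
A small download sketch (the checkpoint URL is an assumption taken from the public segment-anything repository's model zoo; verify it against the link above):

# Download the public SAM ViT-H checkpoint into the expected directory.
# The URL is assumed from the segment-anything repository.
import os
import urllib.request

CKPT_URL = "https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth"
CKPT_DIR = "./submodules/segment-anything/sam_ckpt"

os.makedirs(CKPT_DIR, exist_ok=True)
urllib.request.urlretrieve(CKPT_URL, os.path.join(CKPT_DIR, "sam_vit_h_4b8939.pth"))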

OpenAI API Key

Our method uses a VLM to predict the physical properties of the object, and during inference we call the OpenAI API. You therefore need an OpenAI API key: obtain one from OpenAI and set a variable named OPENAI_API_KEY in the my_api_key.py file.

echo "OPENAI_API_KEY = '<yourkey>'" >> ./my_api_key.py

Pipeline

Our pipeline is shown below; each step is a separate Python script. Related arguments can be found in settings.py.

(Pipeline overview figure)

Shape Aware 3DGS Reconstruction

First, we use 3DGS to reconstruct the object from multi-view images. During training, we use a Geometry-Aware Regularization Loss and a Region-Aware Feature Contrastive Loss to improve the quality of the reconstruction; a schematic sketch of the contrastive term follows the command below.

python gs_reconstruction.py
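
To illustrate the idea behind the Region-Aware Feature Contrastive Loss, here is a schematic InfoNCE-style sketch (an assumption-laden illustration, not the repository's implementation): features of Gaussians in the same SAM region are pulled together, those in different regions pushed apart.

import torch
import torch.nn.functional as F

def region_contrastive_loss(feats, region_ids, temperature=0.1):
    # feats: (N, D) per-Gaussian features; region_ids: (N,) SAM region labels.
    feats = F.normalize(feats, dim=-1)                  # unit-length features
    sim = feats @ feats.T / temperature                 # pairwise cosine similarity
    eye = torch.eye(len(feats), dtype=torch.bool, device=feats.device)
    pos = (region_ids[:, None] == region_ids[None, :]) & ~eye  # same-region pairs
    logp = F.log_softmax(sim.masked_fill(eye, float("-inf")), dim=1)
    return -logp[pos].mean()                            # averaged over positive pairs

feats = torch.randn(6, 16)
region_ids = torch.tensor([0, 0, 1, 1, 2, 2])
print(region_contrastive_loss(feats, region_ids).item())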

VLM Based Physical Property Prediction

We use a VLM to predict the physical properties of the object. You can specify the property name using --property_name and the type of inference using --proposal_type. To query a VLM directly, set the inference type to gpt4o or gpt4v.

python material_proposal.py --property_name <property-name> --mats_save_name info --proposal_type <gpt4o|gpt4v>
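
For example (density here is a hypothetical property name, chosen because NeRF2Physics-style pipelines predict mass density; see settings.py for the names actually supported):

python material_proposal.py --property_name density --mats_save_name info --proposal_type gpt4o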

We also provide text-reasoning based inference, enabled by setting the inference type to text-reasoning. This mode runs in two stages: a VLM first generates a caption of the object, and an LLM then predicts the physical properties from that caption. You therefore need to specify the name of the saved caption using --caption_load_name.

python material_proposal.py --property_name <property-name> --caption_load_name info --mats_save_name info --proposal_type text-reasoning 

Feature Based Property Propagation

This step extracts the CLIP features for the source points, which are later used for property propagation.

python clip_feature_fusion.py
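
For background, a minimal sketch of the kind of CLIP image encoding this step relies on (assuming the openai/CLIP package; the repository's actual fusion logic lives in clip_feature_fusion.py):

# Encode one view with CLIP; the fusion step aggregates such features onto points.
import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

image = preprocess(Image.open("data/abo_500/scenes/scene0000/images/image0000.jpg"))
with torch.no_grad():
    feat = model.encode_image(image.unsqueeze(0).to(device))  # (1, 512) embedding
    feat = feat / feat.norm(dim=-1, keepdim=True)             # unit-normalize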

Then we can predict the physical properties of the object. The following command specifies the prediction mode as grid, which produces a dense prediction of the specified property.

python predict_property.py --mats_load_name info --property_name <property-name> --prediction_mode grid

If you want to predict object-level physical properties, specify the prediction mode as integral, and choose the method for volume estimation with --volume_method; a toy illustration of the aggregation follows the command below.

python predict_property.py --mats_load_name info --property_name <property-name> --prediction_mode integral --volume_method gaussian --preds_save_name mass 
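
A toy illustration of integral-mode aggregation (a hedged sketch of mass = density × volume with hypothetical numbers, not the repository's exact estimator):

# Hypothetical per-point densities (kg/m^3) and an estimated object volume (m^3).
densities = [950.0, 1020.0, 980.0]
volume = 0.012
mass = sum(densities) / len(densities) * volume
print(f"estimated mass: {mass:.1f} kg")  # mean density 983.3 * 0.012 ~= 11.8 kg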

Other Utilities

We also provide some other utilities for evaluation and visualization.

Evaluation

This script evaluates the predictions. You can specify the paths to the predictions and the ground truth using --preds_json_path and --gts_json_path.

python evaluation.py --preds_json_path <path-to-predictions> --gts_json_path <path-to-ground-truth>
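
As an illustration of the kind of metrics such an evaluation reports (a sketch in the style of NeRF2Physics's ADE and MnRE; the actual metrics and JSON layout are defined in evaluation.py):

# Assumed JSON layout: {"scene0000": 4.2, ...}, mapping scene name -> value.
import json

with open("preds.json") as f:
    preds = json.load(f)
with open("gts.json") as f:
    gts = json.load(f)

common = [k for k in gts if k in preds]
ade = sum(abs(preds[k] - gts[k]) for k in common) / len(common)  # absolute difference error
mnre = sum(min(preds[k], gts[k]) / max(preds[k], gts[k]) for k in common) / len(common)  # min ratio error
print(f"ADE: {ade:.4f}  MnRE: {mnre:.4f}")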

Visualization

You can visualize the reconstruction results with the following command.

python visualization.py --scene_name <scene-name> --property_name <property-name> --value_low <value-low> --value_high <value-high>
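
For example (all values hypothetical; value_low and value_high presumably bound the color map for the chosen property):

python visualization.py --scene_name scene0000 --property_name density --value_low 0 --value_high 2000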

Video Rendering

You can use the following command to render a 360-degree video of the reconstructed object.

python video_render.py -m <path-to-model> --render_path --export_traj

Acknowledgements

Some parts of the code are borrowed from NeRF2Physics, SegAnyGaussians, PGSR and 2DGS. We thank the authors for their great work.
