A powerful 3D mesh editing framework using VecSet representation and attention-based mechanisms for precise, localized, and image-guided mesh editing. (arXiv Paper)
- Linux operating system
- NVIDIA GPU with CUDA support (CUDA 12.4)
- Python 3.11
- Conda environment manager
The installation order is critical due to CUDA compatibility requirements. Follow these steps carefully:
```bash
conda create -n vecset_edit python=3.11 -y
conda activate vecset_edit
```

PyTorch must be installed first - all other CUDA-dependent packages rely on this version.
```bash
pip install torch==2.6.0+cu124 torchvision==0.21.0+cu124 torchaudio==2.6.0+cu124 \
    --index-url https://download.pytorch.org/whl/cu124
```

nvdiffrast cannot be installed via plain pip - it requires a pre-built wheel from GitHub or a source build.
```bash
# Download the wheel file matching your CUDA version (12.4) and Python version (3.11)
# Visit: https://github.com/NVlabs/nvdiffrast/releases
pip install setuptools wheel ninja
pip install git+https://github.com/NVlabs/nvdiffrast.git --no-build-isolation
```

torch-cluster requires matching PyTorch and CUDA versions:
```bash
pip install torch-cluster -f https://pytorch-geometric.com/whl/torch-2.6.0+cu124.html
```

diso must be installed from its GitHub repository:

```bash
pip install git+https://github.com/SarahWeiii/diso.git --no-build-isolation
```

Install all remaining core dependencies at once:
```bash
pip install -r other_requirements.txt
```

Blender Python API (for advanced rendering):

```bash
pip install bpy==4.0.0 mathutils==3.3.0
```

Additional image enhancement packages:

```bash
pip install gfpgan realesrgan facexlib basicsr
```

All CUDA-dependent packages above use the cu124 (CUDA 12.4) build. Mixing versions will cause runtime errors.
For nvdiffrast, and potentially torch-cluster, you may need to manually download wheel files if automated downloads fail.
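Because every CUDA-dependent package must share the cu124 build tag, a quick consistency check can catch mismatches before they surface as runtime errors. A minimal sketch, assuming the packages use standard local-version tags like `2.6.0+cu124` (the helper names here are illustrative, not part of the project):

```python
from importlib.metadata import version, PackageNotFoundError

def cuda_tags(packages):
    """Collect the '+cuXXX' local-version tag of each installed package."""
    tags = {}
    for name in packages:
        try:
            v = version(name)
        except PackageNotFoundError:
            continue  # package not installed yet - skip it
        if "+" in v:
            tags[name] = v.split("+", 1)[1]
    return tags

def is_consistent(tags, expected="cu124"):
    """True when every collected tag matches the expected CUDA build."""
    return all(tag == expected for tag in tags.values())

# e.g. is_consistent(cuda_tags(["torch", "torchvision", "torchaudio"]))
```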
The main script for performing 3D mesh editing using VecSet representation and attention mechanisms.
```bash
python vecset_edit.py \
    --input_dir example/chicken_racer \
    --output_dir output \
    --mesh_file model.glb \
    --render_image 2d_render.png \
    --edit_image 2d_edit.png \
    --mask_image 2d_mask.png
```

Input/Output:
- `--input_dir`: Directory containing input mesh and images (default: `example/chicken_racer`)
- `--output_dir`: Output directory for results (default: `output`)
- `--mesh_file`: Mesh filename in the input directory (default: `model.glb`)
- `--render_image`: Original rendered image filename (default: `2d_render.png`)
- `--edit_image`: Edited 2D image filename (default: `2d_edit.png`)
- `--mask_image`: Binary mask image filename (default: `2d_mask.png`)
Camera Parameters:
- `--azimuth`: Azimuth angle in radians (default: `0.0`)
- `--elevation`: Elevation angle in radians (default: `0.0`)
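Note that both angles are expected in radians, not degrees. A minimal sketch of turning them into a unit viewing direction, assuming a y-up convention where azimuth 0 / elevation 0 looks along +z (the script's actual camera convention may differ):

```python
import math

def camera_direction(azimuth, elevation):
    """Unit view direction from azimuth/elevation in radians (y-up assumed)."""
    x = math.cos(elevation) * math.sin(azimuth)
    y = math.sin(elevation)
    z = math.cos(elevation) * math.cos(azimuth)
    return (x, y, z)

# Degrees must be converted first, e.g. a 45-degree azimuth:
direction = camera_direction(math.radians(45.0), 0.0)
```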
Processing Parameters:
- `--scale`: Scale factor for point cloud (default: `2.0`)
- `--attentive_2d`: Number of attentive 2D tokens (default: `8`)
- `--cut_off_p`: Cut-off percentage for attention (default: `0.5`)
- `--topk_percent_2d`: Top k percent of 2D attentive tokens (default: `0.2`)
- `--threshold_percent_2d`: Threshold percent for 2D attention (default: `0.1`)
- `--step_pruning`: Pruning step interval (default: `5`)
- `--edit_strength`: Editing strength (default: `0.7`)
- `--guidance_scale`: Guidance scale for generation (default: `7.5`)
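To give a feel for how the top-k and threshold parameters might interact, here is an illustrative sketch of selecting attentive tokens from a list of attention scores. The function name and the exact selection rule are assumptions for illustration, not the script's actual implementation: keep the top `topk_percent_2d` fraction of tokens by score, then drop any that fall below `threshold_percent_2d` of the maximum score.

```python
def select_tokens(scores, topk_percent=0.2, threshold_percent=0.1):
    """Indices of tokens that are both in the top-k fraction by score
    and above a fraction of the maximum score. Illustrative sketch only."""
    if not scores:
        return []
    k = max(1, int(len(scores) * topk_percent))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = set(ranked[:k])                      # top-k fraction of tokens
    cutoff = max(scores) * threshold_percent   # absolute score floor
    return sorted(i for i in top if scores[i] >= cutoff)
```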
```bash
python vecset_edit.py \
    --input_dir example/chicken_racer \
    --output_dir output \
    --edit_strength 0.8 \
    --guidance_scale 7.5 \
    --scale 2.0
```

The texture-baking script handles texture repaint and baking for 3D meshes while preserving the original texture quality.
```bash
python preserving_texture_baking.py \
    --input_mesh output/edited_mesh.glb \
    --ref_mesh output/source_model.glb \
    --texture_image output/2d_edit.png \
    --output_dir output/
```

Required:
- `--input_mesh`: Path to the edited mesh file (default: `./output/edited_mesh.glb`)
- `--ref_mesh`: Path to the reference/source mesh file (default: `./output/source_model.glb`)
- `--texture_image`: Path to the texture image for repaint (default: `./output/2d_edit.png`)
- `--output_dir`: Output directory for the textured mesh (default: `./output/`)
Optional:
- `--seed`: Random seed for reproducibility (default: `99999`)
- `--render_method`: Rendering method, `nvdiffrast` or `bpy` (default: `nvdiffrast`)
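The documented flags map onto a standard argparse interface; a sketch of that setup with the defaults listed above (this mirrors the documented options, not necessarily the script's exact parser code):

```python
import argparse

def build_parser():
    """Argument parser mirroring the documented texture-baking flags."""
    p = argparse.ArgumentParser(description="Texture repaint and baking")
    p.add_argument("--input_mesh", default="./output/edited_mesh.glb")
    p.add_argument("--ref_mesh", default="./output/source_model.glb")
    p.add_argument("--texture_image", default="./output/2d_edit.png")
    p.add_argument("--output_dir", default="./output/")
    p.add_argument("--seed", type=int, default=99999)
    p.add_argument("--render_method", choices=["nvdiffrast", "bpy"],
                   default="nvdiffrast")
    return p

args = build_parser().parse_args([])  # no CLI args: all defaults
```

Using `choices` makes the parser reject any `--render_method` other than the two supported backends.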
```bash
python preserving_texture_baking.py \
    --input_mesh output/edited_mesh.glb \
    --ref_mesh output/source_model.glb \
    --texture_image output/2d_edit.png \
    --output_dir output/ \
    --seed 42 \
    --render_method nvdiffrast
```

The script:
- Loads the edited mesh and the reference mesh
- Performs texture repaint using multi-view generation
- Applies inpainting and upscaling to preserve texture quality
- Outputs the final textured mesh as `mv_repaint_model.glb`
https://github.com/BlueDyee/VecSetEdit/assets/demo_video.mp4
Example: 3D mesh editing with image-guided attention mechanisms
For the best viewing experience, you can also download the demo video directly.
The following pretrained weights are required and should be placed in the `checkpoints/` directory:

- `big-lama.pt` - Large-scale inpainting model
  - Download from: LaMa GitHub
  - Place in: `checkpoints/big-lama.pt`
- `RealESRGAN_x2plus.pth` - Image super-resolution model
  - Download from: RealESRGAN GitHub
  - Place in: `checkpoints/RealESRGAN_x2plus.pth`
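A small helper can confirm both weight files are in place before launching the pipeline, which otherwise may fail with a less obvious error. A convenience sketch (not part of the project's own code):

```python
from pathlib import Path

REQUIRED_CHECKPOINTS = ["big-lama.pt", "RealESRGAN_x2plus.pth"]

def missing_checkpoints(ckpt_dir="checkpoints"):
    """Return the required weight files not yet present in ckpt_dir."""
    root = Path(ckpt_dir)
    return [name for name in REQUIRED_CHECKPOINTS
            if not (root / name).is_file()]

missing = missing_checkpoints()
if missing:
    print("Missing weights:", ", ".join(missing))
```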
This project integrates and builds upon the following open-source projects:
- TripoSG: Used for 3D geometry generation and reconstruction
- MV-Adapter: Provides multi-view image generation capabilities
- Hunyuan3D-2.0: Utilized for texture painting and 3D generation features
We are grateful to the authors and contributors of these projects for making their work available to the research community.
Note: This is a research project. The code is provided as-is for academic and research purposes.
