
OVA-Fields: Weakly Supervised Open-Vocabulary Affordance Fields for Robot Operational Part Detection

Heng Su · Mengying Xie · Nieqing Cao · Ding Yan · Beichen Shao
Xianlei Long · Fuqiang Gu · Chao Chen*

ICCV 2025

OVA-Fields enables robots to detect and interact with functional parts in real 3D scenes by mapping natural language commands to precise affordance locations.


Table of Contents
  1. Installation
  2. Data Preparation
  3. Checkpoints
  4. Run
  5. Acknowledgement
  6. Citation

News 🚩

  • [6/26/2025] Our paper is accepted to ICCV 2025.
  • [12/12/2024] Code is released.

Installation

conda create -n ova python=3.10
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia

cd OVA-Fields
pip install -r requirements.txt

cd gridencoder
conda install nvidia/label/cuda-12.1.0::cuda-nvcc

You can run `which nvcc` to see where nvcc was installed, e.g. /home/user/anaconda3/envs/ova/bin/nvcc. Then point CUDA_HOME at that environment:

export CUDA_HOME=/home/user/anaconda3/envs/ova
python setup.py install
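If your conda environment lives somewhere other than the path above, CUDA_HOME can be derived from the nvcc location instead of hardcoded — a small sketch (nvcc normally sits in $CUDA_HOME/bin, so we strip two path components):

```shell
# Locate nvcc and derive CUDA_HOME from it.
NVCC_PATH="$(which nvcc)"                              # e.g. .../envs/ova/bin/nvcc
export CUDA_HOME="$(dirname "$(dirname "$NVCC_PATH")")"
echo "CUDA_HOME=$CUDA_HOME"
```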

Data Preparation

Our data collection is conducted using the Record3D App on iPhone/iPad Pro equipped with a LiDAR module. This app efficiently captures RGB-D video frames while recording associated data, including camera intrinsics, extrinsics, and pose information. The collected data can be exported in .r3d format, which our code can directly read and process.
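To our understanding, a Record3D .r3d export is a standard zip archive bundling per-frame RGB-D data and a metadata JSON, so you can peek inside one with the standard library before feeding it to the pipeline. A minimal sketch — the entry names inside the archive vary by app version, so treat them as illustrative only:

```python
import zipfile

def peek_r3d(path):
    """List the entries of a Record3D .r3d export.

    Assumes the .r3d file is a standard zip archive; the exact
    entry names (metadata JSON, per-frame depth/RGB files) depend
    on the Record3D version that produced it.
    """
    with zipfile.ZipFile(path) as zf:
        return zf.namelist()

# Example:
# for name in peek_r3d("scene.r3d")[:10]:
#     print(name)
```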

You can use your own device to collect data for custom scenes. During the data collection process, please keep the camera stable and ensure the entire object is captured. This will help improve the data quality and enhance the model's performance.

You can find more information at: Record3D

Additionally, we provide the dataset used in our paper, which can be downloaded from Google Drive. You can run the demo.ipynb file to quickly test the performance of our model.

Checkpoints

Our pretrained model can be downloaded from Google Drive.

Run

After setting up the required environment and obtaining the necessary data, you can train our model by following these steps:

  1. Modify the DATA_PATH in build_data.py to point to your data directory.

  2. Modify the SAVE_DIRECTORY in train.py to specify the path where you want to save the model.

  3. Run the following command to build the training data:

    python build_data.py
  4. Run the following command to start training the model:

    python train.py
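Before building, it can save a failed run to confirm that the directory DATA_PATH points at actually contains captures. A minimal, hypothetical sanity check (the .r3d glob pattern assumes Record3D exports as described above; build_data.py's own loading logic may differ):

```python
from pathlib import Path

def check_data_path(data_path):
    """Return the .r3d captures under data_path, or raise if none exist.

    Hypothetical helper; adjust the glob pattern to match however
    your data directory is actually laid out.
    """
    root = Path(data_path)
    if not root.is_dir():
        raise FileNotFoundError(f"DATA_PATH does not exist: {root}")
    captures = sorted(root.glob("*.r3d"))
    if not captures:
        raise FileNotFoundError(f"No .r3d files found in {root}")
    return captures
```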

You can also easily run our code and obtain relevant visualizations by executing demo.ipynb.

Acknowledgement

We would like to extend our gratitude to CLIP-Fields, whose code has greatly supported our work.

Citation

If you find our code or paper useful, please cite

Su, H., Xie, M., Cao, N., Ding, Y., Shao, B., Long, X., ... & Chen, C. (2025). OVA-Fields: Weakly Supervised Open-Vocabulary Affordance Fields for Robot Operational Part Detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 6385-6395).
