A `requirements.txt` and a Dockerfile are provided:
pip install -r requirements.txt
or
bash docker_build.sh
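If you prefer the pip route in an isolated environment, a standard virtual environment works (this is plain Python tooling, nothing repo-specific):

```bash
# Optional: install into a fresh virtual environment first.
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```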
For RealEstate10K we use the same data format as pixelSplat. Please follow the data formatting instructions provided there. You can also download a preprocessed dataset here. The dataset can be left in the zip file and loaded directly from it.
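Since the loader reads directly from the archive, a quick way to sanity-check the download is to list its contents (the path is illustrative; the layout should match pixelSplat's format):

```bash
# List the first entries of the archive without extracting it.
unzip -l /path/to/re10k.zip | head -n 20
```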
A subset of our synthetic multimodal dataset can be found here.
Training is run as follows.
To train RGB on RealEstate10K:
bash ./nvs.sh --ray_encoding plucker --pos_enc rore --gpus "0,1,..." --dataset_path /path/to/re10k.zip
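For a long run it can help to capture the output; a minimal sketch, assuming a two-GPU machine (the GPU ids and log path are placeholders):

```bash
# Illustrative: same command as above, with output logged to a file.
mkdir -p logs
bash ./nvs.sh --ray_encoding plucker --pos_enc rore --gpus "0,1" \
  --dataset_path /path/to/re10k.zip 2>&1 | tee logs/re10k_rore.log
```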
To train RGB-thermal on our synthetic RGB-thermal dataset:
bash ./nvs.sh --ray_encoding plucker --pos_enc rore --gpus "0,1,..." --dataset_path /path/to/MultimodalBlender --dataset_type multimodal
Alternative embedding methods can be selected via the `--ray_encoding` and `--pos_enc` flags.
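For example, to swap the positional encoding (here `<method>` is a placeholder; check `nvs.sh` for the values it actually accepts):

```bash
# Hypothetical: <method> stands in for whichever alternative encodings
# nvs.sh supports; see the script for valid values.
bash ./nvs.sh --ray_encoding plucker --pos_enc <method> --gpus "0,1" \
  --dataset_path /path/to/re10k.zip
```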
Validation at different zoom-in (focal length) factors can be run via:
bash ./nvs.sh --ray_encoding plucker --pos_enc rore --gpus "0,1,..." --test-zoom-in "2" --dataset_path "/path/to/re10k.zip"
And with different synthetic distortions on RealEstate10K:
bash ./nvs.sh --ray_encoding plucker --pos_enc rore --gpus "0,1,..." --test-distortion "2" --dataset_path "/path/to/re10k.zip"
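Both flags follow the same pattern, so several factors can be swept in one go; a minimal sketch (the factor values are illustrative):

```bash
# Illustrative sweep over zoom-in factors; substitute --test-distortion
# to sweep distortion levels instead.
for f in 1 2 4; do
  bash ./nvs.sh --ray_encoding plucker --pos_enc rore --gpus "0,1" \
    --test-zoom-in "$f" --dataset_path "/path/to/re10k.zip"
done
```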
Testing only on the multimodal dataset:
bash ./nvs.sh --ray_encoding plucker --pos_enc rore --gpus "0,1,..." --dataset_path "/path/to/MultimodalBlender" --dataset_type multimodal --test-only
NOTE: For simplicity, the network architecture used here is the same for RGB and RGB-thermal. In the paper, the multimodal network differed slightly from the RGB-only network.
| Download | # Params | PSNR | SSIM | LPIPS |
|---|---|---|---|---|
| RE10K_12layers_768dim.pt | 48M | 28.79 | 0.8824 | 0.0483 |
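To evaluate the released checkpoint, a hypothetical invocation (`--checkpoint` is an assumed flag name; check `nvs.sh` for the actual argument):

```bash
# Hypothetical: --checkpoint is an assumed flag; see nvs.sh for the real one.
bash ./nvs.sh --ray_encoding plucker --pos_enc rore --gpus "0" --test-only \
  --checkpoint /path/to/RE10K_12layers_768dim.pt \
  --dataset_path "/path/to/re10k.zip"
```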
This repository is built on top of the LVSM and PRoPE repositories; we thank their authors for making their work open source.
If you find this work useful, please consider citing:
@inproceedings{rore2026,
  title={RoRE: Rotary Ray Embedding for Generalised Multi-Modal Scene Understanding},
  author={Ryan Griffiths and Donald G. Dansereau},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=BR2ItBcqOo}
}