Shanlin Sun1 ★
Yifan Wang2 ★
Hanwen Zhang3 ★
Yifeng Xiong1
Qin Ren2
Ruogu Fang4
Xiaohui Xie1
Chenyu You2
1 University of California, Irvine
2 Stony Brook University
3 Huazhong University of Science and Technology
4 University of Florida
★ Equal Contribution
While multi-step diffusion models have advanced both forward and inverse rendering, existing approaches often treat these problems independently, leading to cycle inconsistency and slow inference speed. In this work, we present Ouroboros, a framework composed of two single-step diffusion models that handle forward and inverse rendering with mutual reinforcement. Our approach extends intrinsic decomposition to both indoor and outdoor scenes and introduces a cycle consistency mechanism that ensures coherence between forward and inverse rendering outputs. Experimental results demonstrate state-of-the-art performance across diverse scenes while achieving substantially faster inference speed compared to other diffusion-based methods. We also demonstrate that Ouroboros can transfer to video decomposition in a training-free manner, reducing temporal inconsistency in video sequences while maintaining high-quality per-frame inverse rendering.
Figure: Single-step Diffusion Models for Forward and Inverse Rendering in Cycle Consistency.
Top left: Ouroboros decomposes input images into intrinsic maps (albedo, normal, roughness, metallicity, and irradiance). Given these generated intrinsic maps and textual prompts, our neural forward rendering model synthesizes images closely matching the originals.
Top right: We extend an end-to-end finetuning technique to diffusion-based neural rendering, outperforming the state-of-the-art RGB↔X in both speed and accuracy. The radar plot shows numerical comparisons on the InteriorVerse dataset.
Bottom: Our method achieves temporally consistent video inverse rendering without any finetuning on video data.
- Release inference code and checkpoints.
- Release training code.
- Release the training dataset.
- We generate masks for windows, mirrors, and other highly specular regions for our datasets so these areas do not bias training; the same masks are applied during evaluation and will ship with the data release.
- We are rebalancing checkpoints to better trade off cycle consistency and rendering quality across datasets; more checkpoints are coming soon.
- Python 3.12
- CUDA-compatible GPU
- Conda package manager
- FFmpeg (for optional video export)
- Clone the repository:

```shell
git clone https://github.com/Y-Research-SBU/Ouroboros.git
cd Ouroboros
```

- Create and activate the conda environment:

```shell
conda env create -f environment.yml
conda activate ouroboros
```

Estimate material properties from RGB images:
```shell
python rgb2x/inference.py \
    --checkpoint="path/to/checkpoint" \
    --modality "normals" "albedo" "irradiance" "roughness" "metallicity" \
    --condition "rgb" \
    --noise "gaussian" \
    --input_rgb_path="path/to/input.jpg" \
    --output_dir="path/to/output"
```

Generate RGB images from material properties:
```shell
python x2rgb/inference.py \
    --checkpoint="path/to/checkpoint" \
    --modality "rgb" \
    --condition "normals" "albedo" "irradiance" "roughness" "metallicity" \
    --noise "gaussian" \
    --prompt="your text prompt" \
    --albedo_path="path/to/albedo.png" \
    --normal_path="path/to/normal.png" \
    --roughness_path="path/to/roughness.png" \
    --metallic_path="path/to/metallic.png" \
    --irradiance_path="path/to/irradiance.png" \
    --output_dir="path/to/output"
```

Generate temporally consistent material property videos from an RGB frame sequence:
```shell
python video_inference.py \
    --checkpoint_path "path/to/checkpoint" \
    --input_dir "" \
    --save_dir "" \
    --num_frames 198 \
    --window_size 32 \
    --stride 16 \
    --required_aovs albedo \
    --device cuda \
    --half_precision
```

Notes:
- `input_dir` must contain sequentially numbered frames starting at `0.jpg`.
- The environment configuration for video inference should follow the YAML file in the `video_infer` folder.
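Because a naming gap in `input_dir` will silently break the expected `0.jpg`, `1.jpg`, ... ordering, a small pre-flight check can catch it before a long run. The helper below is an illustration, not part of the repository:

```python
import os

def check_frame_dir(input_dir: str) -> int:
    """Return the frame count if files are exactly 0.jpg .. N-1.jpg, else raise."""
    names = [f for f in os.listdir(input_dir) if f.endswith(".jpg")]
    expected = {f"{i}.jpg" for i in range(len(names))}
    if set(names) != expected:
        missing = sorted(expected - set(names))
        raise ValueError(f"frames are not numbered sequentially from 0.jpg; missing: {missing}")
    return len(names)
```

Run it before launching `video_inference.py`; the returned count is presumably what `--num_frames` should be set to.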
Common arguments:

- `--checkpoint`: Path to the model checkpoint or a Hugging Face model name
- `--modality`: List of modalities to generate/estimate
- `--condition`: List of conditioning modalities
- `--noise`: Noise type (`gaussian`, `pyramid`, or `zeros`)
- `--denoise_steps`: Number of denoising steps
Inverse rendering (rgb2x):

- `--input_rgb_path`: Path to the input RGB image
- `--output_dir`: Output directory for material properties
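To decompose a whole folder of images, the single-image command can be wrapped in a loop. The sketch below only prints the commands (a dry run); the checkpoint path, directory names, and modality choice are placeholders, not repository defaults:

```shell
# Build one rgb2x inference command per input image (printed, not executed).
make_rgb2x_cmd() {
  printf 'python rgb2x/inference.py --checkpoint="%s" --modality "albedo" "normals" --condition "rgb" --noise "gaussian" --input_rgb_path="%s" --output_dir="%s"\n' \
    "$1" "$2" "$3"
}

for img in inputs/*.jpg; do
  [ -e "$img" ] || continue   # glob matched nothing; skip
  make_rgb2x_cmd "checkpoints/rgb2x" "$img" "outputs/$(basename "$img" .jpg)"
done
```

Once the paths are correct, pipe the output through `sh` (or call the script directly instead of `printf`).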
Forward rendering (x2rgb):

- `--prompt`: Text prompt for generation
- `--albedo_path`: Path to the albedo image
- `--normal_path`: Path to the normal map
- `--roughness_path`: Path to the roughness map
- `--metallic_path`: Path to the metallic map
- `--irradiance_path`: Path to the irradiance map
- `--output_dir`: Output directory for the generated RGB image
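A quick way to probe cycle consistency on your own data is to compare the original image with the x2rgb re-rendering of its predicted intrinsics. The snippet below is a generic image-space check, assuming only that both files are RGB images (it requires numpy and Pillow; file paths are up to you):

```python
import numpy as np
from PIL import Image

def psnr(a: np.ndarray, b: np.ndarray) -> float:
    """Peak signal-to-noise ratio between two uint8 RGB arrays of equal shape."""
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(255.0 ** 2 / mse)

def cycle_psnr(original_path: str, rerendered_path: str) -> float:
    """PSNR between the input image and its re-rendered counterpart."""
    orig = Image.open(original_path).convert("RGB")
    rend = Image.open(rerendered_path).convert("RGB").resize(orig.size)
    return psnr(np.asarray(orig), np.asarray(rend))
```

Higher PSNR between input and re-rendering indicates a tighter forward/inverse cycle.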
For questions, feedback, or collaboration opportunities, please contact:
Email: shanlins@uci.edu, chenyu.you@stonybrook.edu
If you use this code in your research, please cite:

```bibtex
@article{sun2025ouroboros,
  title={Ouroboros: Single-step Diffusion Models for Cycle-consistent Forward and Inverse Rendering},
  author={Sun, Shanlin and Wang, Yifan and Zhang, Hanwen and Xiong, Yifeng and Ren, Qin and Fang, Ruogu and Xie, Xiaohui and You, Chenyu},
  journal={arXiv preprint arXiv:2508.14461},
  year={2025}
}
```

This project is licensed under the MIT License; see the LICENSE file for details.

