Ouroboros: Single-step Diffusion Models for Cycle-consistent Forward and Inverse Rendering

Shanlin Sun1 ★  Yifan Wang2 ★  Hanwen Zhang3 ★  Yifeng Xiong1  Qin Ren2 
Ruogu Fang4  Xiaohui Xie1  Chenyu You2

1 University of California, Irvine    2 Stony Brook University    3 Huazhong University of Science and Technology   
4 University of Florida    ★ Equal Contribution

Paper · Project Website · HuggingFace Model

Abstract

While multi-step diffusion models have advanced both forward and inverse rendering, existing approaches often treat these problems independently, leading to cycle inconsistency and slow inference speed. In this work, we present Ouroboros, a framework composed of two single-step diffusion models that handle forward and inverse rendering with mutual reinforcement. Our approach extends intrinsic decomposition to both indoor and outdoor scenes and introduces a cycle consistency mechanism that ensures coherence between forward and inverse rendering outputs. Experimental results demonstrate state-of-the-art performance across diverse scenes while achieving substantially faster inference speed compared to other diffusion-based methods. We also demonstrate that Ouroboros can transfer to video decomposition in a training-free manner, reducing temporal inconsistency in video sequences while maintaining high-quality per-frame inverse rendering.

Figure: Single-step Diffusion Models for Forward and Inverse Rendering in Cycle Consistency.
Top left: Ouroboros decomposes input images into intrinsic maps (albedo, normal, roughness, metallicity, and irradiance). Given these generated intrinsic maps and textual prompts, our neural forward rendering model synthesizes images closely matching the originals.
Top right: We extend an end-to-end finetuning technique to diffusion-based neural rendering, outperforming the state-of-the-art RGB↔X in both speed and accuracy. The radar plot illustrates numerical comparisons on the InteriorVerse dataset.
Bottom: Our method achieves temporally consistent video inverse rendering without any finetuning on video data.

TODO List

  • Release inference codes and checkpoints.
  • Release training codes.
  • Release training dataset.

Notes

  • We generate masks for windows, mirrors, and other highly specular regions for our datasets so these areas do not bias training; the same masks are applied during evaluation and will ship with the data release.
  • We are rebalancing checkpoints to better trade off cycle consistency and rendering quality across datasets; more checkpoints are coming soon.

Installation

Prerequisites

  • Python 3.12
  • CUDA-compatible GPU
  • Conda package manager
  • FFmpeg (for optional video export)

Setup

  1. Clone the repository:
git clone https://github.com/Y-Research-SBU/Ouroboros.git
cd Ouroboros
  2. Create and activate the conda environment:
conda env create -f environment.yml
conda activate ouroboros

Usage

RGB to Material Properties (rgb2x)

Estimate material properties from RGB images:

python rgb2x/inference.py \
    --checkpoint="path/to/checkpoint" \
    --modality "normals" "albedo" "irradiance" "roughness" "metallicity" \
    --condition "rgb" \
    --noise "gaussian" \
    --input_rgb_path="path/to/input.jpg" \
    --output_dir="path/to/output"
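To decompose many images, the single-image CLI above can be wrapped in a small batch driver. The sketch below is a hypothetical helper (not part of the repo): `build_rgb2x_cmd` and `decompose_directory` are illustrative names, and the flag spelling follows the example command above.

```python
import subprocess
from pathlib import Path

def build_rgb2x_cmd(image_path, checkpoint, output_dir,
                    modalities=("normals", "albedo", "irradiance",
                                "roughness", "metallicity")):
    """Assemble the rgb2x inference command for one image,
    mirroring the flags in the usage example."""
    return [
        "python", "rgb2x/inference.py",
        f"--checkpoint={checkpoint}",
        "--modality", *modalities,
        "--condition", "rgb",
        "--noise", "gaussian",
        f"--input_rgb_path={image_path}",
        f"--output_dir={output_dir}",
    ]

def decompose_directory(input_dir, checkpoint, output_root):
    """Run rgb2x on every .jpg/.png in input_dir, writing each
    image's intrinsic maps to its own subfolder."""
    for img in sorted(Path(input_dir).glob("*.[jp][pn]g")):
        out = Path(output_root) / img.stem
        out.mkdir(parents=True, exist_ok=True)
        subprocess.run(build_rgb2x_cmd(str(img), checkpoint, str(out)),
                       check=True)
```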

Material Properties to RGB (x2rgb)

Generate RGB images from material properties:

python x2rgb/inference.py \
    --checkpoint="path/to/checkpoint" \
    --modality "rgb" \
    --condition "normals" "albedo" "irradiance" "roughness" "metallicity" \
    --noise "gaussian" \
    --prompt="your text prompt" \
    --albedo_path="path/to/albedo.png" \
    --normal_path="path/to/normal.png" \
    --roughness_path="path/to/roughness.png" \
    --metallic_path="path/to/metallic.png" \
    --irradiance_path="path/to/irradiance.png" \
    --output_dir="path/to/output"
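Running x2rgb on the intrinsic maps produced by rgb2x gives a quick cycle-consistency check: the re-rendered image should closely match the original input. A simple way to quantify this is PSNR between the two images; the helper below is an illustrative sketch operating on flat pixel lists (not a function shipped with the repo).

```python
import math

def psnr(original, rerendered, max_val=255.0):
    """Peak signal-to-noise ratio between two flattened pixel
    sequences; higher means the x2rgb re-rendering better matches
    the original input (i.e., better cycle consistency)."""
    assert len(original) == len(rerendered) and original
    mse = sum((a - b) ** 2
              for a, b in zip(original, rerendered)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)
```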

Video Inference (RGB sequence to material properties)

Generate temporally consistent material property videos from an RGB frame sequence:

python video_inference.py \
  --checkpoint_path "path/to/checkpoint" \
  --input_dir "path/to/input_frames" \
  --save_dir "path/to/output" \
  --num_frames 198 \
  --window_size 32 \
  --stride 16 \
  --required_aovs albedo \
  --device cuda \
  --half_precision

Notes:

  • input_dir must contain sequentially numbered frames named 0.jpg, 1.jpg, 2.jpg, and so on.
  • The environment configuration for video inference should follow the YAML file located in the video_infer folder.
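The --window_size and --stride flags suggest overlapping sliding windows over the frame sequence. The sketch below shows one common scheduling convention (the repo's actual scheduling may differ): windows advance by the stride, and the last window is shifted back so it ends exactly at the final frame.

```python
def window_starts(num_frames, window_size, stride):
    """Start indices of overlapping windows covering every frame.
    Illustrative convention only: the final window is clamped so
    it ends at the last frame rather than running past it."""
    if num_frames <= window_size:
        return [0]  # a single window covers the whole clip
    starts = list(range(0, num_frames - window_size + 1, stride))
    if starts[-1] + window_size < num_frames:
        starts.append(num_frames - window_size)  # clamp final window
    return starts
```

With the example values above (198 frames, window 32, stride 16), consecutive windows overlap by 16 frames, which is what allows blending predictions across window boundaries for temporal consistency.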

Parameters

Common Parameters

  • --checkpoint: Path to model checkpoint or Hugging Face model name
  • --modality: List of modalities to generate/estimate
  • --condition: List of conditioning modalities
  • --noise: Noise type (gaussian, pyramid, zeros)
  • --denoise_steps: Number of denoising steps
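The common flags above map naturally onto an argparse parser. The sketch below is an assumption about how such a parser might look, not the repo's actual code; in particular, the defaults (gaussian noise, one denoising step for a single-step model) are illustrative.

```python
import argparse

def build_common_parser():
    """Parser for the flags shared by rgb2x and x2rgb.
    Flag names follow the usage examples; defaults are assumptions."""
    p = argparse.ArgumentParser(description="Ouroboros inference (sketch)")
    p.add_argument("--checkpoint", required=True,
                   help="local checkpoint path or Hugging Face model name")
    p.add_argument("--modality", nargs="+", required=True,
                   help="modalities to generate/estimate")
    p.add_argument("--condition", nargs="+", required=True,
                   help="conditioning modalities")
    p.add_argument("--noise", choices=["gaussian", "pyramid", "zeros"],
                   default="gaussian")
    p.add_argument("--denoise_steps", type=int, default=1,
                   help="single-step by default (assumed)")
    return p
```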

RGB2X Specific

  • --input_rgb_path: Path to input RGB image
  • --output_dir: Output directory for material properties

X2RGB Specific

  • --prompt: Text prompt for generation
  • --albedo_path: Path to albedo image
  • --normal_path: Path to normal map
  • --roughness_path: Path to roughness map
  • --metallic_path: Path to metallic map
  • --irradiance_path: Path to irradiance map
  • --output_dir: Output directory for generated RGB

Contact

For questions, feedback, or collaboration opportunities, please contact:

Email: shanlins@uci.edu, chenyu.you@stonybrook.edu

Citation

If you use this code in your research, please cite:

@article{sun2025ouroboros,
  title={Ouroboros: Single-step Diffusion Models for Cycle-consistent Forward and Inverse Rendering},
  author={Sun, Shanlin and Wang, Yifan and Zhang, Hanwen and Xiong, Yifeng and Ren, Qin and Fang, Ruogu and Xie, Xiaohui and You, Chenyu},
  journal={arXiv preprint arXiv:2508.14461},
  year={2025}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.

About

Official Repository for Ouroboros - ICCV 2025
