Official implementation of "S²M²: Scalable Stereo Matching Model for Reliable Depth Estimation", ICCV 2025.

S²M²: Scalable Stereo Matching Model for Reliable Depth Estimation (ICCV 2025)

Junhong Min¹*, Youngpil Jeon¹, Jimin Kim¹, Minyong Choi¹

Paper · Project Page

🤗 Notice

This repository and its contents are not related to any official Samsung Electronics products.
All resources are provided solely for non-commercial research and education purposes.


✨ Key Features

🧩 Model

  • Scalable stereo matching architecture
  • State-of-the-art performance on ETH3D (1st), Middlebury V3 (1st), and Booster (1st)
  • Joint estimation of disparity, occlusion, and confidence
  • Supports negative disparity estimation
  • Optimal under the pinhole camera model with ideal stereo rectification (vertical disparity < 2px)
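
Since the model assumes a pinhole camera with ideally rectified inputs, predicted disparity converts to metric depth in the usual way. The snippet below is a generic sketch of that conversion; the focal length, baseline, and array names are illustrative, not part of this repository's API:

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m, eps=1e-6):
    """Convert a rectified-stereo disparity map (pixels) to depth (meters).

    depth = focal_px * baseline_m / disparity; non-positive disparities
    (e.g. masked or negative values) are left as invalid (0).
    """
    depth = np.zeros_like(disparity, dtype=np.float64)
    valid = disparity > eps
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth

# Illustrative values: 700 px focal length, 12 cm baseline.
disp = np.array([[70.0, 35.0], [0.0, -5.0]])
depth = disparity_to_depth(disp, focal_px=700.0, baseline_m=0.12)
```

Note that negative disparities (supported by the model) have no valid pinhole depth, so they are treated as invalid here.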

⚙️ Code

  • ✅ FP16 / FP32 inference
  • ✅ TorchScript/ONNX/TensorRT export
  • ❌ Training pipeline (not included)

Note: The publicly released model weights differ from the version evaluated on Middlebury and ETH3D but are identical to the one used for the Booster benchmark. This implementation replaces the dynamic attention-based refinement module with a U-Net for stable ONNX export. It also adds an M variant and extends the training data with transparent objects.


🚀 Performance

Detailed benchmark results and visualizations are available on the Project Page.

Inference speed (FPS) on an NVIDIA RTX 5090 (float16, refine_iter=3):

Model  CH   NTR  Inference      640x480  1216x1024  2432x2048
S      128  1    torch.compile  59.8     26.4       6.6
                 TensorRT       124.0    59.4       7.3
M      192  2    torch.compile  32.4     12.5       3.1
                 TensorRT       66.7     18.3       3.8
L      256  3    torch.compile  25.8     7.6        2.0
                 TensorRT       46.6     11.2       2.4
XL     384  3    torch.compile  16.3     4.3        1.1
                 TensorRT       26.6     6.4        1.4
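
Numbers like these can be reproduced with a simple wall-clock harness along the following lines. The workload below is a placeholder standing in for model inference; warm-up iterations matter because torch.compile and TensorRT do their compilation work on the first calls:

```python
import time

def measure_fps(fn, n_warmup=10, n_iters=50):
    """Average FPS of a zero-argument callable, excluding warm-up runs."""
    for _ in range(n_warmup):  # warm-up: triggers JIT / engine compilation
        fn()
    start = time.perf_counter()
    for _ in range(n_iters):
        fn()
    elapsed = time.perf_counter() - start
    return n_iters / elapsed

# Placeholder workload; replace with the model's forward pass.
fps = measure_fps(lambda: sum(range(10_000)))
```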

🔧 Installation

We recommend Python 3.10, PyTorch 2.9, CUDA 12.9, cuDNN 9.1.0, and TensorRT 10.13.3 with Anaconda.

git clone https://github.com/junhong-3dv/s2m2
cd s2m2
conda env create -n s2m2 -f environment.yml
conda activate s2m2
pip install -e .

If the environment setup via environment.yml doesn't work smoothly, you can manually install the main dependencies with:

pip install torch torchvision opencv-python open3d onnx onnxruntime-gpu onnxscript tensorrt-cu12==10.13.3.9 --extra-index-url https://pypi.nvidia.com/tensorrt-cu12-libs

That should cover most of the required packages for running the demo.
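
As a quick sanity check before running the demo, you can verify that the main dependencies are importable. The package list below mirrors the pip command above; onnxruntime and tensorrt are only needed for the export paths:

```python
import importlib.util

def missing_packages(names):
    """Return the subset of module names that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

required = ["torch", "torchvision", "cv2", "open3d", "onnx"]
optional = ["onnxruntime", "tensorrt"]
missing = missing_packages(required)
```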

🚀 Pre-trained Models and Inference

1. Download Pre-trained Models

Create a directory for weights and download the desired models from the links below.

mkdir -p weights/pretrain_weights
Model  Download  Model Size
S      Download  26.5M
M      Download  80.4M
L      Download  181M
XL     Download  406M

2. Run Basic Demo

To generate a result for a single input, run demo/visualize_2d_simple.py.

python ./demo/visualize_2d_simple.py --model_type XL --num_refine 3 
arg               default  type        help
--model_type      'XL'     str         select model type: [S, M, L, XL]
--num_refine      3        int         number of local iterative refinement steps
--torch_compile   False    store_true  apply torch.compile
--allow_negative  False    store_true  allow negative disparity
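
Because the model jointly predicts disparity, occlusion, and confidence, the demo's outputs can be combined into a reliability mask before any downstream use. The snippet below is a generic post-processing sketch; the array names and the 0.5 threshold are illustrative, not this repository's API:

```python
import numpy as np

def reliable_disparity(disparity, confidence, occlusion, conf_thresh=0.5):
    """Mask disparity where confidence is low or the pixel is occluded.

    Returns the disparity with unreliable pixels set to NaN, plus the mask.
    """
    mask = (confidence >= conf_thresh) & (~occlusion.astype(bool))
    out = disparity.astype(np.float64).copy()
    out[~mask] = np.nan
    return out, mask

disp = np.array([[10.0, 20.0], [30.0, 40.0]])
conf = np.array([[0.9, 0.2], [0.8, 0.7]])
occ = np.array([[0, 0], [1, 0]])
masked, mask = reliable_disparity(disp, conf, occ)
```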

3. Run 3D Visualization Demo

To visualize the 3D output interactively, run demo/visualize_3d_booster.py or demo/visualize_3d_middlebury.py

python ./demo/visualize_3d_booster.py --model_type L 

Running visualize_3d_middlebury.py with --model_type XL should produce the result shown below.

If you fail to reproduce this, please open an issue.
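
Under the same pinhole assumptions, a depth map back-projects to the 3D points that these Open3D demos visualize. This is a generic numpy sketch with illustrative intrinsics (fx, fy, cx, cy), not code from the repository:

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Back-project a depth map (H, W) to an (H*W, 3) point cloud.

    X = (u - cx) * Z / fx,  Y = (v - cy) * Z / fy,  Z = depth.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

depth = np.full((2, 2), 2.0)  # toy depth map: flat plane 2 m away
pts = backproject(depth, fx=700.0, fy=700.0, cx=0.5, cy=0.5)
```

The resulting (N, 3) array can be handed to open3d.geometry.PointCloud via its points attribute for interactive viewing.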

🚀 Model Optimization (ONNX / TensorRT)

1. Export to ONNX (OPSET=18)

Use export_onnx.py to convert the model to ONNX:

python demo/export_onnx.py --model_type $MODEL_TYPE --img_width $IMG_WIDTH --img_height $IMG_HEIGHT
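
Because the graph is exported for a fixed --img_width/--img_height, inputs at other resolutions need resizing or padding before inference. A minimal zero-padding sketch (a common convention for fixed-shape stereo engines, assumed here rather than taken from this repository):

```python
import numpy as np

def pad_to(img, target_h, target_w):
    """Zero-pad an (H, W, C) image on the bottom/right to the export size."""
    h, w = img.shape[:2]
    if h > target_h or w > target_w:
        raise ValueError("input larger than the exported engine size")
    pad = [(0, target_h - h), (0, target_w - w)] + [(0, 0)] * (img.ndim - 2)
    return np.pad(img, pad, mode="constant")

img = np.ones((480, 640, 3), dtype=np.float32)
padded = pad_to(img, 512, 640)
```

Remember to crop the predicted disparity back to the original resolution afterwards.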

2. Export to TensorRT (tested with 10.13.3)

Use export_tensorrt.py to build a TensorRT engine:

python demo/export_tensorrt.py --model_type $MODEL_TYPE --img_width $IMG_WIDTH --img_height $IMG_HEIGHT --precision $PRECISION

or, equivalently, directly from the terminal:

trtexec --onnx={onnx_file_path} --saveEngine=./{trt_file_path} --fp16 --precisionConstraints=obey --layerPrecisions=node_linalg_vector_norm_2:fp32

Supported TensorRT precisions: fp32, tf32, fp16

📜 Citation

If you find our work useful for your research, please consider citing our paper:

@inproceedings{min2025s2m2,
  title={{S\textsuperscript{2}M\textsuperscript{2}}: Scalable Stereo Matching Model for Reliable Depth Estimation},
  author={Junhong Min and Youngpil Jeon and Jimin Kim and Minyong Choi},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2025}
}
