Official implementation of "S²M²: Scalable Stereo Matching Model for Reliable Depth Estimation", ICCV 2025.

S²M²: Scalable Stereo Matching Model for Reliable Depth Estimation (ICCV 2025)

Junhong Min¹*, Youngpil Jeon¹, Jimin Kim¹, Minyong Choi¹

Paper · Project Page

🤗 Notice

This repository and its contents are not related to any official Samsung Electronics products.
All resources are provided solely for non-commercial research and education purposes.


✨ Key Features

🧩 Model

  • Scalable stereo matching architecture
  • State-of-the-art performance on ETH3D (1st), Middlebury V3 (1st), and Booster (1st)
  • Joint estimation of disparity, occlusion, and confidence
  • Supports negative disparity estimation
  • Optimal under the pinhole camera model with ideal stereo rectification (vertical disparity < 2px)
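
Since the model assumes a pinhole camera with ideally rectified inputs, predicted disparity converts to metric depth in the usual way. The snippet below is a generic sketch of that conversion; the focal length, baseline, and array names are illustrative, not part of this repository's API:

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m, eps=1e-6):
    """Convert a rectified-stereo disparity map (pixels) to depth (meters).

    depth = focal_px * baseline_m / disparity; non-positive disparities
    (e.g. masked or negative values) are left as invalid (0).
    """
    depth = np.zeros_like(disparity, dtype=np.float64)
    valid = disparity > eps
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth

# Illustrative values: 700 px focal length, 12 cm baseline.
disp = np.array([[70.0, 35.0], [0.0, -5.0]])
depth = disparity_to_depth(disp, focal_px=700.0, baseline_m=0.12)
```

Note that negative disparities (supported by the model) have no valid pinhole depth, so they are treated as invalid here.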

⚙️ Code

  • ✅ FP16 / FP32 inference
  • ✅ TorchScript/ONNX/TensorRT export
  • ❌ Training pipeline (not included)

Note: The publicly released model weights differ from the version evaluated on Middlebury and ETH3D but are identical to the one used for the Booster benchmark. This implementation replaces the dynamic attention-based refinement module with a U-Net for stable ONNX export. It also adds an M variant and extends the training data with transparent objects.


🚀 Performance

Detailed benchmark results and visualizations are available on the Project Page.

Inference speed (FPS) on an NVIDIA RTX 5090 (float16, refine_iter=3):

Model  CH   NTR  Inference      640x480  1216x1024  2432x2048
S      128  1    torch.compile  59.8     26.4       6.6
                 TensorRT       124.0    59.4       7.3
M      192  2    torch.compile  32.4     12.5       3.1
                 TensorRT       66.7     18.3       3.8
L      256  3    torch.compile  25.8     7.6        2.0
                 TensorRT       46.6     11.2       2.4
XL     384  3    torch.compile  16.3     4.3        1.1
                 TensorRT       26.6     6.4        1.4
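
Numbers like these can be reproduced with a simple wall-clock harness along the following lines. The workload below is a placeholder standing in for model inference; warm-up iterations matter because torch.compile and TensorRT do their compilation work on the first calls:

```python
import time

def measure_fps(fn, n_warmup=10, n_iters=50):
    """Average FPS of a zero-argument callable, excluding warm-up runs."""
    for _ in range(n_warmup):  # warm-up: triggers JIT / engine compilation
        fn()
    start = time.perf_counter()
    for _ in range(n_iters):
        fn()
    elapsed = time.perf_counter() - start
    return n_iters / elapsed

# Placeholder workload; replace with the model's forward pass.
fps = measure_fps(lambda: sum(range(10_000)))
```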

🔧 Installation

We recommend Python 3.10, PyTorch 2.9, CUDA 12.9, cuDNN 9.1.0, and TensorRT 10.13.3 with Anaconda.

git clone https://github.com/junhong-3dv/s2m2
cd s2m2
conda env create -n s2m2 -f environment.yml
conda activate s2m2
pip install -e .

If the environment setup via environment.yml doesn't work smoothly, you can manually install the main dependencies with:

pip install torch torchvision opencv-python open3d onnx onnxruntime-gpu onnxscript tensorrt-cu12==10.13.3.9 --extra-index-url https://pypi.nvidia.com/tensorrt-cu12-libs

That should cover most of the required packages for running the demo.
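
As a quick sanity check before running the demo, you can verify that the main dependencies are importable. The package list below mirrors the pip command above; onnxruntime and tensorrt are only needed for the export paths:

```python
import importlib.util

def missing_packages(names):
    """Return the subset of module names that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

required = ["torch", "torchvision", "cv2", "open3d", "onnx"]
optional = ["onnxruntime", "tensorrt"]
missing = missing_packages(required)
```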

🚀 Pre-trained Models and Inference

1. Download Pre-trained Models

Create a directory for weights and download the desired models from the links below.

mkdir -p weights/pretrain_weights
Model  Download  Model Size
S      Download  26.5M
M      Download  80.4M
L      Download  181M
XL     Download  406M

2. Run Basic Demo

To generate a result for a single input, run demo/visualize_2d_simple.py.

python ./demo/visualize_2d_simple.py --model_type XL --num_refine 3 
arg               default  type        help
--model_type      'XL'     str         select model type: [S, M, L, XL]
--num_refine      3        int         number of local iterative refinement steps
--torch_compile   False    store_true  apply torch.compile
--allow_negative  False    store_true  allow negative disparity
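
Because the model jointly predicts disparity, occlusion, and confidence, the demo's outputs can be combined into a reliability mask before any downstream use. The snippet below is a generic post-processing sketch; the array names and the 0.5 threshold are illustrative, not this repository's API:

```python
import numpy as np

def reliable_disparity(disparity, confidence, occlusion, conf_thresh=0.5):
    """Mask disparity where confidence is low or the pixel is occluded.

    Returns the disparity with unreliable pixels set to NaN, plus the mask.
    """
    mask = (confidence >= conf_thresh) & (~occlusion.astype(bool))
    out = disparity.astype(np.float64).copy()
    out[~mask] = np.nan
    return out, mask

disp = np.array([[10.0, 20.0], [30.0, 40.0]])
conf = np.array([[0.9, 0.2], [0.8, 0.7]])
occ = np.array([[0, 0], [1, 0]])
masked, mask = reliable_disparity(disp, conf, occ)
```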

3. Run 3D Visualization Demo

To visualize the 3D output interactively, run demo/visualize_3d_booster.py or demo/visualize_3d_middlebury.py

python ./demo/visualize_3d_booster.py --model_type L 

Running visualize_3d_middlebury.py with --model_type XL should produce the result shown below.

If you fail to reproduce this, please open an issue.
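
Under the same pinhole assumptions, a depth map back-projects to the 3D points that these Open3D demos visualize. This is a generic numpy sketch with illustrative intrinsics (fx, fy, cx, cy), not code from the repository:

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Back-project a depth map (H, W) to an (H*W, 3) point cloud.

    X = (u - cx) * Z / fx,  Y = (v - cy) * Z / fy,  Z = depth.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

depth = np.full((2, 2), 2.0)  # toy depth map: flat plane 2 m away
pts = backproject(depth, fx=700.0, fy=700.0, cx=0.5, cy=0.5)
```

The resulting (N, 3) array can be handed to open3d.geometry.PointCloud via its points attribute for interactive viewing.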

🚀 Model Optimization (ONNX / TensorRT)

1. Export to ONNX (OPSET=18)

Use export_onnx.py to convert the model to ONNX:

python demo/export_onnx.py --model_type $MODEL_TYPE --img_width $IMG_WIDTH --img_height $IMG_HEIGHT
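
Because the graph is exported for a fixed --img_width/--img_height, inputs at other resolutions need resizing or padding before inference. A minimal zero-padding sketch (a common convention for fixed-shape stereo engines, assumed here rather than taken from this repository):

```python
import numpy as np

def pad_to(img, target_h, target_w):
    """Zero-pad an (H, W, C) image on the bottom/right to the export size."""
    h, w = img.shape[:2]
    if h > target_h or w > target_w:
        raise ValueError("input larger than the exported engine size")
    pad = [(0, target_h - h), (0, target_w - w)] + [(0, 0)] * (img.ndim - 2)
    return np.pad(img, pad, mode="constant")

img = np.ones((480, 640, 3), dtype=np.float32)
padded = pad_to(img, 512, 640)
```

Remember to crop the predicted disparity back to the original resolution afterwards.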

2. Export to TensorRT (tested with 10.13.3)

Use export_tensorrt.py to build a TensorRT engine:

python demo/export_tensorrt.py --model_type $MODEL_TYPE --img_width $IMG_WIDTH --img_height $IMG_HEIGHT --precision $PRECISION

or, equivalently, directly from the terminal:

trtexec --onnx={onnx_file_path} --saveEngine=./{trt_file_path} --fp16 --precisionConstraints=obey --layerPrecisions=node_linalg_vector_norm_2:fp32

Supported TensorRT precisions: fp32, tf32, fp16

📜 Citation

If you find our work useful for your research, please consider citing our paper:

@inproceedings{min2025s2m2,
  title={{S\textsuperscript{2}M\textsuperscript{2}}: Scalable Stereo Matching Model for Reliable Depth Estimation},
  author={Junhong Min and Youngpil Jeon and Jimin Kim and Minyong Choi},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2025}
}
