This repository and its contents are not related to any official Samsung Electronics products.
All resources are provided solely for non-commercial research and education purposes.
- Scalable stereo matching architecture
- State-of-the-art performance on ETH3D (1st), Middlebury V3 (1st), and Booster (1st)
- Joint estimation of disparity, occlusion, and confidence
- Supports negative disparity estimation
- Optimal under the pinhole camera model with ideal stereo rectification (vertical disparity < 2px)
- ✅ FP16 / FP32 inference
- ✅ TorchScript/ONNX/TensorRT export
- ❌ Training pipeline (not included)
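Since the model assumes a pinhole camera with ideal rectification (see the feature list above), its disparity output converts to metric depth via depth = f·B/d. A minimal sketch of that conversion; the focal length and baseline values below are illustrative, not taken from this repository:

```python
def disparity_to_depth(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Convert disparity (pixels) to depth (meters) under the pinhole model:
    depth = focal_length * baseline / disparity."""
    if disparity_px <= 0:
        # zero/negative disparity has no valid positive pinhole depth
        raise ValueError("non-positive disparity has no valid pinhole depth")
    return focal_px * baseline_m / disparity_px

# illustrative values: f = 1000 px, B = 0.1 m, d = 50 px  ->  depth = 2.0 m
print(disparity_to_depth(50.0, 1000.0, 0.1))  # -> 2.0
```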
Note: The publicly released model weights differ from the version used for the Middlebury and ETH3D benchmarks but are identical to the one used for the Booster benchmark. This implementation replaces the dynamic attention-based refinement module with a U-Net for stable ONNX export. It also adds an M variant and extends the training data with transparent objects.
Detailed benchmark results and visualizations are available on the Project Page.
**Inference Speed (FPS)** on **NVIDIA RTX 5090** (float16 + refine_iter=3):
| Model | CH | NTR | Inference | 640x480 | 1216x1024 | 2432x2048 |
|---|---|---|---|---|---|---|
| S | 128 | 1 | torch.compile | 59.8 | 26.4 | 6.6 |
|  |  |  | TensorRT | 124.0 | 59.4 | 7.3 |
| M | 192 | 2 | torch.compile | 32.4 | 12.5 | 3.1 |
|  |  |  | TensorRT | 66.7 | 18.3 | 3.8 |
| L | 256 | 3 | torch.compile | 25.8 | 7.6 | 2.0 |
|  |  |  | TensorRT | 46.6 | 11.2 | 2.4 |
| XL | 384 | 3 | torch.compile | 16.3 | 4.3 | 1.1 |
|  |  |  | TensorRT | 26.6 | 6.4 | 1.4 |
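For latency budgeting, the FPS figures in the table above translate directly to per-frame latency via latency_ms = 1000 / fps. A small helper using numbers from the table (S model at 640x480):

```python
def fps_to_ms(fps: float) -> float:
    """Per-frame latency in milliseconds implied by a throughput in FPS."""
    return 1000.0 / fps

# S model at 640x480, numbers from the table above
print(round(fps_to_ms(59.8), 1))   # torch.compile -> 16.7 ms/frame
print(round(fps_to_ms(124.0), 1))  # TensorRT      -> 8.1 ms/frame
```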
We recommend using Python 3.10, PyTorch 2.9, CUDA 12.9, cuDNN 9.1.0, and TensorRT 10.13.3 with Anaconda.
git clone https://github.com/junhong-3dv/s2m2
cd s2m2
conda env create -n s2m2 -f environment.yml
conda activate s2m2
pip install -e .

If the environment setup via .yml doesn’t work smoothly, you can manually install the main dependencies with:
pip install torch torchvision opencv-python open3d onnx onnxruntime-gpu onnxscript tensorrt-cu12==10.13.3.9 --extra-index-url https://pypi.nvidia.com/tensorrt-cu12-libs
That should cover most of the required packages for running the demo.
Create a directory for weights and download the desired models from the links below.
mkdir weights
mkdir weights/pretrain_weights

| Model | Download | Model Size |
|---|---|---|
| S | Download | 26.5M |
| M | Download | 80.4M |
| L | Download | 181M |
| XL | Download | 406M |
To generate a result for a single input, run demo/visualize_2d_simple.py.
python ./demo/visualize_2d_simple.py --model_type XL --num_refine 3

| arg | default | type | help |
|---|---|---|---|
| --model_type | 'XL' | str | select model type: [S,M,L,XL] |
| --num_refine | 3 | int | number of local refinement iterations |
| --torch_compile | False | store_true | apply torch.compile |
| --allow_negative | False | store_true | allow negative disparity |
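The argument table above maps onto a standard argparse CLI; a sketch of how those flags parse (reconstructed from the table — the actual demo script may differ, and boolean switches are assumed to be argparse store_true actions):

```python
import argparse

# CLI sketch reconstructed from the argument table; names/defaults from the table,
# the real visualize_2d_simple.py may define more options.
parser = argparse.ArgumentParser(description="visualize_2d_simple.py (sketch)")
parser.add_argument("--model_type", default="XL", type=str, choices=["S", "M", "L", "XL"])
parser.add_argument("--num_refine", default=3, type=int)
parser.add_argument("--torch_compile", action="store_true")
parser.add_argument("--allow_negative", action="store_true")

args = parser.parse_args(["--model_type", "L", "--torch_compile"])
print(args.model_type, args.num_refine, args.torch_compile, args.allow_negative)
# -> L 3 True False
```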
To visualize the 3D output interactively, run demo/visualize_3d_booster.py or demo/visualize_3d_middlebury.py.

python ./demo/visualize_3d_booster.py --model_type L

For visualize_3d_middlebury.py with --model_type XL, the result should look like the image below.
If you fail to reproduce this, let us know.

Use export_onnx.py to convert the model to ONNX:
python demo/export_onnx.py --model_type $MODEL_TYPE --img_width $IMG_WIDTH --img_height $IMG_HEIGHT

Use export_tensorrt.py to build a TensorRT engine:
python demo/export_tensorrt.py --model_type $MODEL_TYPE --img_width $IMG_WIDTH --img_height $IMG_HEIGHT --precision $PRECISION

Or run trtexec directly in the terminal:
trtexec --onnx={onnx_file_path} --saveEngine=./{trt_file_path} --fp16 --precisionConstraints=obey --layerPrecisions=node_linalg_vector_norm_2:fp32

Supported TensorRT precisions: fp32, tf32, fp16
If you find our work useful for your research, please consider citing our paper:
@inproceedings{min2025s2m2,
title={{S\textsuperscript{2}M\textsuperscript{2}}: Scalable Stereo Matching Model for Reliable Depth Estimation},
author={Junhong Min and Youngpil Jeon and Jimin Kim and Minyong Choi},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
year={2025}
}