This repository contains the source code for the paper "Self-supervised Multi-scale Adversarial Regression Network for Stereo Disparity Estimation".
Deep learning approaches have significantly contributed to recent progress in stereo matching. These deep stereo matching methods are usually based on supervised training, which requires a large amount of high-quality ground-truth depth map annotations. However, collecting large amounts of ground-truth depth data is very expensive. Moreover, only a limited quantity of stereo vision training data is currently available, obtained either from active sensors (LiDAR, ToF cameras) or from computer graphics simulations, and it does not meet the requirements of deep supervised training.

Here, we propose a novel deep stereo approach called "Self-supervised Multi-scale Adversarial Regression Network (SMAR-Net)", which relaxes the need for ground-truth depth maps during training. Specifically, we design a two-stage network. The first stage is a disparity regressor, in which a regression network estimates disparity values from stacked stereo image pairs. The stereo image stacking method is a novel contribution: the stack not only contains the spatial appearances of the stereo images but also implies matching correspondences at different disparity values. In the second stage, a synthetic left image is generated based on the left-right consistency assumption.

Our network is trained by minimizing a hybrid loss function composed of a content loss and an adversarial loss. The content loss minimizes the average warping error between the synthetic images and the real ones. In contrast to the conventional generative adversarial loss, our proposed adversarial loss penalizes mismatches using multi-scale features. This constrains the synthetic image to be pixel-wise identical to the real image rather than merely belonging to the same distribution. The combined use of multi-scale feature extraction in both the content loss and the adversarial loss further improves the adaptability of SMAR-Net in ill-posed regions.
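The two core ideas above (disparity-shifted stacking of the stereo pair, and synthesizing the left image from the right one for the content loss) can be sketched in a few lines of NumPy. This is a minimal illustrative sketch, not the released implementation: the function names, the single-channel images, and the simple mean-absolute-error content loss are all assumptions for illustration.

```python
import numpy as np

def stack_stereo_pair(left, right, max_disp):
    """Stack the left image with copies of the right image shifted by each
    candidate disparity. Pixels that match at disparity d become spatially
    aligned in the d-th slice, so the stack encodes both appearance and
    candidate correspondences. (Illustrative sketch, grayscale images.)"""
    h, w = left.shape
    slices = [left]
    for d in range(max_disp + 1):
        shifted = np.zeros_like(right)
        shifted[:, d:] = right[:, : w - d]
        slices.append(shifted)
    return np.stack(slices, axis=0)  # shape: (max_disp + 2, H, W)

def warp_right_to_left(right, disparity):
    """Synthesize a left image from the right one under the left-right
    consistency assumption: I_left_syn(y, x) = I_right(y, x - d(y, x)).
    Nearest-neighbor sampling with border clipping, for simplicity."""
    h, w = right.shape
    xs = np.arange(w)[None, :] - disparity           # source columns
    xs = np.clip(np.round(xs).astype(int), 0, w - 1)
    ys = np.repeat(np.arange(h)[:, None], w, axis=1)
    return right[ys, xs]

def content_loss(left, synthetic_left):
    """Average warping error between the real and synthetic left images
    (an L1 photometric error is assumed here for illustration)."""
    return np.abs(left - synthetic_left).mean()
```

In the actual network the warping is differentiable (e.g. bilinear sampling) so the photometric error can be backpropagated to the disparity regressor, and the adversarial branch compares multi-scale features of the real and synthetic left images rather than raw pixels.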
Experiments on multiple benchmark datasets show that SMAR-Net outperforms the current state-of-the-art self-supervised methods and achieves comparable outcomes to supervised methods.
We compared SMAR-Net with three self-supervised stereo matching methods. We used the test split of KITTI 2015 to train the networks in a self-supervised manner and used the training split, which has ground-truth disparity, to evaluate the performance of all methods.
| Method | >2px | >3px | >5px | Mean error |
|---|---|---|---|---|
| Zhou et al. | 4.85% | 3.56% | 2.43% | 0.74px |
| Tonioni et al. | 4.01% | 2.73% | 2.05% | 0.67px |
| Lai et al. | 3.91% | 2.58% | 1.83% | 0.63px |
| SMAR-Net | 3.71% | 2.42% | 1.75% | 0.61px |
We also tested SMAR-Net on the KITTI VO dataset.
In addition, we collected almost 10,000 pairs of stereo images at Beihang University to train and test SMAR-Net. Here are the visualized results.


