This repository implements a lightweight adaptive scaling framework that corrects scale ambiguity in monocular depth estimation. The method learns to predict an image-specific scaling factor that, when applied to relative depth predictions (e.g., from Depth Anything V2), significantly improves metric depth accuracy, all without extra sensors or modifications to the base depth model.
```bash
git clone --recursive https://github.com/tae-h-yang/adaptive-depth-estimation.git
cd adaptive-depth-estimation
```

Be sure to source the setup script before running anything:

```bash
source setup.sh
```

This installs the required dependencies and sets up environment variables.
Download the NYU Depth V2 dataset using these instructions, and place the extracted files into the datasets/ directory:
Place the pretrained Depth Anything V2 checkpoint in the checkpoints/ directory:
This project introduces a two-stage solution to improve monocular metric depth prediction:
- Offline Optimization: Compute the optimal per-image scale by minimizing the Wasserstein distance between predicted and ground-truth depth distributions.
- Online Correction: Train a lightweight CNN to predict a log-scale correction factor directly from an RGB image.
The predicted scaling factor is applied to the base model's output at inference time, yielding improved metric accuracy in diverse scenes.
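The two stages can be sketched as follows. This is a minimal illustration, not the repository's actual code: the function names (`optimal_scale`, `apply_correction`) and the log-space grid search are assumptions; the repository may solve the offline problem differently.

```python
import numpy as np
from scipy.stats import wasserstein_distance

def optimal_scale(pred_depth, gt_depth, num_candidates=201):
    """Offline stage: grid-search the per-image scale that minimizes the
    Wasserstein distance between scaled predictions and ground truth.
    `pred_depth` and `gt_depth` are flat arrays of valid depth values."""
    # Initialize from the median ratio, then search in log-space around it.
    init = np.median(gt_depth) / np.median(pred_depth)
    log_scales = np.log(init) + np.linspace(-1.0, 1.0, num_candidates)
    dists = [wasserstein_distance(np.exp(s) * pred_depth, gt_depth)
             for s in log_scales]
    return float(np.exp(log_scales[int(np.argmin(dists))]))

def apply_correction(pred_depth, log_scale):
    """Online stage: apply the CNN-predicted log-scale correction factor
    to the base model's relative depth output."""
    return np.exp(log_scale) * pred_depth
```

In the full pipeline, the offline scales serve as regression targets for the lightweight CNN; at inference only `apply_correction` runs, so the overhead over the base model is a single forward pass of the small network.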
```
adaptive_depth/      # Core implementation of the adaptive scaling model
Depth-Anything-V2/   # Submodule for Depth Anything V2
datasets/            # NYU Depth V2 dataset (user-supplied)
checkpoints/         # Pretrained model weights
figures/             # Visualizations and plots
setup.sh             # Setup script for environment
```
This repository is open-sourced under the MIT License. See LICENSE for details.