Combined Fusion is a general-purpose, lightweight, zero-shot model that predicts monocular metric depth maps for both indoor and outdoor scenes.
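import torch
# Assumption: DEVICE is not defined in this snippet; pick the GPU when one is available.
DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'
# CombinedFusion is the model class from this repository; see example.py for the import.
cfModel = CombinedFusion()
cfModel.load_state_dict(torch.load('./CombinedFusion.pth', map_location='cpu'))
cfModel = cfModel.to(DEVICE).eval()

example.py also contains a full example of using the model. To run it, execute: python example.py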
The video below is an FPS comparison on a random video from the internet. Watch the FPS counter, and also the fine details the model preserves, for instance the man's hair just before he pushes the ball.
fps_compare.mp4
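For readers who want to take a similar FPS reading on their own footage, here is a minimal sketch of one way to time per-frame processing. It is not the script used to produce fps_compare.mp4; the names `measure_fps`, `process_frame`, and `warmup` are hypothetical and chosen only for illustration.

```python
import time
from typing import Any, Callable, Iterable


def measure_fps(process_frame: Callable[[Any], Any],
                frames: Iterable[Any],
                warmup: int = 2) -> float:
    """Return the average frames per second of process_frame.

    The first `warmup` frames are processed but not timed, because the
    first few calls of a model are often slower than steady state.
    """
    frames = list(frames)
    if len(frames) <= warmup:
        raise ValueError("need more frames than warmup iterations")

    # Untimed warmup calls.
    for frame in frames[:warmup]:
        process_frame(frame)

    # Timed calls, measured with a monotonic high-resolution clock.
    start = time.perf_counter()
    for frame in frames[warmup:]:
        process_frame(frame)
    elapsed = time.perf_counter() - start

    timed = len(frames) - warmup
    # Guard against a zero reading when process_frame is trivially fast.
    return timed / elapsed if elapsed > 0 else float("inf")
```

To time the depth model, `process_frame` would wrap the forward pass on one frame. When running on a GPU, calling `torch.cuda.synchronize()` at the end of `process_frame` makes the timer include work that was launched asynchronously.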
To fine-tune, use the DepthAnythingV2 metric fine-tuning code and replace its model with ours from the CombinedFusion folder.
This work is related to a paper currently under review. We will update this section with the official citation once the paper is accepted.