Monocular Depth Estimation
337 papers with code • 18 benchmarks • 26 datasets
Monocular Depth Estimation is the task of estimating the depth value (distance relative to the camera) of each pixel given a single (monocular) RGB image. This challenging task is a key prerequisite for scene understanding in applications such as 3D scene reconstruction, autonomous driving, and AR. State-of-the-art methods usually fall into one of two categories: designing a complex network that is powerful enough to directly regress the depth map, or splitting the input into bins or windows to reduce computational complexity. The most popular benchmarks are the KITTI and NYUv2 datasets. Models are typically evaluated using RMSE or absolute relative error.
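The two standard metrics mentioned above are straightforward to compute. A minimal sketch (the function name, mask convention, and treatment of invalid pixels are illustrative assumptions, not a specific benchmark's official evaluation code):

```python
import numpy as np

def depth_metrics(pred, gt, mask=None):
    """RMSE and absolute relative error between predicted and
    ground-truth depth maps, over valid pixels only.

    pred, gt: arrays of per-pixel depth in the same units (e.g. meters).
    mask: boolean array of valid pixels; by default, pixels with
          non-positive ground truth are treated as invalid (a common
          convention for sparse LiDAR ground truth -- an assumption here).
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    if mask is None:
        mask = gt > 0
    p, g = pred[mask], gt[mask]
    rmse = np.sqrt(np.mean((p - g) ** 2))      # root mean squared error
    abs_rel = np.mean(np.abs(p - g) / g)       # absolute relative error
    return rmse, abs_rel
```

Note that benchmark protocols often add further details (depth capping, eval crops, median scaling for self-supervised methods), so published numbers are not directly reproducible from this sketch alone.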
Most implemented papers
Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras
We present a novel method for simultaneous learning of depth, egomotion, object motion, and camera intrinsics from monocular videos, using only consistency across neighboring video frames as the supervision signal.
3D Packing for Self-Supervised Monocular Depth Estimation
Although cameras are ubiquitous, robotic platforms typically rely on active sensors like LiDAR for direct 3D perception.
Towards Better Generalization: Joint Depth-Pose Learning without PoseNet
In this work, we tackle the essential problem of scale inconsistency for self-supervised joint depth-pose learning.
S2R-DepthNet: Learning a Generalizable Depth-specific Structural Representation
S2R-DepthNet consists of: a) a Structure Extraction (STE) module, which extracts a domain-invariant structural representation from an image by disentangling the image into domain-invariant structure and domain-specific style components; b) a Depth-specific Attention (DSA) module, which learns task-specific knowledge to suppress depth-irrelevant structures for better depth estimation and generalization; and c) a Depth Prediction (DP) module to predict depth from the depth-specific representation.
Enforcing geometric constraints of virtual normal for depth prediction
Monocular depth prediction plays a crucial role in understanding 3D scene geometry.
Digging Into Self-Supervised Monocular Depth Estimation
Per-pixel ground-truth depth data is challenging to acquire at scale.
Virtual Normal: Enforcing Geometric Constraints for Accurate and Robust Depth Prediction
In this work, we show the importance of the high-order 3D geometric constraints for depth prediction.
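The core geometric object behind the virtual-normal constraint can be illustrated with a minimal sketch: reconstruct 3D points from the depth map, sample three non-collinear ones, and take the unit normal of the plane they span; matching these normals between prediction and ground truth enforces a higher-order constraint than per-pixel depth losses. The function below is an illustration of that geometric primitive, not the authors' implementation (sampling strategy, collinearity threshold, and loss formulation are all assumptions):

```python
import numpy as np

def virtual_normal(p0, p1, p2, eps=1e-8):
    """Unit normal of the 'virtual plane' through three 3D points.

    p0, p1, p2: length-3 arrays of 3D point coordinates (e.g. pixels
    back-projected through the camera intrinsics using their depths).
    Raises if the points are (nearly) collinear, since the plane is
    then ill-defined.
    """
    p0, p1, p2 = (np.asarray(p, dtype=np.float64) for p in (p0, p1, p2))
    n = np.cross(p1 - p0, p2 - p0)   # normal via the cross product
    norm = np.linalg.norm(n)
    if norm < eps:
        raise ValueError("points are (nearly) collinear")
    return n / norm
```

A loss in this spirit would compare `virtual_normal` of point triplets reconstructed from predicted depth against the same triplets reconstructed from ground-truth depth.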
Global-Local Path Networks for Monocular Depth Estimation with Vertical CutDepth
Depth estimation from a single image is an important task that can be applied to various fields in computer vision, and has grown rapidly with the development of convolutional neural networks.
ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth
ZoeD-M12-NK is the first model that can jointly train on multiple datasets (NYU Depth v2 and KITTI) without a significant drop in performance, and it achieves unprecedented zero-shot generalization performance on eight unseen datasets from both indoor and outdoor domains.
Neural Video Depth Stabilizer
Video depth estimation aims to infer temporally consistent depth.