Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation

Monocular depth estimation is a fundamental computer vision task. Recovering depth from a single image is geometrically ill-posed and requires scene understanding, so it is not surprising that the rise of deep learning has led to a breakthrough. The impressive progress of monocular depth estimators has mirrored the growth in model capacity, from relatively modest CNNs to large Transformer architectures. Still, monocular depth estimators tend to struggle when presented with images of unfamiliar content and layout, since their knowledge of the visual world is restricted by the data seen during training and challenged by zero-shot generalization to new domains. This motivates us to explore whether the extensive priors captured in recent generative diffusion models can enable better, more generalizable depth estimation. We introduce Marigold, a method for affine-invariant monocular depth estimation that is derived from Stable Diffusion and retains its rich prior knowledge. The estimator can be fine-tuned in a couple of days on a single GPU using only synthetic training data. It delivers state-of-the-art performance across a wide range of datasets, including over 20% performance gains in specific cases. Project page: https://marigoldmonodepth.github.io.
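
The "affine-invariant" qualifier means the model predicts depth only up to an unknown global scale and shift, which are recovered by least-squares alignment against ground truth before evaluation. Below is a minimal sketch of that alignment step in NumPy; the function name and implementation are illustrative, not the paper's code:

```python
import numpy as np

def align_scale_shift(pred: np.ndarray, gt: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Align an affine-invariant depth prediction to metric ground truth.

    Solves min_{s,t} || s * pred + t - gt ||^2 over the valid pixels in
    `mask`, then applies the recovered scale s and shift t to the whole map.
    """
    p = pred[mask].astype(np.float64)
    g = gt[mask].astype(np.float64)
    A = np.stack([p, np.ones_like(p)], axis=1)  # design matrix [pred, 1]
    (s, t), *_ = np.linalg.lstsq(A, g, rcond=None)
    return s * pred + t
```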

Results from the Paper


Ranked #6 on Monocular Depth Estimation on NYU-Depth V2 (using extra training data)

| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Monocular Depth Estimation | KITTI Eigen split | Marigold | absolute relative error | 0.099 | #49 |
| | | | RMSE | 3.304 | #38 |
| | | | RMSE log | 0.138 | #34 |
| | | | Delta < 1.25 | 0.916 | #35 |
| | | | Delta < 1.25^2 | 0.987 | #33 |
| | | | Delta < 1.25^3 | 0.996 | #34 |
| Monocular Depth Estimation | NYU-Depth V2 | Marigold | RMSE | 0.224 | #6 |
| | | | absolute relative error | 0.055 | #2 |
| | | | Delta < 1.25 | 0.964 | #10 |
| | | | Delta < 1.25^2 | 0.991 | #21 |
| | | | Delta < 1.25^3 | 0.998 | #18 |
| | | | log 10 | 0.024 | #2 |
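
The metrics above follow the standard definitions used on these benchmarks; the sketch below (helper name ours) shows how they are commonly computed from the aligned prediction. For example, the 0.055 absolute relative error on NYU-Depth V2 means the prediction deviates from ground truth by 5.5% on average.

```python
import numpy as np

def depth_metrics(pred: np.ndarray, gt: np.ndarray, mask: np.ndarray) -> dict:
    """Standard monocular depth metrics over valid pixels (after alignment)."""
    p, g = pred[mask], gt[mask]
    ratio = np.maximum(p / g, g / p)  # per-pixel threshold ratio
    return {
        "abs_rel":  np.mean(np.abs(p - g) / g),                      # absolute relative error
        "rmse":     np.sqrt(np.mean((p - g) ** 2)),                  # RMSE
        "rmse_log": np.sqrt(np.mean((np.log(p) - np.log(g)) ** 2)),  # RMSE log
        "log10":    np.mean(np.abs(np.log10(p) - np.log10(g))),      # log 10
        "delta1":   np.mean(ratio < 1.25),                           # Delta < 1.25
        "delta2":   np.mean(ratio < 1.25 ** 2),                      # Delta < 1.25^2
        "delta3":   np.mean(ratio < 1.25 ** 3),                      # Delta < 1.25^3
    }
```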
