PolyMaX: General Dense Prediction with Mask Transformer

Dense prediction tasks, such as semantic segmentation, depth estimation, and surface normal prediction, can be easily formulated as per-pixel classification (discrete outputs) or regression (continuous outputs). This per-pixel prediction paradigm has remained popular due to the prevalence of fully convolutional networks. However, on the recent frontier of segmentation tasks, the community has witnessed a paradigm shift from per-pixel prediction to cluster prediction with the emergence of transformer architectures, particularly mask transformers, which directly predict a label for a mask instead of for each pixel. Despite this shift, methods based on the per-pixel prediction paradigm still dominate the benchmarks of other dense prediction tasks that require continuous outputs, such as depth estimation and surface normal prediction. Motivated by the success of DORN and AdaBins in depth estimation, achieved by discretizing the continuous output space, we propose to generalize the cluster-prediction-based method to general dense prediction tasks. This allows us to unify dense prediction tasks with the mask transformer framework. Remarkably, the resulting model, PolyMaX, demonstrates state-of-the-art performance on three benchmarks of the NYUD-v2 dataset. We hope our simple yet effective design can inspire more research on exploiting mask transformers for more dense prediction tasks. Code and model will be made available.
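
To make the cluster-prediction formulation concrete: rather than regressing a value at every pixel, a mask transformer predicts K soft masks together with one output per cluster (e.g., a depth bin center or a surface-normal vector), and the dense continuous map is the per-pixel, probability-weighted combination of the cluster outputs, in the spirit of AdaBins. The PyTorch sketch below illustrates only this combination step; the function name `cluster_to_dense`, the tensor shapes, and the plain softmax weighting are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch of the cluster-prediction idea (not the authors' code):
# K soft masks + one continuous value per cluster -> dense continuous map.
import torch
import torch.nn.functional as F

def cluster_to_dense(mask_logits: torch.Tensor,
                     cluster_values: torch.Tensor) -> torch.Tensor:
    """Combine per-cluster masks and values into a dense prediction.

    mask_logits:    [B, K, H, W] per-pixel cluster-assignment logits.
    cluster_values: [B, K, C]    one continuous output per cluster
                                 (C=1 for depth, C=3 for surface normals).
    Returns:        [B, C, H, W] dense continuous prediction.
    """
    probs = F.softmax(mask_logits, dim=1)  # soft assignment of pixels to clusters
    # out[b, c, h, w] = sum_k probs[b, k, h, w] * cluster_values[b, k, c]
    return torch.einsum("bkhw,bkc->bchw", probs, cluster_values)

# Toy usage: depth estimation with K = 64 learned clusters ("bins").
B, K, H, W = 1, 64, 480, 640
depth = cluster_to_dense(torch.randn(B, K, H, W), torch.rand(B, K, 1))
print(depth.shape)  # torch.Size([1, 1, 480, 640])
```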


Datasets

NYU Depth v2

Results

| Task | Dataset | Model | Metric | Value | Global Rank |
|------|---------|-------|--------|-------|-------------|
| Surface Normals Estimation | NYU Depth v2 | PolyMaX (ConvNeXt-L) | % < 11.25 | 65.66 | #2 |
| | | | % < 22.5 | 82.28 | #2 |
| | | | % < 30 | 87.83 | #2 |
| | | | Mean Angle Error | 13.09 | #2 |
| | | | RMSE | 20.4 | #3 |
| Semantic Segmentation | NYU Depth v2 | PolyMaX (ConvNeXt-L) | Mean IoU | 58.08% | #5 |
| Monocular Depth Estimation | NYU Depth v2 | PolyMaX (ConvNeXt-L) | RMSE | 0.25 | #11 |
| | | | Absolute relative error | 0.067 | #11 |
| | | | Delta < 1.25 | 0.969 | #9 |
| | | | Delta < 1.25^2 | 0.9958 | #10 |
| | | | Delta < 1.25^3 | 0.999 | #4 |
| | | | log10 | 0.029 | #10 |
