TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Semantic Segmentation	ScanNet	ODIN	test mIoU	74.4	# 13
Semantic Segmentation	ScanNet	ODIN	val mIoU	77.8	# 2
3D Semantic Segmentation	ScanNet200	ODIN	val mIoU	40.5	# 1
3D Semantic Segmentation	ScanNet200	ODIN	test mIoU	36.8	# 2
3D Instance Segmentation	ScanNet200	ODIN	mAP	31.5	# 1
3D Instance Segmentation	ScanNet(v2)	ODIN	mAP	50.0	# 11
3D Instance Segmentation	ScanNet(v2)	ODIN	mAP@50	71.0	# 11
3D Instance Segmentation	ScanNet(v2)	ODIN	mAP@25	83.6	# 8
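
For readers unfamiliar with the mIoU metric reported above, here is a minimal sketch of how mean intersection-over-union is commonly computed from per-point labels. It is not the official ScanNet evaluation script: the function name `mean_iou`, the `ignore_label` convention, and the choice to skip classes that never occur are assumptions made for illustration. The table reports the value as a percentage, while this sketch returns a fraction.

```python
def mean_iou(pred, gt, num_classes, ignore_label=-1):
    """Mean IoU over classes, from flat lists of predicted and true labels.

    Hypothetical helper for illustration; not the benchmark's own code.
    """
    inter = [0] * num_classes  # per-class true positives
    union = [0] * num_classes  # per-class TP + FP + FN
    for p, g in zip(pred, gt):
        if g == ignore_label:  # assumption: unlabeled points are skipped
            continue
        if p == g:
            inter[g] += 1
            union[g] += 1
        else:
            union[g] += 1  # false negative for the true class
            if 0 <= p < num_classes:
                union[p] += 1  # false positive for the predicted class
    # Average only over classes that occur in the prediction or ground truth.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
print(f"mIoU = {100 * score:.1f}")
```

In this toy input, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.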

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/odin-a-single-model-for-2d-and-3d-perception/3d-instance-segmentation-on-scannet200)](https://paperswithcode.com/sota/3d-instance-segmentation-on-scannet200?p=odin-a-single-model-for-2d-and-3d-perception)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/odin-a-single-model-for-2d-and-3d-perception/3d-semantic-segmentation-on-scannet200)](https://paperswithcode.com/sota/3d-semantic-segmentation-on-scannet200?p=odin-a-single-model-for-2d-and-3d-perception)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/odin-a-single-model-for-2d-and-3d-perception/3d-instance-segmentation-on-scannetv2)](https://paperswithcode.com/sota/3d-instance-segmentation-on-scannetv2?p=odin-a-single-model-for-2d-and-3d-perception)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/odin-a-single-model-for-2d-and-3d-perception/semantic-segmentation-on-scannet)](https://paperswithcode.com/sota/semantic-segmentation-on-scannet?p=odin-a-single-model-for-2d-and-3d-perception)`

ODIN: A Single Model for 2D and 3D Perception

4 Jan 2024 · Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki

State-of-the-art models on contemporary 3D perception benchmarks like ScanNet consume and label dataset-provided 3D point clouds, obtained through post-processing of sensed multiview RGB-D images. They are typically trained in-domain, forgo large-scale 2D pre-training, and outperform alternatives that featurize the posed RGB-D multiview images instead. The gap in performance between methods that consume posed images and those that consume post-processed 3D point clouds has fueled the belief that 2D and 3D perception require distinct model architectures. In this paper, we challenge this view and propose ODIN (Omni-Dimensional INstance segmentation), a model that can segment and label both 2D RGB images and 3D point clouds, using a transformer architecture that alternates between 2D within-view and 3D cross-view information fusion. Our model differentiates 2D and 3D feature operations through the positional encodings of the tokens involved, which capture pixel coordinates for 2D patch tokens and 3D coordinates for 3D feature tokens. ODIN achieves state-of-the-art performance on the ScanNet200, Matterport3D and AI2THOR 3D instance segmentation benchmarks, and competitive performance on ScanNet, S3DIS and COCO. It outperforms all previous works by a wide margin when the sensed 3D point cloud is used in place of the point cloud sampled from the 3D mesh. When used as the 3D perception engine in an instructable embodied agent architecture, it sets a new state of the art on the TEACh action-from-dialogue benchmark. Our code and checkpoints can be found at the project website: https://odin-seg.github.io.
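
The alternation between 2D within-view and 3D cross-view fusion described in the abstract can be illustrated with a toy sketch. This is not the ODIN implementation: plain neighbor averaging stands in for the transformer attention layers, features are single floats, and every name (`fuse_2d_within_view`, `fuse_3d_cross_view`, `alternate_fusion`, the token fields, `radius`, `k`) is hypothetical. The only point it makes is that the 2D step picks neighbors by pixel coordinates inside one view, while the 3D step picks neighbors by 3D coordinates across all views.

```python
import math

# Toy token: {"view": int, "uv": (u, v), "xyz": (x, y, z), "feat": float}.
# All field and function names are assumptions for illustration only.


def fuse_2d_within_view(tokens, radius=1):
    """Average each feature with pixel-space neighbors from the same view."""
    out = []
    for t in tokens:
        nbrs = [
            s["feat"]
            for s in tokens
            if s["view"] == t["view"]
            and abs(s["uv"][0] - t["uv"][0]) <= radius
            and abs(s["uv"][1] - t["uv"][1]) <= radius
        ]
        out.append({**t, "feat": sum(nbrs) / len(nbrs)})
    return out


def fuse_3d_cross_view(tokens, k=2):
    """Average each feature with its k nearest tokens in 3D, from any view."""
    out = []
    for t in tokens:
        ranked = sorted(tokens, key=lambda s: math.dist(s["xyz"], t["xyz"]))
        nbrs = [s["feat"] for s in ranked[:k]]
        out.append({**t, "feat": sum(nbrs) / len(nbrs)})
    return out


def alternate_fusion(tokens, num_blocks=1):
    """Alternate the 2D and 3D fusion steps (averaging replaces attention)."""
    for _ in range(num_blocks):
        tokens = fuse_2d_within_view(tokens)
        tokens = fuse_3d_cross_view(tokens)
    return tokens


# Two views observing the same 3D point with different initial features.
tokens = [
    {"view": 0, "uv": (5, 5), "xyz": (1.0, 2.0, 3.0), "feat": 0.0},
    {"view": 1, "uv": (9, 2), "xyz": (1.0, 2.0, 3.0), "feat": 1.0},
]
fused = alternate_fusion(tokens)
print([t["feat"] for t in fused])
```

In this example the 2D step leaves both tokens unchanged, since each is alone in its view, and the 3D step then mixes them because they share a 3D location. For a single RGB image without depth, only the 2D step would apply.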

Code

ayushjain1144/odin official

Tasks

3D Instance Segmentation

3D Semantic Segmentation

Instance Segmentation

Semantic Segmentation

Datasets

MS COCO

ScanNet

Matterport3D

AI2-THOR

ALFRED

TEACh

ScanNet200

Results from the Paper

Ranked #1 on 3D Instance Segmentation on ScanNet200

