TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
3D Instance Segmentation	S3DIS	SPFormer	mRec	71.1	# 4
3D Instance Segmentation	S3DIS	SPFormer	mPrec	74.0	# 5
3D Instance Segmentation	S3DIS	SPFormer	AP@50	69.2	# 7
3D Instance Segmentation	ScanNet(v2)	SPFormer	mAP	54.9	# 7
3D Instance Segmentation	ScanNet(v2)	SPFormer	mAP @ 50	77.0	# 5

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/superpoint-transformer-for-3d-scene-instance/3d-instance-segmentation-on-scannetv2)](https://paperswithcode.com/sota/3d-instance-segmentation-on-scannetv2?p=superpoint-transformer-for-3d-scene-instance)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/superpoint-transformer-for-3d-scene-instance/3d-instance-segmentation-on-s3dis)](https://paperswithcode.com/sota/3d-instance-segmentation-on-s3dis?p=superpoint-transformer-for-3d-scene-instance)`

Superpoint Transformer for 3D Scene Instance Segmentation

28 Nov 2022 · Jiahao Sun, Chunmei Qing, Junpeng Tan, Xiangmin Xu

Most existing methods realize 3D instance segmentation by extending models designed for 3D object detection or 3D semantic segmentation. However, these indirect methods suffer from two drawbacks: 1) Imprecise bounding boxes or unsatisfactory semantic predictions limit the performance of the overall 3D instance segmentation framework. 2) Existing methods require a time-consuming intermediate aggregation step. To address these issues, this paper proposes a novel end-to-end 3D instance segmentation method based on Superpoint Transformer, named SPFormer. It groups potential features from point clouds into superpoints, and directly predicts instances through query vectors without relying on the results of object detection or semantic segmentation. The key step in this framework is a novel transformer-based query decoder that captures instance information through a superpoint cross-attention mechanism and generates the superpoint masks of the instances. Through bipartite matching based on superpoint masks, SPFormer can be trained without the intermediate aggregation step, which accelerates the network. Extensive experiments on the ScanNetv2 and S3DIS benchmarks verify that our method is concise yet efficient. Notably, SPFormer outperforms the compared state-of-the-art methods by 4.3% mAP on the ScanNetv2 hidden test set while maintaining fast inference (247 ms per frame). Code is available at https://github.com/sunjiahao1999/SPFormer.
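To make the two ideas in the abstract more concrete, here is a minimal sketch of pooling point features into superpoints and of bipartite matching between predicted superpoint masks and ground-truth instances. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the pooling operator or the matching cost, so mean pooling and a Dice-based cost are assumed here, and a brute-force search over assignments stands in for a proper assignment solver such as the Hungarian algorithm. The function names `pool_to_superpoints`, `dice_cost`, and `bipartite_match` are hypothetical.

```python
from itertools import permutations


def pool_to_superpoints(point_feats, superpoint_ids, num_superpoints):
    """Average per-point features inside each superpoint (mean pooling is an assumption)."""
    dim = len(point_feats[0])
    sums = [[0.0] * dim for _ in range(num_superpoints)]
    counts = [0] * num_superpoints
    for feat, sp in zip(point_feats, superpoint_ids):
        counts[sp] += 1
        for d in range(dim):
            sums[sp][d] += feat[d]
    # Empty superpoints keep an all-zero feature vector.
    return [[s / max(c, 1) for s in row] for row, c in zip(sums, counts)]


def dice_cost(pred_mask, gt_mask, eps=1e-6):
    """1 - Dice overlap between a soft predicted mask and a binary ground-truth
    mask, both defined over superpoints (the cost choice is an assumption)."""
    inter = sum(p * g for p, g in zip(pred_mask, gt_mask))
    total = sum(pred_mask) + sum(gt_mask)
    return 1.0 - (2.0 * inter + eps) / (total + eps)


def bipartite_match(pred_masks, gt_masks):
    """Assign each ground-truth instance to a distinct query so that the total
    cost is minimal. Brute force: only suitable for tiny illustrative inputs."""
    n_q, n_gt = len(pred_masks), len(gt_masks)
    assert n_q >= n_gt, "need at least as many queries as ground-truth instances"
    cost = [[dice_cost(p, g) for g in gt_masks] for p in pred_masks]
    best, best_total = (), float("inf")
    for queries in permutations(range(n_q), n_gt):
        total = sum(cost[q][j] for j, q in enumerate(queries))
        if total < best_total:
            best, best_total = queries, total
    return [(q, j) for j, q in enumerate(best)]  # (query index, gt index) pairs


# Three points grouped into two superpoints.
pooled = pool_to_superpoints([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], [0, 0, 1], 2)

# Three queries predicting soft masks over four superpoints; two GT instances.
pred_masks = [
    [0.9, 0.8, 0.1, 0.0],
    [0.1, 0.2, 0.9, 0.7],
    [0.5, 0.5, 0.5, 0.5],
]
gt_masks = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
]
matches = bipartite_match(pred_masks, gt_masks)
print(pooled)
print(matches)
```

In this toy example query 1 is paired with the first ground-truth instance and query 0 with the second, while query 2 stays unmatched. In a set-prediction training setup of this kind, unmatched queries are typically supervised as "no instance"; because the loss is computed directly on the matched masks, no separate aggregation step is needed.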


Code


sunjiahao1999/spformer official


Tasks


3D Instance Segmentation

3D Object Detection

3D Semantic Segmentation

Instance Segmentation

object-detection

Object Detection

Segmentation

Semantic Segmentation

Datasets

ScanNet

S3DIS

Results from the Paper


Ranked #5 on 3D Instance Segmentation on ScanNet(v2)


Methods


Adam • BPE • Dense Connections • Dropout • Label Smoothing • Layer Normalization • Linear Layer • Multi-Head Attention • Position-Wise Feed-Forward Layer • Residual Connection • Scaled Dot-Product Attention • Softmax • SPEED • Test • Transformer
