TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
3D Point Cloud Classification	ModelNet40	PTv2	Overall Accuracy	94.2	# 14
3D Point Cloud Classification	ModelNet40	PTv2	Mean Accuracy	91.6	# 8
LIDAR Semantic Segmentation	nuScenes	PTv2	test mIoU	0.826	# 2
LIDAR Semantic Segmentation	nuScenes	PTv2	val mIoU	0.802	# 2
3D Semantic Segmentation	nuScenes	PTv2	mIoU	82.6%	# 1
3D Semantic Segmentation	S3DIS	PointTransformerV2	mIoU (Area-5)	71.6	# 2
Semantic Segmentation	S3DIS Area5	PTv2	mIoU	72.6	# 10
Semantic Segmentation	S3DIS Area5	PTv2	oAcc	91.6	# 8
Semantic Segmentation	S3DIS Area5	PTv2	mAcc	78.0	# 14
Semantic Segmentation	S3DIS Area5	PTv2	Number of params	N/A	# 2
Semantic Segmentation	ScanNet	PTv2	test mIoU	75.2	# 10
Semantic Segmentation	ScanNet	PTv2	val mIoU	75.4	# 10
3D Semantic Segmentation	SemanticKITTI	PTv2	test mIoU	72.6%	# 7
3D Semantic Segmentation	SemanticKITTI	PTv2	val mIoU	70.3%	# 6
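
The segmentation rows above report oAcc, mAcc, and mIoU. As a reminder of how these are conventionally derived from a per-class confusion matrix, here is a minimal sketch. The function name `segmentation_metrics` and the toy matrix are illustrative assumptions, not part of any benchmark toolkit, and the sketch assumes every class occurs at least once in the ground truth.

```python
import numpy as np

def segmentation_metrics(conf):
    """Return (oAcc, mAcc, mIoU) for a square confusion matrix.

    conf[i, j] counts points whose ground-truth class is i and whose
    predicted class is j. Hypothetical helper for illustration only.
    """
    conf = np.asarray(conf, dtype=np.float64)
    tp = np.diag(conf)            # correctly labeled points per class
    gt = conf.sum(axis=1)         # ground-truth points per class
    pred = conf.sum(axis=0)       # predicted points per class
    o_acc = tp.sum() / conf.sum()             # overall accuracy
    m_acc = np.mean(tp / gt)                  # mean per-class accuracy
    m_iou = np.mean(tp / (gt + pred - tp))    # mean intersection over union
    return o_acc, m_acc, m_iou

# Toy two-class example (made-up counts, not benchmark data).
o_acc, m_acc, m_iou = segmentation_metrics([[3, 1], [1, 5]])
print(f"oAcc={o_acc:.3f} mAcc={m_acc:.3f} mIoU={m_iou:.3f}")
```

Note that the table mixes fractional (0.826) and percentage (82.6%) forms of mIoU; multiplying the function's output by 100 gives the percentage form.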

Point Transformer V2: Grouped Vector Attention and Partition-based Pooling

11 Oct 2022 · Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, Hengshuang Zhao

As a pioneering work exploring transformer architecture for 3D point cloud understanding, Point Transformer achieves impressive results on multiple highly competitive benchmarks. In this work, we analyze the limitations of Point Transformer and propose Point Transformer V2, a powerful and efficient model with novel designs that overcome those limitations. In particular, we first propose grouped vector attention, which is more effective than the previous version of vector attention. Inheriting the advantages of both learnable weight encoding and multi-head attention, we present a highly effective implementation of grouped vector attention with a novel grouped weight encoding layer. We also strengthen the position information for attention with an additional position encoding multiplier. Furthermore, we design novel and lightweight partition-based pooling methods that enable better spatial alignment and more efficient sampling. Extensive experiments show that our model outperforms its predecessor and achieves state-of-the-art results on several challenging 3D point cloud understanding benchmarks, including 3D point cloud segmentation on ScanNet v2 and S3DIS and 3D point cloud classification on ModelNet40. Our code will be available at https://github.com/Gofinge/PointTransformerV2.
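
To make two of the ideas in the abstract concrete, the following is a rough NumPy sketch, not the authors' implementation. It assumes that the attention weights are computed from the difference between query and key, that the weight encoding is a single linear map (`w_enc`, hypothetical) producing one weight per channel group, and that partition-based pooling uses a uniform grid with max-pooled features and mean-pooled coordinates. The position encoding multiplier is omitted. All function and parameter names are invented for illustration.

```python
import numpy as np

def grouped_vector_attention(q, k, v, w_enc, groups):
    """Aggregate n neighbor values for one query point.

    q: (c,) query; k, v: (n, c) neighbor keys and values.
    w_enc: (c, groups) hypothetical linear weight encoding.
    Channels in the same group share one attention weight.
    """
    n, c = v.shape
    relation = q[None, :] - k                        # (n, c) subtraction relation
    logits = relation @ w_enc                        # (n, groups)
    logits = logits - logits.max(axis=0, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=0, keepdims=True)  # softmax over neighbors
    v_grouped = v.reshape(n, groups, c // groups)    # split channels into groups
    out = (weights[:, :, None] * v_grouped).sum(axis=0)
    return out.reshape(c)

def grid_pool(coords, feats, grid_size):
    """Pool points that fall into the same non-overlapping grid cell.

    coords: (n, 3); feats: (n, c). Returns pooled coordinates (mean),
    pooled features (max), and each point's cell index.
    """
    cell = np.floor(coords / grid_size).astype(np.int64)
    _, inverse = np.unique(cell, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)                    # 1-D across NumPy versions
    m = int(inverse.max()) + 1
    counts = np.bincount(inverse, minlength=m)
    pooled_coords = np.zeros((m, coords.shape[1]))
    np.add.at(pooled_coords, inverse, coords)        # sum coordinates per cell
    pooled_coords /= counts[:, None]                 # then average them
    pooled_feats = np.full((m, feats.shape[1]), -np.inf)
    np.maximum.at(pooled_feats, inverse, feats)      # per-cell feature max
    return pooled_coords, pooled_feats, inverse

# Toy usage with random data.
rng = np.random.default_rng(0)
out = grouped_vector_attention(rng.normal(size=8), rng.normal(size=(5, 8)),
                               rng.normal(size=(5, 8)), rng.normal(size=(8, 2)), groups=2)
pc, pf, idx = grid_pool(rng.uniform(0, 2, size=(10, 3)), rng.normal(size=(10, 8)), 1.0)
print(out.shape, pc.shape, pf.shape)
```

In this sketch, setting `groups` equal to the channel count gives every channel its own weight, while `groups=1` shares a single weight across all channels; the grouped form sits between the two. The returned cell index can be reused to map pooled features back to the original points when unpooling.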

Code

Pointcept/Pointcept (official), 1,108 stars

Pointcept/PointTransformerV2 (official), 329 stars

Tasks

3D Point Cloud Classification

3D Semantic Segmentation

LIDAR Semantic Segmentation

Point Cloud Classification

Point Cloud Segmentation

Position

Semantic Segmentation

Semantic Segmentation on ScanNet

Datasets

nuScenes

ModelNet

ScanNet

SemanticKITTI

S3DIS

Results from the Paper

Ranked #1 on 3D Semantic Segmentation on nuScenes

Methods

Adam • BPE • Dense Connections • Dropout • Label Smoothing • Layer Normalization • Linear Layer • Multi-Head Attention • Position-Wise Feed-Forward Layer • Residual Connection • Scaled Dot-Product Attention • Softmax • Transformer
