Point Transformer

Self-attention networks have revolutionized natural language processing and are making impressive strides in image analysis tasks such as image classification and object detection. Inspired by this success, we investigate the application of self-attention networks to 3D point cloud processing. We design self-attention layers for point clouds and use these to construct self-attention networks for tasks such as semantic scene segmentation, object part segmentation, and object classification. Our Point Transformer design improves upon prior work across domains and tasks. For example, on the challenging S3DIS dataset for large-scale semantic scene segmentation, the Point Transformer attains an mIoU of 70.4% on Area 5, outperforming the strongest prior model by 3.3 absolute percentage points and crossing the 70% mIoU threshold for the first time.
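The building block described in the abstract is a self-attention layer that operates on local neighborhoods of points, using vector attention: per-channel attention weights are produced by an MLP over feature differences plus a learned relative-position encoding, rather than by scalar dot products. Below is a minimal sketch of such a layer, assuming a PyTorch implementation with k-nearest-neighbor neighborhoods; the module names, tensor shapes, and brute-force neighbor search are illustrative assumptions, not the authors' reference code.

```python
# A minimal sketch of a vector self-attention (Point Transformer-style) layer in PyTorch.
# Shapes, the fixed-k neighborhood, and all module names are assumptions for illustration.
import torch
import torch.nn as nn


class PointVectorAttention(nn.Module):
    def __init__(self, dim: int, k: int = 16):
        super().__init__()
        self.k = k
        self.to_q = nn.Linear(dim, dim)            # query projection (phi)
        self.to_k = nn.Linear(dim, dim)            # key projection (psi)
        self.to_v = nn.Linear(dim, dim)            # value projection (alpha)
        self.pos_enc = nn.Sequential(              # relative position encoding (theta)
            nn.Linear(3, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.attn_mlp = nn.Sequential(             # maps relations to per-channel weights (gamma)
            nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, feats: torch.Tensor, coords: torch.Tensor) -> torch.Tensor:
        # feats: (N, dim) per-point features, coords: (N, 3) xyz positions
        # k-nearest neighbors by Euclidean distance (brute force for clarity)
        dists = torch.cdist(coords, coords)                    # (N, N)
        knn_idx = dists.topk(self.k, largest=False).indices    # (N, k)

        q = self.to_q(feats)                                   # (N, dim)
        k_feats = self.to_k(feats)[knn_idx]                    # (N, k, dim)
        v_feats = self.to_v(feats)[knn_idx]                    # (N, k, dim)

        rel_pos = coords.unsqueeze(1) - coords[knn_idx]        # (N, k, 3)
        delta = self.pos_enc(rel_pos)                          # (N, k, dim)

        # Vector attention: per-channel weights from the subtraction relation
        weights = self.attn_mlp(q.unsqueeze(1) - k_feats + delta)
        weights = torch.softmax(weights, dim=1)                # normalize over neighbors

        return (weights * (v_feats + delta)).sum(dim=1)        # (N, dim)
```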


Datasets

S3DIS, ModelNet40, ShapeNet-Part

Results from the Paper


Task                           Dataset        Model             Metric                Value  Global Rank
3D Point Cloud Classification  ModelNet40     PointTransformer  Overall Accuracy      93.7   # 11
3D Point Cloud Classification  ModelNet40     PointTransformer  Mean Accuracy         90.6   # 6
Semantic Segmentation          S3DIS          PointTransformer  Mean IoU              73.5   # 1
Semantic Segmentation          S3DIS          PointTransformer  mAcc                  81.9   # 5
Semantic Segmentation          S3DIS          PointTransformer  oAcc                  90.2   # 1
Semantic Segmentation          S3DIS Area5    PointTransformer  mIoU                  70.4   # 1
Semantic Segmentation          S3DIS Area5    PointTransformer  oAcc                  90.8   # 1
Semantic Segmentation          S3DIS Area5    PointTransformer  mAcc                  76.5   # 1
3D Part Segmentation           ShapeNet-Part  PointTransformer  Class Average IoU     83.7   # 10
3D Part Segmentation           ShapeNet-Part  PointTransformer  Instance Average IoU  86.6   # 5
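
For reference, the mean IoU figures above are averages of per-class intersection-over-union. A minimal sketch of that computation, assuming per-point integer class labels, is below; the function and variable names are illustrative, not tied to any particular benchmark toolkit.

```python
# Mean IoU over semantic classes from predicted and ground-truth per-point labels.
import torch


def mean_iou(pred: torch.Tensor, target: torch.Tensor, num_classes: int) -> float:
    # pred, target: (N,) integer class labels for N points
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = (pred_c | target_c).sum().item()
        if union == 0:
            continue  # class absent from both prediction and ground truth
        inter = (pred_c & target_c).sum().item()
        ious.append(inter / union)
    return sum(ious) / len(ious)
```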

Methods