OneFormer3D: One Transformer for Unified Point Cloud Segmentation

24 Nov 2023  ·  Maxim Kolodiazhnyi, Anna Vorontsova, Anton Konushin, Danila Rukhovich

Semantic, instance, and panoptic segmentation of 3D point clouds have so far been addressed with task-specific models of distinct design. As a result, the similarity of these segmentation tasks and the implicit relationship between them have not been exploited effectively. This paper presents a unified, simple, and effective model that addresses all three tasks jointly. The model, named OneFormer3D, performs instance and semantic segmentation consistently, using a group of learnable kernels, where each kernel is responsible for generating a mask for either an instance or a semantic category. These kernels are trained with a transformer-based decoder that takes unified instance and semantic queries as input. This design enables training the model end-to-end in a single run, so that it achieves top performance on all three segmentation tasks simultaneously. Specifically, OneFormer3D ranks 1st and sets a new state of the art (+2.1 mAP50) on the ScanNet test leaderboard. We also demonstrate state-of-the-art results in semantic, instance, and panoptic segmentation on the ScanNet (+21 PQ), ScanNet200 (+3.8 mAP50), and S3DIS (+0.8 mIoU) datasets.
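
The idea that each kernel generates one mask can be illustrated with a minimal sketch. The abstract does not specify how a kernel is applied to the point cloud, so everything below is an assumption made for illustration: the function name `masks_from_kernels`, the use of a dot product between kernels and per-point features, the sigmoid, and the 0.5 threshold are all hypothetical and are not taken from the paper.

```python
import numpy as np


def masks_from_kernels(point_features, kernels, threshold=0.5):
    """Turn K kernels into K binary masks over N points (illustrative only).

    point_features: array of shape (N, D), one feature vector per point.
    kernels:        array of shape (K, D), one kernel per instance or
                    semantic category (assumed to come from a decoder).
    Returns a boolean array of shape (K, N).
    """
    # Assumption: a mask logit is the dot product of a kernel and a point feature.
    logits = kernels @ point_features.T          # shape (K, N)
    probs = 1.0 / (1.0 + np.exp(-logits))        # element-wise sigmoid
    return probs > threshold


# Toy input: 6 points with 4-dimensional features and 3 kernels.
rng = np.random.default_rng(0)
point_features = rng.normal(size=(6, 4))
kernels = rng.normal(size=(3, 4))

masks = masks_from_kernels(point_features, kernels)
print(masks.shape)  # (3, 6): one mask per kernel, one entry per point
```

Under this assumption, instance and semantic outputs differ only in which kernels are read out, which is one way to picture how a single set of kernels could serve several segmentation tasks.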

Results from the Paper

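Task                      Dataset      Model        Metric Name    Metric Value  Global Rank
3D Semantic Segmentation  S3DIS        OneFormer3D  mIoU (6-Fold)  75.0          # 2
3D Semantic Segmentation  S3DIS        OneFormer3D  mIoU (Area-5)  72.4          # 1
3D Instance Segmentation  S3DIS        OneFormer3D  mRec           74.1          # 2
3D Instance Segmentation  S3DIS        OneFormer3D  mPrec          82.3          # 1
3D Instance Segmentation  S3DIS        OneFormer3D  AP@50          75.8          # 1
3D Instance Segmentation  S3DIS        OneFormer3D  mAP            63.0          # 2
Panoptic Segmentation     ScanNet      OneFormer3D  PQ             71.2          # 1
Panoptic Segmentation     ScanNet      OneFormer3D  PQ_th          69.6          # 1
Panoptic Segmentation     ScanNet      OneFormer3D  PQ_st          86.1          # 1
Semantic Segmentation     ScanNet      OneFormer3D  val mIoU       76.6          # 4
3D Semantic Segmentation  ScanNet200   OneFormer3D  val mIoU       30.1          # 5
3D Instance Segmentation  ScanNet(v2)  OneFormer3D  mAP            56.6          # 3
3D Instance Segmentation  ScanNet(v2)  OneFormer3D  mAP@50         80.1          # 1
3D Instance Segmentation  ScanNet(v2)  OneFormer3D  mAP@25         89.6          # 1
3D Object Detection       ScanNetV2    OneFormer3D  mAP@0.25       76.9          # 2
3D Object Detection       ScanNetV2    OneFormer3D  mAP@0.5        65.3          # 2
Panoptic Segmentation     ScanNetV2    OneFormer3D  PQ             71.2          # 1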
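
For readers unfamiliar with the PQ rows, the following sketch shows how panoptic quality is commonly computed: the sum of IoUs over matched segment pairs, divided by |TP| + 0.5·|FP| + 0.5·|FN|, where a predicted and a ground-truth segment count as a match when their IoU exceeds 0.5. PQ_th and PQ_st are usually the same quantity restricted to "thing" and "stuff" classes. The function name `panoptic_quality` and the toy numbers are hypothetical; this is not the evaluation code behind the values reported above, which are given in percent.

```python
def panoptic_quality(matched_ious, num_fp, num_fn):
    """Panoptic quality for one class (illustrative only).

    matched_ious: IoU values of matched (true-positive) segment pairs,
                  each assumed to be greater than 0.5.
    num_fp:       number of predicted segments without a match.
    num_fn:       number of ground-truth segments without a match.
    """
    num_tp = len(matched_ious)
    denominator = num_tp + 0.5 * num_fp + 0.5 * num_fn
    if denominator == 0:
        # No segments at all for this class: define PQ as 0 here.
        return 0.0
    return sum(matched_ious) / denominator


# Toy example: three matches, one false positive, one false negative.
pq = panoptic_quality([0.9, 0.8, 0.7], num_fp=1, num_fn=1)
print(round(pq, 3))  # 0.6, i.e. 2.4 / (3 + 0.5 + 0.5)
```

The overall PQ is then typically the average of the per-class values.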

