Referring Self-supervised Learning on 3D Point Cloud

29 Sep 2021  ·  Runnan Chen, Xinge Zhu, Nenglun Chen, Dawei Wang, Wei Li, Yuexin Ma, Ruigang Yang, Wenping Wang ·

After observing one type of object, humans can easily recognize similar objects in an unseen scene. However, this generalization ability remains underexplored for neural networks. In this paper, we study a new problem, Referring Self-supervised Learning (RSL), for 3D scene understanding: given labeled synthetic 3D models and unlabeled real 3D scene scans, the goal is to identify objects of the same semantic category in an unseen scene by referring to the synthetic 3D models. Unlike existing tasks, RSL studies how to transfer a network's knowledge from 3D models to unseen 3D scenes, where the main challenge is bridging the cross-scene, cross-domain, and cross-task gap between the referring synthetic models and the real unseen scenes. To this end, we propose a simple yet effective self-supervised framework that performs two alignment operations. First, physical alignment uses data-processing techniques to make the referring models match the scene. Then, convex-hull regularized feature alignment introduces learnable prototypes and projects the point features of both the referring models and the real scene into a convex-hull space, where each feature is a convex combination of the learned prototypes; this regularization eases the alignment. Experiments show that our method achieves an average mAP of 55.32% on the ScanNet dataset by referring only to synthetic models from the ModelNet dataset. Furthermore, it can serve as a pretext task to improve performance on downstream 3D scene understanding tasks.
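The sketch below illustrates the convex-hull regularized feature alignment idea from the abstract: point features are re-expressed as convex combinations (non-negative weights summing to one) of a shared set of learnable prototypes. It is a minimal PyTorch illustration under assumed shapes and a softmax-based weighting; the class name, dimensions, and projection details are illustrative guesses, not the authors' released implementation.

```python
# Minimal sketch of convex-hull regularized feature alignment via learnable
# prototypes. All names and dimensions are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvexHullProjection(nn.Module):
    """Project point features onto the convex hull spanned by learnable prototypes.

    Each output feature is a convex combination of the shared prototypes, so
    features from the referring synthetic models and from the real scene end up
    in the same prototype-spanned space.
    """

    def __init__(self, feat_dim: int = 256, num_prototypes: int = 64):
        super().__init__()
        # Prototypes shared by the synthetic-model and real-scene branches.
        self.prototypes = nn.Parameter(torch.randn(num_prototypes, feat_dim))

    def forward(self, point_feats: torch.Tensor) -> torch.Tensor:
        # point_feats: (N, feat_dim) per-point features from either domain.
        # Softmax over prototype similarities yields convex-combination weights.
        weights = F.softmax(point_feats @ self.prototypes.t(), dim=-1)  # (N, K)
        # Convex combination of prototypes: the regularized, aligned feature.
        return weights @ self.prototypes  # (N, feat_dim)


if __name__ == "__main__":
    proj = ConvexHullProjection(feat_dim=256, num_prototypes=64)
    feats = torch.randn(1024, 256)   # e.g., per-point features of one scan
    aligned = proj(feats)
    print(aligned.shape)             # torch.Size([1024, 256])
```

Because the weights lie on a simplex, both domains are constrained to the same convex region of feature space, which is the regularization the abstract says eases cross-domain alignment.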
