TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
3D Point Cloud Linear Classification	ModelNet40	I2P-MAE	Overall Accuracy	93.4	# 2
Few-Shot 3D Point Cloud Classification	ModelNet40 10-way (10-shot)	I2P-MAE	Overall Accuracy	92.6	# 9
Few-Shot 3D Point Cloud Classification	ModelNet40 10-way (10-shot)	I2P-MAE	Standard Deviation	5.0	# 19
Few-Shot 3D Point Cloud Classification	ModelNet40 10-way (20-shot)	I2P-MAE	Overall Accuracy	95.5	# 8
Few-Shot 3D Point Cloud Classification	ModelNet40 10-way (20-shot)	I2P-MAE	Standard Deviation	3.0	# 10
Few-Shot 3D Point Cloud Classification	ModelNet40 5-way (10-shot)	I2P-MAE	Overall Accuracy	97.0	# 7
Few-Shot 3D Point Cloud Classification	ModelNet40 5-way (10-shot)	I2P-MAE	Standard Deviation	1.8	# 4
Few-Shot 3D Point Cloud Classification	ModelNet40 5-way (20-shot)	I2P-MAE	Overall Accuracy	98.3	# 6
Few-Shot 3D Point Cloud Classification	ModelNet40 5-way (20-shot)	I2P-MAE	Standard Deviation	1.3	# 6
3D Point Cloud Classification	ScanObjectNN	I2P-MAE (no voting)	Overall Accuracy	90.11	# 12
3D Point Cloud Classification	ScanObjectNN	I2P-MAE (no voting)	OBJ-BG (OA)	94.15	# 7
3D Point Cloud Classification	ScanObjectNN	I2P-MAE (no voting)	OBJ-ONLY (OA)	91.57	# 9
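
The few-shot rows above pair an Overall Accuracy with a Standard Deviation, which suggests each setting is summarized over several independent runs. A minimal sketch of that arithmetic follows. The function name `summarize_runs`, the use of the population standard deviation, and the per-run accuracies are all assumptions for illustration; they are not values or choices taken from the paper.

```python
import statistics


def summarize_runs(accuracies):
    """Return (mean, std) of per-run accuracies, rounded to one decimal.

    Assumption: population standard deviation (pstdev). The table does not
    say whether the population or sample formula was used.
    """
    mean = statistics.fmean(accuracies)
    std = statistics.pstdev(accuracies)
    return round(mean, 1), round(std, 1)


# Hypothetical per-run accuracies (in percent), NOT results from the paper.
example_runs = [90.0, 95.0, 100.0, 95.0]
summary = summarize_runs(example_runs)
```

Switching `statistics.pstdev` to `statistics.stdev` gives the sample standard deviation instead, which is slightly larger for a small number of runs.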

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/learning-3d-representations-from-2d-pre/3d-point-cloud-linear-classification-on)](https://paperswithcode.com/sota/3d-point-cloud-linear-classification-on?p=learning-3d-representations-from-2d-pre)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/learning-3d-representations-from-2d-pre/few-shot-3d-point-cloud-classification-on-2)](https://paperswithcode.com/sota/few-shot-3d-point-cloud-classification-on-2?p=learning-3d-representations-from-2d-pre)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/learning-3d-representations-from-2d-pre/few-shot-3d-point-cloud-classification-on-1)](https://paperswithcode.com/sota/few-shot-3d-point-cloud-classification-on-1?p=learning-3d-representations-from-2d-pre)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/learning-3d-representations-from-2d-pre/few-shot-3d-point-cloud-classification-on-4)](https://paperswithcode.com/sota/few-shot-3d-point-cloud-classification-on-4?p=learning-3d-representations-from-2d-pre)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/learning-3d-representations-from-2d-pre/few-shot-3d-point-cloud-classification-on-3)](https://paperswithcode.com/sota/few-shot-3d-point-cloud-classification-on-3?p=learning-3d-representations-from-2d-pre)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/learning-3d-representations-from-2d-pre/3d-point-cloud-classification-on-scanobjectnn)](https://paperswithcode.com/sota/3d-point-cloud-classification-on-scanobjectnn?p=learning-3d-representations-from-2d-pre)`

Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders

CVPR 2023 · Renrui Zhang, Liuhui Wang, Yu Qiao, Peng Gao, Hongsheng Li

Pre-training on large-scale image data has become the de facto approach for learning robust 2D representations. In contrast, because 3D data is expensive to acquire and annotate, the scarcity of large-scale 3D datasets severely hinders the learning of high-quality 3D features. In this paper, we propose an alternative that obtains superior 3D representations from 2D pre-trained models via Image-to-Point Masked Autoencoders, named I2P-MAE. Through self-supervised pre-training, we leverage well-learned 2D knowledge to guide 3D masked autoencoding, which reconstructs the masked point tokens with an encoder-decoder architecture. Specifically, we first use off-the-shelf 2D models to extract multi-view visual features of the input point cloud, and then apply two image-to-point learning schemes on top. First, we introduce a 2D-guided masking strategy that keeps semantically important point tokens visible to the encoder. Compared to random masking, this lets the network concentrate on significant 3D structures and recover the masked tokens from key spatial cues. Second, we require these visible tokens to reconstruct the corresponding multi-view 2D features after the decoder. This enables the network to inherit high-level 2D semantics learned from rich image data for discriminative 3D modeling. Aided by our image-to-point pre-training, the frozen I2P-MAE, without any fine-tuning, achieves 93.4% accuracy with a linear SVM on ModelNet40, competitive with the fully trained results of existing methods. After further fine-tuning on ScanObjectNN's hardest split, I2P-MAE attains a state-of-the-art 90.11% accuracy, +3.68% over the second-best method, demonstrating superior transfer capacity. Code will be available at https://github.com/ZrrSkywalker/I2P-MAE.
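
The 2D-guided masking idea in the abstract can be illustrated with a small sketch. It assumes each point token already has a scalar saliency score derived from the 2D features, and it keeps the highest-scoring tokens visible with a deterministic top-k rule. Both the function name `guided_mask` and the top-k rule are simplifying assumptions; the abstract does not specify how scores are computed or how visible tokens are actually selected.

```python
def guided_mask(saliency, mask_ratio=0.8):
    """Split token indices into (visible, masked) using per-token saliency.

    Assumption: `saliency[i]` is a hypothetical importance score for token i.
    The most salient tokens stay visible to the encoder; the rest are masked.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    num_tokens = len(saliency)
    num_masked = round(num_tokens * mask_ratio)
    num_visible = max(1, num_tokens - num_masked)
    # Rank token indices from most to least salient.
    ranked = sorted(range(num_tokens), key=lambda i: saliency[i], reverse=True)
    visible = sorted(ranked[:num_visible])
    masked = sorted(ranked[num_visible:])
    return visible, masked


# Hypothetical scores for six point tokens.
visible_idx, masked_idx = guided_mask([0.1, 0.9, 0.3, 0.8, 0.2, 0.7], mask_ratio=0.5)
```

With random masking, `ranked` would be a random permutation instead; the guided variant differs only in that the ordering comes from the 2D-derived scores.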


Code


zrrskywalker/i2p-mae official

197

zrrskywalker/point-m2ae

184

Tasks


3D Point Cloud Classification

3D Point Cloud Linear Classification

Few-Shot 3D Point Cloud Classification

Datasets

ShapeNet

ModelNet

ScanObjectNN

Results from the Paper


Ranked #2 on 3D Point Cloud Linear Classification on ModelNet40 (using extra training data)



Methods


SVM
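
The linear-SVM evaluation mentioned in the abstract fits a linear classifier on features from the frozen encoder. The sketch below shows the general shape of that step with a tiny binary linear SVM trained by batch subgradient descent on the regularized hinge loss. The solver, the hyperparameters, and the 2D toy features are assumptions for illustration; the paper's actual SVM implementation and settings are not given here, and a 40-class benchmark would need a multi-class scheme such as one-vs-rest.

```python
def train_linear_svm(features, labels, reg=0.01, lr=0.1, epochs=200):
    """Fit a binary linear SVM (labels in {-1, +1}) on fixed feature vectors."""
    dim = len(features[0])
    n = len(features)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        # Gradient of the L2 regularizer.
        grad_w = [reg * wi for wi in w]
        grad_b = 0.0
        for x, y in zip(features, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            # Hinge loss contributes only when the margin is violated.
            if y * score < 1.0:
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0.0 else -1


# Hypothetical frozen features (stand-ins for encoder outputs) and labels.
toy_features = [[2.0, 2.0], [3.0, 2.0], [2.0, 3.0],
                [-2.0, -2.0], [-3.0, -2.0], [-2.0, -3.0]]
toy_labels = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(toy_features, toy_labels)
predictions = [predict(w, b, x) for x in toy_features]
```

Only `w` and `b` are trained here; the features are treated as constants, which mirrors the "frozen, without any fine-tuning" setting.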
