TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Zero-Shot Transfer 3D Point Cloud Classification	ModelNet10	PointCLIP	Accuracy (%)	30.23	# 4
Training-free 3D Point Cloud Classification	ModelNet40	PointCLIP	Accuracy (%)	20.2	# 6
Training-free 3D Point Cloud Classification	ModelNet40	PointCLIP	Need 3D Data?	No	# 1
Zero-Shot Transfer 3D Point Cloud Classification	ModelNet40	PointCLIP	Accuracy (%)	20.18	# 11
Zero-shot 3D Point Cloud Classification	ScanNetV2	PointCLIP	Top 1 Accuracy %	6.3	# 8
Zero-shot 3D Point Cloud Classification	ScanNetV2	PointCLIP w/ TP.	Top 1 Accuracy %	26.1	# 5
Training-free 3D Point Cloud Classification	ScanObjectNN	PointCLIP	Accuracy (%)	15.4	# 5
Training-free 3D Point Cloud Classification	ScanObjectNN	PointCLIP	Need 3D Data?	No	# 1
Zero-Shot Transfer 3D Point Cloud Classification	ScanObjectNN	PointCLIP	PB_T50_RS Accuracy (%)	15.38	# 4
Zero-Shot Transfer 3D Point Cloud Classification	ScanObjectNN	PointCLIP	OBJ_BG Accuracy(%)	21.34	# 4
Zero-Shot Transfer 3D Point Cloud Classification	ScanObjectNN	PointCLIP	OBJ_ONLY Accuracy(%)	19.28	# 8
Training-free 3D Part Segmentation	ShapeNet-Part	PointCLIP	mIoU	31.0	# 3
Training-free 3D Part Segmentation	ShapeNet-Part	PointCLIP	Need 3D Data?	No	# 1
3D Open-Vocabulary Instance Segmentation	STPLS3D	PointCLIP	AP50	2.6	# 3

PointCLIP: Point Cloud Understanding by CLIP

CVPR 2022 · Renrui Zhang, Ziyu Guo, Wei Zhang, Kunchang Li, Xupeng Miao, Bin Cui, Yu Qiao, Peng Gao, Hongsheng Li

Recently, zero-shot and few-shot learning via Contrastive Vision-Language Pre-training (CLIP) has shown inspiring performance on 2D visual recognition by learning to match images with their corresponding texts in open-vocabulary settings. However, it remains underexplored whether CLIP, pre-trained on large-scale image-text pairs in 2D, can be generalized to 3D recognition. In this paper, we show that such a setting is feasible by proposing PointCLIP, which aligns CLIP-encoded point clouds with 3D category texts. Specifically, we encode a point cloud by projecting it into multi-view depth maps without rendering, and aggregate the view-wise zero-shot predictions to achieve knowledge transfer from 2D to 3D. On top of that, we design an inter-view adapter to better extract the global feature and adaptively fuse the few-shot knowledge learned from 3D into CLIP pre-trained in 2D. By fine-tuning only the lightweight adapter in few-shot settings, the performance of PointCLIP improves substantially. In addition, we observe that PointCLIP and classical 3D-supervised networks are complementary. By simple ensembling, PointCLIP boosts the baseline's performance and even surpasses state-of-the-art models. Therefore, PointCLIP is a promising alternative for effective 3D point cloud understanding via CLIP under low resource cost and in low-data regimes. We conduct thorough experiments on the widely adopted ModelNet10 and ModelNet40 and the challenging ScanObjectNN to demonstrate the effectiveness of PointCLIP. The code is released at https://github.com/ZrrSkywalker/PointCLIP.
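
The two zero-shot steps named in the abstract, projecting a point cloud into depth maps and aggregating view-wise predictions, can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation (see the official repository for that). The function names `project_depth_map` and `aggregate_view_logits`, the orthographic projection, the grid size, and the weighted-sum aggregation are all assumptions made for illustration, and the CLIP encoders that would turn each depth map into per-class logits are omitted entirely.

```python
def project_depth_map(points, size=8, axis=2):
    """Orthographically project (x, y, z) points in [-1, 1]^3 onto a size x size grid.

    Hypothetical helper: the viewer is assumed to sit on the positive side of
    `axis`, so a larger coordinate along `axis` means a nearer point. Each
    pixel keeps the value of its nearest point; empty pixels stay 0.0.
    """
    u_axis, v_axis = [a for a in range(3) if a != axis]
    depth = [[0.0] * size for _ in range(size)]
    for p in points:
        # Map coordinates from [-1, 1] to pixel indices in [0, size - 1].
        u = min(size - 1, max(0, int((p[u_axis] + 1.0) / 2.0 * size)))
        v = min(size - 1, max(0, int((p[v_axis] + 1.0) / 2.0 * size)))
        # Nearness in (0, 1]; the small floor keeps occupied pixels non-zero.
        d = max((p[axis] + 1.0) / 2.0, 1e-6)
        depth[v][u] = max(depth[v][u], d)
    return depth


def aggregate_view_logits(view_logits, view_weights=None):
    """Combine per-view class logits into one prediction by a weighted sum.

    Assumption: each entry of `view_logits` is the list of class logits that
    an image-text model produced for one projected view. Uniform weights are
    used when `view_weights` is not given.
    """
    if view_weights is None:
        view_weights = [1.0] * len(view_logits)
    n_classes = len(view_logits[0])
    return [
        sum(w * logits[c] for w, logits in zip(view_weights, view_logits))
        for c in range(n_classes)
    ]


if __name__ == "__main__":
    cloud = [(0.0, 0.0, 1.0), (1.0, 1.0, 0.0), (-1.0, -1.0, -1.0)]
    # One depth map per viewing axis stands in for the multi-view projection.
    views = [project_depth_map(cloud, size=4, axis=a) for a in range(3)]
    print(len(views), "views of size", len(views[0]), "x", len(views[0][0]))
    # Made-up logits for two classes from two views, for illustration only.
    print(aggregate_view_logits([[1.0, 2.0], [3.0, 0.0]], [1.0, 0.5]))
```

Because no rendering is involved, each view costs only a single pass over the points. In a real pipeline the placeholder logits would come from scoring each depth map against the category text prompts.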

Code

zrrskywalker/pointclip official

291

pku-dair/hetu

231

Tasks

3D Open-Vocabulary Instance Segmentation

Few-Shot Learning

Open Vocabulary Object Detection

Training-free 3D Part Segmentation

Training-free 3D Point Cloud Classification

Transfer Learning

Zero-shot 3D Point Cloud Classification

Zero-Shot Transfer 3D Point Cloud Classification

Datasets

ImageNet

ShapeNet

ScanNet

ModelNet

ScanObjectNN

STPLS3D

Results from the Paper

Ranked #3 on 3D Open-Vocabulary Instance Segmentation on STPLS3D

Methods

Adapter • CLIP
