Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/calip-zero-shot-enhancement-of-clip-with/training-free-3d-point-cloud-classification-1)](https://paperswithcode.com/sota/training-free-3d-point-cloud-classification-1?p=calip-zero-shot-enhancement-of-clip-with)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/calip-zero-shot-enhancement-of-clip-with/training-free-3d-point-cloud-classification)](https://paperswithcode.com/sota/training-free-3d-point-cloud-classification?p=calip-zero-shot-enhancement-of-clip-with)`

CALIP: Zero-Shot Enhancement of CLIP with Parameter-free Attention

28 Sep 2022 · Ziyu Guo, Renrui Zhang, Longtian Qiu, Xianzheng Ma, Xupeng Miao, Xuming He, Bin Cui

Contrastive Language-Image Pre-training (CLIP) has been shown to learn highly transferable visual representations and achieves promising accuracy on zero-shot classification. To further improve its downstream performance, existing works add learnable modules on top of CLIP and fine-tune them on few-shot training sets. However, the resulting extra training cost and data requirement severely hinder the efficiency of model deployment and knowledge transfer. In this paper, we introduce a free-lunch enhancement method, CALIP, to boost CLIP's zero-shot performance via a parameter-free attention module. Specifically, we guide visual and textual representations to interact with each other and explore cross-modal informative features via attention. As the pre-training has largely reduced the embedding distances between the two modalities, we discard all learnable parameters in the attention and bidirectionally update the multi-modal features, making the whole process parameter-free and training-free. In this way, the images are blended with textual-aware signals and the text representations become visual-guided for better adaptive zero-shot alignment. We evaluate CALIP on various benchmarks of 14 datasets for both 2D image and 3D point cloud few-shot classification, showing consistent zero-shot performance improvement over CLIP. Based on that, we further insert a small number of linear layers in CALIP's attention module and verify its robustness under the few-shot settings, where it also achieves leading performance compared to existing methods. These extensive experiments demonstrate the superiority of our approach for efficient enhancement of CLIP.
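
The bidirectional, parameter-free update described in the abstract can be illustrated with a minimal sketch. It is not the official implementation from ziyuguo99/calip. It assumes spatial visual features of shape N x C and class-text features of shape K x C that share one embedding space. All function names, the two temperatures, the mean pooling step, and the three blend weights `betas` are hypothetical choices made only for illustration.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of floats.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    # (n x k) @ (k x m) on nested lists.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def l2_normalize(rows):
    out = []
    for r in rows:
        norm = math.sqrt(sum(x * x for x in r)) or 1.0
        out.append([x / norm for x in r])
    return out

def mean_pool(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def parameter_free_attention(spatial, text, temp_text=1.0, temp_visual=1.0):
    # Attention weights come straight from feature similarity:
    # there are no learnable query/key/value projections.
    attn = matmul(spatial, transpose(text))                        # N x K
    # Each spatial location attends over the text features.
    spatial_upd = matmul(
        [softmax([a / temp_text for a in row]) for row in attn], text)       # N x C
    # Each text feature attends over the spatial locations.
    text_upd = matmul(
        [softmax([a / temp_visual for a in row]) for row in transpose(attn)],
        spatial)                                                   # K x C
    return spatial_upd, text_upd

def zero_shot_logits(global_visual, spatial, text, betas=(1.0, 1.0, 1.0)):
    # Blend the plain CLIP-style similarity with the two attention-updated
    # similarities. The blend weights are assumed, not taken from the paper.
    spatial, text = l2_normalize(spatial), l2_normalize(text)
    fv = l2_normalize([global_visual])
    spatial_upd, text_upd = parameter_free_attention(spatial, text)
    fv_upd = l2_normalize([mean_pool(spatial_upd)])
    text_upd = l2_normalize(text_upd)
    base = matmul(fv, transpose(text))[0]
    text_guided = matmul(fv, transpose(text_upd))[0]
    visual_guided = matmul(fv_upd, transpose(text))[0]
    b1, b2, b3 = betas
    return [b1 * x + b2 * y + b3 * z
            for x, y, z in zip(base, text_guided, visual_guided)]

# Toy example: three spatial locations, two classes, two channels.
spatial = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
text = [[1.0, 0.0], [0.0, 1.0]]
logits = zero_shot_logits(mean_pool(spatial), spatial, text)
predicted_class = logits.index(max(logits))
```

Because the attention weights are computed directly from the pre-trained features, nothing in this sketch is trained. The temperatures and blend weights are the only free settings.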

Code

ziyuguo99/calip official

Tasks

Training-free 3D Point Cloud Classification

Transfer Learning

Zero-Shot Learning

Datasets

ImageNet

UCF101

ModelNet

Oxford 102 Flower

Stanford Cars

DTD

Food-101

Caltech-101

EuroSAT

FGVC-Aircraft

ScanObjectNN

Results from the Paper

Ranked #4 on Training-free 3D Point Cloud Classification on ScanObjectNN (using extra training data)

Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Training-free 3D Point Cloud Classification	ModelNet40	CALIP	Accuracy (%)	21.5	# 5
Training-free 3D Point Cloud Classification	ModelNet40	CALIP	Need 3D Data?	No	# 1
Training-free 3D Point Cloud Classification	ScanObjectNN	CALIP	Accuracy (%)	16.9	# 4
Training-free 3D Point Cloud Classification	ScanObjectNN	CALIP	Need 3D Data?	No	# 1

Methods

CLIP
