CAILA: Concept-Aware Intra-Layer Adapters for Compositional Zero-Shot Learning

26 May 2023 · Zhaoheng Zheng, Haidong Zhu, Ram Nevatia

In this paper, we study Compositional Zero-Shot Learning (CZSL), the task of recognizing novel attribute-object combinations built from pre-existing concepts. Recent work applies large-scale Vision-Language Pre-trained (VLP) models such as CLIP, which offer strong generalization ability. However, these methods treat the pre-trained model as a black box and concentrate on pre- and post-CLIP operations, leaving the semantic concepts encoded between CLIP's internal layers unexploited. We instead dive into the architecture and insert adapters, a parameter-efficient technique proven effective for large language models, into each CLIP encoder layer. We further equip the adapters with concept awareness so that concept-specific features for "object", "attribute", and "composition" can be extracted. We evaluate our method on four popular CZSL datasets, MIT-States, C-GQA, UT-Zappos, and VAW-CZSL, and achieve state-of-the-art performance on all of them.
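
To make the adapter idea concrete, below is a minimal PyTorch sketch, not the authors' implementation: a standard bottleneck adapter (down-project, nonlinearity, up-project, residual) placed inside an encoder layer, with one adapter per concept so that attribute-, object-, and composition-specific features can be extracted from the same hidden states. Module names, dimensions, and the bottleneck size are illustrative assumptions.

```python
# Hypothetical sketch of a concept-aware intra-layer adapter; not the paper's exact code.
import torch
import torch.nn as nn


class Adapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project, plus residual."""

    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))


class ConceptAwareAdapters(nn.Module):
    """One adapter per concept, so a single encoder layer can emit
    attribute-, object-, and composition-specific features."""

    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.adapters = nn.ModuleDict({
            concept: Adapter(dim, bottleneck)
            for concept in ("attribute", "object", "composition")
        })

    def forward(self, x: torch.Tensor, concept: str) -> torch.Tensor:
        return self.adapters[concept](x)


if __name__ == "__main__":
    # Toy usage: hidden states from one CLIP encoder layer (batch, tokens, dim).
    hidden = torch.randn(2, 77, 512)
    adapters = ConceptAwareAdapters(dim=512)
    attr_feat = adapters(hidden, "attribute")    # attribute-specific features
    obj_feat = adapters(hidden, "object")        # object-specific features
    comp_feat = adapters(hidden, "composition")  # composition-specific features
    print(attr_feat.shape, obj_feat.shape, comp_feat.shape)
```

Only the adapter parameters would be trained in such a setup; the frozen pre-trained encoder weights are left untouched, which is what makes the approach parameter-efficient.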


Task: Compositional Zero-Shot Learning
Dataset: MIT-States, generalized split
Model: CAILA

Metric            Value   Global Rank
H-Mean            39.9    #1
Seen accuracy     51.0    #1
Test AUC top 1    23.4    #1
Test AUC top 2    -       #2
Test AUC top 3    -       #2
Unseen accuracy   53.9    #1
Val AUC top 1     -       #2
Val AUC top 2     -       #2
Val AUC top 3     -       #2
