Zero-Shot Learning
562 papers with code • 18 benchmarks • 29 datasets
Zero-shot learning (ZSL) is the task of recognizing classes that were never seen during training: no labeled examples of the target classes are available at supervised training time.
Earlier work in zero-shot learning used attributes in a two-step approach to infer unknown classes. In the computer vision context, more recent advances learn mappings from image feature space to semantic space; other approaches learn non-linear multimodal embeddings. In the modern NLP context, language models can be evaluated on downstream tasks without any fine-tuning, as sketched below.
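To make the classic recipe concrete, here is a minimal sketch of attribute-based zero-shot classification (a generic illustration, not any particular paper's method): fit a linear map from image features to class attribute vectors on the seen classes, then assign an unseen class by nearest attribute vector. All dimensions and data below are synthetic placeholders.

```python
# Minimal embedding-based ZSL sketch: ridge regression from image features to
# an attribute space, then nearest-neighbor search over unseen class attributes.
# Everything here (dimensions, data) is a synthetic placeholder.
import numpy as np

rng = np.random.default_rng(0)
d_feat, d_attr, n_train = 512, 85, 500  # e.g., CNN features, AwA-style attributes

S_seen = rng.standard_normal((40, d_attr))    # attribute vectors, 40 seen classes
S_unseen = rng.standard_normal((10, d_attr))  # attribute vectors, 10 unseen classes
X_train = rng.standard_normal((n_train, d_feat))
y_train = rng.integers(0, 40, n_train)

# Fit W: feature space -> attribute space (ridge regression on seen classes).
A_train = S_seen[y_train]                     # per-sample target attribute vectors
lam = 1.0
W = np.linalg.solve(X_train.T @ X_train + lam * np.eye(d_feat), X_train.T @ A_train)

def predict_unseen(x):
    """Project an image feature into attribute space; return the nearest
    unseen class by cosine similarity."""
    a = x @ W
    sims = (S_unseen @ a) / (np.linalg.norm(S_unseen, axis=1) * np.linalg.norm(a) + 1e-8)
    return int(np.argmax(sims))

print(predict_unseen(rng.standard_normal(d_feat)))
```

On the NLP side, zero-shot classification is commonly implemented by recasting the task as textual entailment with a pre-trained NLI model; the sketch below uses the Hugging Face transformers pipeline, with arbitrary example labels.

```python
# Zero-shot text classification via an NLI model (no task-specific fine-tuning).
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
result = classifier(
    "The team released a new open-source vision-language model today.",
    candidate_labels=["sports", "technology", "politics"],  # arbitrary examples
)
print(result["labels"][0], result["scores"][0])
```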
Benchmark datasets for zero-shot learning include aPY, AwA, and CUB, among others.
Libraries
Use these libraries to find Zero-Shot Learning models and implementations.
Latest papers
Comprehensive Evaluation and Insights into the Use of Large Language Models in the Automation of Behavior-Driven Development Acceptance Test Formulation
Behavior-driven development (BDD) is an Agile testing methodology fostering collaboration among developers, QA analysts, and stakeholders.
Less but Better: Enabling Generalized Zero-shot Learning Towards Unseen Domains by Intrinsic Learning from Redundant LLM Semantics
Different from existing GZSL methods, which alleviate DSP by generating features of unseen classes from semantics, CDGZSL constructs a common feature space across domains and acquires the corresponding intrinsic semantics shared among domains to transfer from seen to unseen domains.
RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition
Notably, our approach demonstrates a significant improvement in performance on 5 fine-grained visual recognition benchmarks, 11 few-shot image recognition datasets, and 2 object detection datasets under the zero-shot recognition setting.
CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation
Given a set of initial queries, class-agnostic mask generation employs a transformer decoder to predict query masks and corresponding object scores and mask IoU scores.
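The snippet above follows a common query-based segmentation pattern. As a hedged illustration of that pattern only (not the CLIP-VIS implementation; all module names and dimensions below are invented), a class-agnostic mask head might look like:

```python
# Made-up sketch of class-agnostic mask generation with learnable queries:
# a transformer decoder refines queries against pixel features, and linear
# heads predict per-query object scores and mask IoU scores.
import torch
import torch.nn as nn

class QueryMaskHead(nn.Module):
    def __init__(self, d_model=256, n_queries=100):
        super().__init__()
        self.queries = nn.Embedding(n_queries, d_model)
        layer = nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=3)
        self.obj_score = nn.Linear(d_model, 1)  # objectness per query
        self.iou_score = nn.Linear(d_model, 1)  # predicted mask IoU per query

    def forward(self, pixel_feats):
        # pixel_feats: (B, d_model, H, W) from an image encoder
        B, C, H, W = pixel_feats.shape
        memory = pixel_feats.flatten(2).transpose(1, 2)         # (B, H*W, C)
        q = self.queries.weight.unsqueeze(0).expand(B, -1, -1)  # (B, Q, C)
        q = self.decoder(q, memory)                             # refined queries
        # Each query's mask is a dot product with per-pixel embeddings.
        masks = torch.einsum("bqc,bchw->bqhw", q, pixel_feats)
        return masks, self.obj_score(q).squeeze(-1), self.iou_score(q).squeeze(-1)

head = QueryMaskHead()
masks, obj, iou = head(torch.randn(2, 256, 32, 32))
print(masks.shape, obj.shape, iou.shape)  # (2,100,32,32) (2,100) (2,100)
```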
Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models
Advancements in vision-language models (VLMs) have propelled the field of computer vision, particularly in the zero-shot learning setting.
Eye-gaze Guided Multi-modal Alignment Framework for Radiology
Additionally, we explore the impact of varying amounts of eye-gaze data on model performance, highlighting the feasibility and utility of integrating this auxiliary data into multi-modal pre-training.
Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters
Continual learning can empower vision-language models to continuously acquire new knowledge, without the need for access to the entire historical dataset.
Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs
Prompt ensembling of Large Language Model (LLM)-generated category-specific prompts has emerged as an effective method to enhance the zero-shot recognition ability of Vision-Language Models (VLMs).
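As a generic sketch of prompt ensembling with CLIP (the underlying technique the snippet refers to, not this paper's meta-prompting method; the class names and templates below are illustrative):

```python
# Prompt ensembling for zero-shot image classification with CLIP: average the
# normalized text embedding of each class over several prompt templates.
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

classes = ["zebra", "giraffe", "otter"]  # illustrative class names
templates = ["a photo of a {}.", "a blurry photo of a {}.", "a drawing of a {}."]

with torch.no_grad():
    class_embs = []
    for name in classes:
        prompts = [t.format(name) for t in templates]
        inputs = processor(text=prompts, return_tensors="pt", padding=True)
        emb = model.get_text_features(**inputs)
        emb = emb / emb.norm(dim=-1, keepdim=True)
        class_embs.append(emb.mean(dim=0))  # ensemble over templates
    text_bank = torch.stack(class_embs)
    text_bank = text_bank / text_bank.norm(dim=-1, keepdim=True)

# At inference time, score an image against the ensembled class embeddings:
# image_emb = model.get_image_features(**processor(images=img, return_tensors="pt"))
# logits = (image_emb / image_emb.norm(dim=-1, keepdim=True)) @ text_bank.T
```

Averaging normalized text embeddings over several templates typically yields a more robust per-class prototype than any single prompt.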
CoLeCLIP: Open-Domain Continual Learning via Joint Task Prompt and Vocabulary Learning
Large pre-trained VLMs like CLIP have demonstrated superior zero-shot recognition ability, and a number of recent studies leverage this ability to mitigate catastrophic forgetting in CL, but they focus on closed-set CL on a single-domain dataset.
OpenGraph: Open-Vocabulary Hierarchical 3D Graph Representation in Large-Scale Outdoor Environments
In this work, we propose OpenGraph, the first open-vocabulary hierarchical graph representation designed for large-scale outdoor environments.