Zero-Shot Learning
561 papers with code • 18 benchmarks • 29 datasets
Zero-shot learning (ZSL) is a model's ability to recognize classes never seen during training: the test-time label set includes classes for which no labeled examples were available during supervised learning.
Earlier work in zero-shot learning used attributes in a two-step approach to infer unseen classes. In the computer vision context, more recent advances learn mappings from image feature space to a semantic space; other approaches learn non-linear multimodal embeddings. In the modern NLP context, language models can be evaluated on downstream tasks without fine-tuning.
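The attribute-based recipe above can be sketched in a few lines: project an image feature into a shared attribute (semantic) space, then assign the unseen class whose attribute vector is most similar. This is a minimal illustration, assuming a pretrained image encoder and an already-learned linear projection `W`; all names, dimensions, and attribute values here are made up for the example, not taken from any specific paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Class-attribute matrix for unseen classes (e.g. "striped", "aquatic", "legged").
# Rows: unseen classes; columns: attribute values annotated per class.
unseen_attrs = np.array([
    [1.0, 0.0, 1.0],   # hypothetical class 0, e.g. "zebra"
    [0.0, 1.0, 0.0],   # hypothetical class 1, e.g. "whale"
])

# Stand-in for a projection learned on seen classes:
# maps a 5-d image feature into the 3-d attribute space.
W = rng.normal(size=(3, 5))

def predict_unseen(image_feat: np.ndarray) -> int:
    """Project the image feature into attribute space, then return the
    index of the unseen class with the highest cosine similarity."""
    sem = W @ image_feat
    sims = unseen_attrs @ sem / (
        np.linalg.norm(unseen_attrs, axis=1) * np.linalg.norm(sem) + 1e-9
    )
    return int(np.argmax(sims))

feat = rng.normal(size=5)      # stand-in for an encoded test image
pred = predict_unseen(feat)    # index into the unseen-class list
```

In practice `W` is trained on seen classes (e.g. by ridge regression from features to attributes), and the same nearest-attribute rule then transfers to classes with no training images.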
Benchmark datasets for zero-shot learning include aPY, AwA, and CUB, among others.
(Image credit: Prototypical Networks for Few-shot Learning in PyTorch)
Latest papers with no code
Bootstrapping Chest CT Image Understanding by Distilling Knowledge from X-ray Expert Models
In this paper, we explore the feasibility of leveraging language as a naturally high-quality supervision for chest CT imaging.
Towards Large Language Model driven Reference-less Translation Evaluation for English and Indian Languages
We constructed a translation evaluation task where we performed zero-shot learning, in-context example-driven learning, and fine-tuning of large language models to provide a score out of 100, where 100 represents a perfect translation and 1 represents a poor translation.
Diffusion based Zero-shot Medical Image-to-Image Translation for Cross Modality Segmentation
To leverage generative learning for zero-shot cross-modality image segmentation, we propose a novel unsupervised image translation method.
Training-Free Semantic Segmentation via LLM-Supervision
Additionally, we propose an assembly that merges the segmentation maps from the various subclass descriptors to ensure a more comprehensive representation of the different aspects in the test images.
VP3D: Unleashing 2D Visual Prompt for Text-to-3D Generation
In this work, we introduce a novel Visual Prompt-guided text-to-3D diffusion model (VP3D) that explicitly unleashes the visual appearance knowledge in 2D visual prompt to boost text-to-3D generation.
HierCode: A Lightweight Hierarchical Codebook for Zero-shot Chinese Text Recognition
Text recognition, especially for complex scripts like Chinese, faces unique challenges due to its intricate character structures and vast vocabulary.
MEDBind: Unifying Language and Multimodal Medical Data Embeddings
Medical vision-language pretraining models (VLPM) have achieved remarkable progress in fusing chest X-rays (CXR) with clinical texts, introducing image-text data binding approaches that enable zero-shot learning and downstream clinical tasks.
Audio-Visual Compound Expression Recognition Method based on Late Modality Fusion and Rule-based Decision
Our findings from the challenge demonstrate that the proposed method can potentially form a basis for developing intelligent tools for annotating audio-visual data in the context of humans' basic and compound emotions.
UniBind: LLM-Augmented Unified and Balanced Representation Space to Bind Them All
To make this possible, we 1) construct a knowledge base of text embeddings with the help of LLMs and multi-modal LLMs; 2) adaptively build LLM-augmented class-wise embedding center on top of the knowledge base and encoded visual embeddings; 3) align all the embeddings to the LLM-augmented embedding center via contrastive learning to achieve a unified and balanced representation space.
Entity6K: A Large Open-Domain Evaluation Dataset for Real-World Entity Recognition
Open-domain real-world entity recognition is essential yet challenging, involving identifying various entities in diverse environments.