LAION-400M is a dataset of 400 million CLIP-filtered image-text pairs, together with their CLIP embeddings and kNN indices that enable efficient similarity search.
133 PAPERS • 1 BENCHMARK
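The similarity search the kNN indices enable can be illustrated with a minimal brute-force sketch. This is not the dataset's actual index format (LAION ships prebuilt approximate-NN indices); the arrays below are hypothetical stand-ins for CLIP vectors, and only the scoring idea — L2-normalize, then take dot products — is shown.

```python
import numpy as np

# Hypothetical toy vectors standing in for precomputed CLIP embeddings.
rng = np.random.default_rng(0)
image_embeddings = rng.normal(size=(1000, 512)).astype(np.float32)
query_embedding = rng.normal(size=(512,)).astype(np.float32)

def normalize(x):
    # CLIP similarity is cosine similarity: unit-normalize first.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

db = normalize(image_embeddings)
q = normalize(query_embedding)

scores = db @ q                   # cosine similarity of query vs. each image
top_k = np.argsort(-scores)[:5]   # indices of the 5 nearest neighbours
```

At 400M scale this exhaustive scan is impractical, which is why the released kNN indices use approximate nearest-neighbour search instead.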
Crisscrossed Captions (CxC) contains 247,315 human-labeled annotations, including positive and negative associations between image pairs, caption pairs, and image-caption pairs.
21 PAPERS • 3 BENCHMARKS
The data used in:
- "Radio Galaxy Zoo EMU: Towards a Semantic Radio Galaxy Morphology Taxonomy" (Bowles et al., submitted)
- "A New Task: Deriving Semantic Class Targets for the Physical Sciences" (Bowles et al. 2022: https://arxiv.org/abs/2210.14760), accepted at the Fifth Workshop on Machine Learning and the Physical Sciences, Neural Information Processing Systems 2022.
1 PAPER • NO BENCHMARKS YET