Image-Sentence Alignment

2 papers with code • 12 benchmarks • 1 dataset

Predict an alignment score between an image and a sentence.
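
As a rough illustration of what an alignment score can look like, the sketch below assumes the image and the sentence have already been encoded as fixed-length vectors in a shared embedding space, and scores the pair by cosine similarity. The function name `alignment_score` and the toy vectors are hypothetical. Neither paper listed here is confirmed to score pairs this way, so treat this as one common option rather than the method of either work.

```python
import math
from typing import Sequence


def alignment_score(image_vec: Sequence[float], sentence_vec: Sequence[float]) -> float:
    """Cosine similarity between an image embedding and a sentence embedding.

    Assumption: both vectors come from encoders that map into the same space.
    Returns a value in [-1, 1]; higher means better aligned.
    """
    if len(image_vec) != len(sentence_vec):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(image_vec, sentence_vec))
    norm_image = math.sqrt(sum(a * a for a in image_vec))
    norm_sentence = math.sqrt(sum(b * b for b in sentence_vec))
    if norm_image == 0.0 or norm_sentence == 0.0:
        # A zero vector carries no direction, so report no alignment.
        return 0.0
    return dot / (norm_image * norm_sentence)


# Toy embeddings (made up for illustration only).
image = [0.2, 0.9, 0.1]
matching_sentence = [0.25, 0.8, 0.05]
unrelated_sentence = [0.9, -0.1, 0.3]

print(alignment_score(image, matching_sentence))
print(alignment_score(image, unrelated_sentence))
```

With these toy vectors, the matching sentence scores higher than the unrelated one. A real system would replace the hand-written vectors with the outputs of trained image and text encoders.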


Most implemented papers

Removing Word-Level Spurious Alignment between Images and Pseudo-Captions in Unsupervised Image Captioning

ukyh/RemovingSpuriousAlignment EACL 2021

Unsupervised image captioning is a challenging task that aims at generating captions without the supervision of image-sentence pairs, but only with images and sentences drawn from different sources and object labels detected from the images.

VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena

heidelberg-nlp/valse ACL 2022

We propose VALSE (Vision And Language Structured Evaluation), a novel benchmark designed for testing general-purpose pretrained vision and language (V&L) models for their visio-linguistic grounding capabilities on specific linguistic phenomena.