Text-to-Image Retrieval

2 papers with code • 3 benchmarks • 1 dataset

Greatest papers with code

ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision

dandelin/vilt • 5 Feb 2021

Vision-and-Language Pre-training (VLP) has improved performance on various joint vision-and-language downstream tasks.

Image-to-Text Retrieval, Text-to-Image Retrieval, +2 more

ZSCRGAN: A GAN-based Expectation Maximization Model for Zero-Shot Retrieval of Images from Textual Descriptions

ranarag/ZSCRGAN • 23 Jul 2020

Most existing algorithms for cross-modal Information Retrieval are based on a supervised train-test setup, where a model learns to align the mode of the query (e.g., text) to the mode of the documents (e.g., images) from a given training set.

Cross-Modal Information Retrieval Image Retrieval +3
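
As a rough illustration of the task this page covers, text-to-image retrieval is commonly framed as ranking images by how close their embeddings are to the embedding of a text query in a shared space. The sketch below assumes that framing; it is not the method of either paper listed above. The vectors are hand-made placeholders standing in for the output of some trained text and image encoders, and the names `cosine`, `retrieve`, `image_vecs`, and `query_vec` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(query_vec, image_vecs, k=2):
    """Return the k (image name, score) pairs most similar to the query."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in image_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Placeholder embeddings (assumption): in practice these would come from
# an image encoder and a text encoder trained to share one space.
image_vecs = {
    "dog.jpg": [0.9, 0.1, 0.0],
    "cat.jpg": [0.1, 0.9, 0.0],
    "car.jpg": [0.0, 0.1, 0.9],
}
query_vec = [0.8, 0.2, 0.1]  # stands in for an embedded text query

top = retrieve(query_vec, image_vecs, k=2)
for name, score in top:
    print(f"{name}: {score:.3f}")
```

With these placeholder vectors the query ranks `dog.jpg` first and `cat.jpg` second. In a supervised setup of the kind the ZSCRGAN abstract describes, the encoders producing such vectors would be learned from paired text and image training data.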