Semantic Image-Text Similarity
1 papers with code • 1 benchmarks • 3 datasets
This task has no description! Would you like to contribute one?
Most implemented papers
Learning the Best Pooling Strategy for Visual Semantic Embedding
Visual Semantic Embedding (VSE) is a dominant approach for vision-language retrieval, which aims at learning a deep embedding space such that visual data are embedded close to their semantic text labels or descriptions.