The MS COCO (Microsoft Common Objects in Context) dataset is a large-scale dataset for object detection, segmentation, keypoint detection, and captioning. It consists of 328K images.
10,184 PAPERS • 93 BENCHMARKS
The Flickr30k dataset contains 31,000 images collected from Flickr, each paired with 5 reference sentences provided by human annotators.
735 PAPERS • 9 BENCHMARKS
A Zero-Shot Sketch-based Inter-Modal Object Retrieval Scheme for Remote Sensing Images
1 PAPER • NO BENCHMARKS YET