2 dataset results for Zero-Shot Composed Image Retrieval (ZS-CIR) AND Videos

MS COCO (Microsoft Common Objects in Context)

The MS COCO (Microsoft Common Objects in Context) dataset is a large-scale dataset for object detection, segmentation, keypoint detection, and captioning. It consists of 328K images.

10,222 PAPERS • 93 BENCHMARKS

WebVid-CoVR

The WebVid-CoVR dataset is a collection of video-text-video triplets for the task of composed video retrieval (CoVR). CoVR involves searching for videos that match both a query image and a query text, where the text typically specifies the desired modification to the query image.

2 PAPERS • 1 BENCHMARK
