V-COCO (Verbs in COCO)

Introduced by Gupta et al. in Visual Semantic Role Labeling

Verbs in COCO (V-COCO) is a dataset that builds on COCO for human-object interaction detection. V-COCO provides 10,346 images (2,533 for training, 2,867 for validation, and 4,946 for testing) and 16,199 person instances. Each person is annotated for 29 action categories; there are no interaction labels that include objects.

Source: Visual Compositional Learning for Human-Object Interaction Detection
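
As a quick sanity check on the figures quoted above, the split sizes can be summed and compared against the stated total. This is only an illustrative sketch: the variable names are hypothetical and the numbers are copied from the description, not read from the dataset files.

```python
# Split sizes and person-instance count as quoted in the description
# (assumed values, not loaded from the actual V-COCO annotation files).
splits = {"train": 2533, "val": 2867, "test": 4946}
person_instances = 16199

# The three splits should add up to the stated image total.
total_images = sum(splits.values())

# Share of the dataset taken by each split.
fractions = {name: count / total_images for name, count in splits.items()}

# Average number of annotated people per image.
persons_per_image = person_instances / total_images

for name, count in splits.items():
    print(f"{name}: {count} images ({fractions[name]:.1%})")
print(f"total: {total_images} images, {persons_per_image:.2f} persons per image")
```

Note that the test split is the largest of the three, so it holds more images than the training split.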


