VRD (Visual Relationship Detection dataset)

Introduced by Lu et al. in Visual Relationship Detection with Language Priors

The Visual Relationship Dataset (VRD) contains 4,000 training images and 1,000 test images annotated with visual relationships. Bounding boxes are labeled with unary predicates drawn from a vocabulary of 100, covering animals, vehicles, clothes and generic objects. Pairs of bounding boxes are labeled with binary predicates drawn from a vocabulary of 70, covering actions, prepositions, spatial relations, comparatives and prepositional phrases. The dataset has 37,993 instances of visual relationships spanning 6,672 relationship types. Of these, 1,877 relationship instances occur only in the test set and are used to evaluate the zero-shot learning scenario.

Source: Compensating Supervision Incompleteness with Prior Knowledge in Semantic Image Interpretation
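
The annotation scheme described above can be pictured as subject-predicate-object triples over labeled boxes. The following is a minimal sketch, not the dataset's official file format or loader: the `Relationship` class, the box layout `(x_min, y_min, x_max, y_max)`, the `zero_shot_instances` helper and the toy annotations are all assumptions made for illustration. It shows one way the zero-shot subset could be derived, by keeping test instances whose triple type never appears in training.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical box layout: (x_min, y_min, x_max, y_max) in pixels.
BBox = Tuple[int, int, int, int]


@dataclass(frozen=True)
class Relationship:
    """One annotated relationship instance (illustrative structure only)."""
    subject_label: str   # unary predicate of the first box
    subject_box: BBox
    predicate: str       # binary predicate linking the two boxes
    object_label: str    # unary predicate of the second box
    object_box: BBox

    def triple(self) -> Tuple[str, str, str]:
        """Relationship type, ignoring box coordinates."""
        return (self.subject_label, self.predicate, self.object_label)


def zero_shot_instances(train: List[Relationship],
                        test: List[Relationship]) -> List[Relationship]:
    """Return test instances whose triple type never occurs in training."""
    seen_types = {r.triple() for r in train}
    return [r for r in test if r.triple() not in seen_types]


# Toy annotations, invented for this example (not taken from VRD).
train = [
    Relationship("person", (10, 20, 110, 220), "ride", "horse", (30, 120, 230, 320)),
    Relationship("person", (5, 5, 60, 200), "wear", "hat", (15, 5, 50, 30)),
]
test = [
    Relationship("person", (0, 0, 80, 180), "ride", "horse", (10, 90, 200, 300)),
    Relationship("person", (12, 8, 90, 210), "ride", "elephant", (0, 100, 300, 400)),
]

unseen = zero_shot_instances(train, test)
print([r.triple() for r in unseen])  # [('person', 'ride', 'elephant')]
```

Comparing triple types rather than individual instances is the design choice here: a test instance counts as zero-shot only if its whole subject-predicate-object combination is new, even when each label on its own was seen during training.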

Benchmarks

Task	Dataset Variant	Best Model
Visual Relationship Detection	VRD Relationship Detection	Yu et al. [Yu et al. 2017a]
Visual Relationship Detection	VRD Predicate Detection	Yu et al. [Yu et al. 2017a]
Visual Relationship Detection	VRD Phrase Detection	Yu et al. [Yu et al. 2017a]
Scene Graph Generation	VRD	FactorizableNet
Scene Graph Detection	VRD	LimLabel
Visual Relationship Detection	VRD	Ours - v
Relationship Detection	VRD	Ours - l