SpatialVOC2K: A Multilingual Dataset of Images with Annotations and Features for Spatial Relations between Objects

WS 2018 · Anja Belz, Adrian Muscat, Pierre Anguill, Mouhamadou Sow, Gaëtan Vincent, Yassine Zinessabah

We present SpatialVOC2K, the first multilingual image dataset with spatial relation annotations and object features for image-to-text generation, built using 2,026 images from the PASCAL VOC2008 dataset. The dataset incorporates (i) the labelled object bounding boxes from VOC2008, (ii) geometrical, language and depth features for each object, and (iii) for each pair of objects in both orders, (a) the single best preposition and (b) the set of possible prepositions in the given language that describe the spatial relationship between the two objects...
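
The annotation layout described in the abstract can be pictured with a minimal Python sketch. Everything named here is hypothetical: the class names, field names, bounding-box convention, language code and example prepositions are illustrative assumptions, not the dataset's actual file format or contents. The sketch only shows the idea that each image carries labelled boxes, and that every pair of objects is annotated in both orders with one best preposition plus a set of acceptable ones.

```python
from dataclasses import dataclass, field
from itertools import permutations


@dataclass
class ObjectAnnotation:
    """One labelled object in an image (hypothetical layout)."""
    label: str
    bbox: tuple  # assumed convention: (xmin, ymin, xmax, ymax) in pixels


@dataclass
class PairAnnotation:
    """Spatial relation for one ORDERED pair of objects (hypothetical layout)."""
    first: ObjectAnnotation
    second: ObjectAnnotation
    language: str                      # illustrative language code
    best_preposition: str              # the single best preposition
    possible_prepositions: set = field(default_factory=set)


def ordered_pairs(objects):
    """Return every pair of distinct objects in both orders."""
    return list(permutations(objects, 2))


# Illustrative values only; not taken from the dataset.
person = ObjectAnnotation("person", (10, 20, 110, 220))
dog = ObjectAnnotation("dog", (120, 150, 200, 220))

pairs = ordered_pairs([person, dog])

annotations = [
    PairAnnotation(a, b, language="en",
                   best_preposition="next to",
                   possible_prepositions={"next to", "near"})
    for a, b in pairs
]
```

With two objects, `ordered_pairs` yields two entries, (person, dog) and (dog, person), which is why a structure like this needs a separate record per direction: the best preposition for one order need not be the best for the reverse.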
