Visual Relationship Detection
31 papers with code • 5 benchmarks • 5 datasets
Visual relationship detection (VRD) is a recently developed computer vision task that aims to recognize relations or interactions between objects in an image. It builds on object recognition and is essential for fully understanding images and, more broadly, the visual world.
These leaderboards are used to track progress in Visual Relationship Detection
Most implemented papers
Graphical Contrastive Losses for Scene Graph Parsing
The first, Entity Instance Confusion, occurs when the model confuses multiple instances of the same type of entity (e.g. multiple cups).
Exploring Long Tail Visual Relationship Recognition with Large Vocabulary
We use these benchmarks to study the performance of several state-of-the-art long-tail models on the LTVRR setup.
Compensating Supervision Incompleteness with Prior Knowledge in Semantic Image Interpretation
This requires the detection of visual relationships: triples (subject, relation, object) describing a semantic relation between a subject and an object.
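The (subject, relation, object) triple described above can be sketched as a minimal data structure. This is an illustrative assumption, not an implementation from any of the listed papers; the class, field names, and box format are hypothetical:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RelationshipTriple:
    # A hypothetical minimal representation of a visual relationship:
    # a (subject, relation, object) triple plus illustrative bounding
    # boxes given as (x, y, width, height) tuples.
    subject: str           # e.g. "person"
    relation: str          # e.g. "riding"
    obj: str               # e.g. "horse" (named obj to avoid shadowing object)
    subject_box: tuple     # (x, y, w, h) of the subject region
    object_box: tuple      # (x, y, w, h) of the object region

def triples_with_relation(triples, relation):
    """Return all triples that use the given relation label."""
    return [t for t in triples if t.relation == relation]

detections = [
    RelationshipTriple("person", "riding", "horse",
                       (10, 20, 50, 100), (5, 60, 120, 80)),
    RelationshipTriple("person", "wearing", "hat",
                       (10, 20, 50, 100), (15, 10, 20, 15)),
]

riding = triples_with_relation(detections, "riding")
```

A scene graph, as discussed in several papers below, is essentially a collection of such triples over all detected objects in one image.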
Representing Prior Knowledge Using Randomly Weighted Feature Networks for Visual Relationship Detection
Furthermore, background knowledge represented by RWFNs can be used to alleviate the incompleteness of training sets even though the space complexity of RWFNs is much smaller than LTNs (1:27 ratio).
Phrase Localization and Visual Relationship Detection with Comprehensive Image-Language Cues
This paper presents a framework for localization or grounding of phrases in images using a large collection of linguistic and visual cues.
Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection
To capture such global interdependency, we propose a deep Variation-structured Reinforcement Learning (VRL) framework to sequentially discover object relationships and attributes in the whole image.
Towards Context-Aware Interaction Recognition for Visual Relationship Detection
The proposed method still builds one classifier for one interaction (as per type (ii) above), but the classifier built is adaptive to context via weights which are context dependent.
Visual relationship detection with deep structural ranking
In this paper, we propose a novel framework, called Deep Structural Ranking, for visual relationship detection.
Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation
Generating a scene graph to describe all the relations inside an image has gained increasing interest in recent years.
The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale
We present Open Images V4, a dataset of 9.2M images with unified annotations for image classification, object detection and visual relationship detection.