A scene graph is a structured representation of an image in which nodes correspond to object bounding boxes with their object categories, and edges correspond to pairwise relationships between objects. The task of Scene Graph Generation is to generate a visually grounded scene graph that most accurately corresponds to the image.
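The definition above can be sketched as a small data structure. This is a minimal illustration only: the class names `ObjectNode` and `SceneGraph`, the `(x1, y1, x2, y2)` box convention, and the example coordinates are assumptions made for this sketch, not taken from any of the models listed below.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class ObjectNode:
    """A detected object: a category label plus a bounding box (x1, y1, x2, y2)."""
    label: str
    box: tuple[float, float, float, float]


@dataclass
class SceneGraph:
    """Nodes are objects; edges are (subject index, predicate, object index)."""
    nodes: list[ObjectNode] = field(default_factory=list)
    edges: list[tuple[int, str, int]] = field(default_factory=list)

    def add_object(self, label: str, box: tuple[float, float, float, float]) -> int:
        """Append an object node and return its index."""
        self.nodes.append(ObjectNode(label, box))
        return len(self.nodes) - 1

    def add_relation(self, subj: int, predicate: str, obj: int) -> None:
        """Add a directed, labeled edge between two existing nodes."""
        if not (0 <= subj < len(self.nodes) and 0 <= obj < len(self.nodes)):
            raise IndexError("relation refers to an unknown object")
        self.edges.append((subj, predicate, obj))

    def triplets(self) -> list[tuple[str, str, str]]:
        """Return edges as (subject label, predicate, object label) triplets."""
        return [(self.nodes[s].label, p, self.nodes[o].label) for s, p, o in self.edges]


# Hypothetical example image content: one person walking on a beach.
graph = SceneGraph()
person = graph.add_object("human", (40.0, 30.0, 120.0, 210.0))
beach = graph.add_object("beach", (0.0, 150.0, 320.0, 240.0))
graph.add_relation(person, "walk on", beach)
print(graph.triplets())  # [('human', 'walk on', 'beach')]
```

Storing edges as index triples keeps each relation tied to a specific detected box, which is what makes the graph visually grounded rather than a list of free-floating label triplets.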
We propose a novel scene graph generation model called Graph R-CNN that is both effective and efficient at detecting objects and their relations in images.
Ranked #5 on Scene Graph Generation on Visual Genome
Today's scene graph generation (SGG) task is still far from practical, mainly due to severe training bias, e.g., collapsing diverse "human walk on / sit on / lay on beach" into "human on beach".
Ranked #1 on Scene Graph Generation on Visual Genome
We propose to compose dynamic tree structures that place the objects in an image into a visual context, helping visual reasoning tasks such as scene graph generation and visual Q&A.
Ranked #4 on Scene Graph Generation on Visual Genome
GRAPH GENERATION, SCENE GRAPH GENERATION, VISUAL QUESTION ANSWERING, VISUAL REASONING
Object detection, scene graph generation, and region captioning, three scene understanding tasks at different semantic levels, are tied together: scene graphs are generated on top of the objects detected in an image, with their pairwise relationships predicted, while region captioning gives a language description of the objects, their attributes, relations, and other context information.
Ranked #1 on Object Detection on Visual Genome
GRAPH GENERATION, OBJECT DETECTION, SCENE GRAPH GENERATION, SCENE UNDERSTANDING
Generating a scene graph to describe all the relations in an image has attracted increasing interest in recent years.
Ranked #1 on Scene Graph Generation on VRD
GRAPH GENERATION, SCENE GRAPH GENERATION, VISUAL RELATIONSHIP DETECTION
More specifically, we show that the statistical correlations between objects appearing in images and their relationships can be explicitly represented by a structured knowledge graph, and a routing mechanism is learned to propagate messages through the graph to explore their interactions.
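As a rough illustration of what "statistical correlations between objects and their relationships" could look like, the sketch below estimates how often each predicate occurs for a given (subject, object) category pair. It covers only the statistics, not the learned routing mechanism, and it is not the authors' implementation: the function name `predicate_prior` and the toy annotations are assumptions made for this example.

```python
from collections import Counter, defaultdict

# Toy relationship annotations (hypothetical, for illustration only).
triplets = [
    ("human", "on", "beach"),
    ("human", "on", "beach"),
    ("human", "walk on", "beach"),
    ("human", "ride", "horse"),
]


def predicate_prior(triplets):
    """Estimate P(predicate | subject category, object category) from counts."""
    counts = defaultdict(Counter)
    for subj, pred, obj in triplets:
        counts[(subj, obj)][pred] += 1

    prior = {}
    for pair, pred_counts in counts.items():
        total = sum(pred_counts.values())
        prior[pair] = {pred: n / total for pred, n in pred_counts.items()}
    return prior


prior = predicate_prior(triplets)
for pair, dist in prior.items():
    print(pair, dist)
```

In this toy data the generic predicate "on" dominates the ("human", "beach") pair, which is the kind of skew such category-level statistics expose.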
Machine understanding of complex images is a key goal of artificial intelligence.
SCENE GRAPH CLASSIFICATION, SCENE GRAPH GENERATION, STRUCTURED PREDICTION
In this work, we explicitly model the objects and their relationships using scene graphs, a visually-grounded graphical structure of an image.
We introduce dense relational captioning, a novel image captioning task which aims to generate multiple captions with respect to relational information between objects in a visual scene.
GRAPH GENERATION, RELATIONAL CAPTIONING, SCENE GRAPH GENERATION