Scene Graph Generation
130 papers with code • 5 benchmarks • 7 datasets
A scene graph is a structured representation of an image in which nodes correspond to object bounding boxes with their object categories, and edges correspond to the pairwise relationships between those objects. The task of Scene Graph Generation is to produce a visually grounded scene graph that describes the image as accurately as possible.
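The structure described above can be sketched as a small data type. This is a minimal illustration, not the format of any particular dataset or library: the class names (`ObjectNode`, `RelationEdge`, `SceneGraph`), the `(x_min, y_min, x_max, y_max)` box convention, and the example labels are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical box convention for this sketch: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class ObjectNode:
    """A node: one object bounding box plus its category label."""
    label: str
    box: Box


@dataclass(frozen=True)
class RelationEdge:
    """A directed edge: subject node index, predicate, object node index."""
    subject: int
    predicate: str
    object: int


@dataclass
class SceneGraph:
    nodes: List[ObjectNode] = field(default_factory=list)
    edges: List[RelationEdge] = field(default_factory=list)

    def add_node(self, label: str, box: Box) -> int:
        """Append a node and return its index."""
        self.nodes.append(ObjectNode(label, box))
        return len(self.nodes) - 1

    def add_edge(self, subject: int, predicate: str, obj: int) -> None:
        """Append a relationship between two existing nodes."""
        for index in (subject, obj):
            if not 0 <= index < len(self.nodes):
                raise IndexError(f"no node with index {index}")
        self.edges.append(RelationEdge(subject, predicate, obj))

    def triplets(self) -> List[Tuple[str, str, str]]:
        """Return the edges as (subject label, predicate, object label)."""
        return [
            (self.nodes[e.subject].label, e.predicate, self.nodes[e.object].label)
            for e in self.edges
        ]


# Illustrative example with made-up boxes and labels.
graph = SceneGraph()
man = graph.add_node("man", (40.0, 20.0, 120.0, 200.0))
horse = graph.add_node("horse", (30.0, 90.0, 220.0, 260.0))
hat = graph.add_node("hat", (60.0, 10.0, 100.0, 40.0))
graph.add_edge(man, "riding", horse)
graph.add_edge(man, "wearing", hat)

print(graph.triplets())
```

Edges are stored as index pairs rather than label pairs so that two objects of the same category (for example, two people) remain distinct nodes, each grounded in its own box.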
Libraries
Use these libraries to find Scene Graph Generation models and implementations
Latest papers with no code
CYCLO: Cyclic Graph Transformer Approach to Multi-Object Relationship Modeling in Aerial Videos
In this paper, we introduce the new AeroEye dataset that focuses on multi-object relationship modeling in aerial videos.
Scene Graph Generation Strategy with Co-occurrence Knowledge and Learnable Term Frequency
Scene graph generation (SGG) is an important task in image understanding because it represents the relationships between objects in an image as a graph structure, making it possible to understand the semantic relationships between objects intuitively.
A Review and Efficient Implementation of Scene Graph Generation Metrics
Scene graph generation has emerged as a prominent research field in computer vision, witnessing significant advancements in recent years.
Tri-modal Confluence with Temporal Dynamics for Scene Graph Generation in Operating Rooms
A comprehensive understanding of surgical scenes allows for monitoring of the surgical process, reducing the occurrence of accidents and enhancing efficiency for medical professionals.
AUG: A New Dataset and An Efficient Model for Aerial Image Urban Scene Graph Generation
To fill in the gap of the overhead view dataset, this paper constructs and releases an aerial image urban scene graph generation (AUG) dataset.
Weakly-Supervised 3D Scene Graph Generation via Visual-Linguistic Assisted Pseudo-labeling
However, previous 3D scene graph generation methods utilize a fully supervised learning manner and require a large amount of entity-level annotation data of objects and relations, which is extremely resource-consuming and tedious to obtain.
Improving Scene Graph Generation with Relation Words' Debiasing in Vision-Language Models
After that, we ensemble VLMs with SGG models to enhance representation.
R3CD: Scene Graph to Image Generation with Relation-aware Compositional Contrastive Control Diffusion
Image generation tasks have achieved remarkable performance using large-scale diffusion models.
Mapping High-level Semantic Regions in Indoor Environments without Object Recognition
Robots require a semantic understanding of their surroundings to operate in an efficient and explainable way in human environments.
Towards Lifelong Scene Graph Generation with Knowledge-ware In-context Prompt Learning
Besides, extensive experiments on the two mainstream benchmark datasets, VG and Open-Image(v6), show the superiority of our proposed model to a number of competitive SGG models in terms of continuous learning and conventional settings.