Video scene graph generation
4 papers with code • 1 benchmark • 2 datasets
Most implemented papers
Panoptic Video Scene Graph Generation
PVSG relates to the existing video scene graph generation (VidSGG) problem, which focuses on temporal interactions between humans and objects grounded with bounding boxes in videos.
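At its core, a video scene graph is a set of subject–predicate–object triplets whose entities are grounded per frame (bounding boxes in VidSGG, pixel-level masks in the panoptic setting). A minimal sketch of such a structure — the field names here are illustrative assumptions, not the PVSG annotation format:

```python
from dataclasses import dataclass, field

@dataclass
class Relation:
    subject_id: int    # track id of the subject entity
    predicate: str     # e.g. "holding", "next_to"
    object_id: int     # track id of the object entity
    frame_span: tuple  # (start_frame, end_frame) over which the relation holds

@dataclass
class VideoSceneGraph:
    # entity track id -> per-frame grounding (box or mask reference)
    entities: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)

    def relations_at(self, frame):
        """Return all relation triplets active at a given frame."""
        return [r for r in self.relations
                if r.frame_span[0] <= frame <= r.frame_span[1]]

g = VideoSceneGraph(
    entities={0: "person_track", 1: "cup_track"},
    relations=[Relation(0, "holding", 1, (10, 42))],
)
```

Querying `g.relations_at(20)` returns the `holding` triplet, while frames outside the span return an empty list — the temporal grounding that distinguishes VidSGG from image scene graphs.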
Target Adaptive Context Aggregation for Video Scene Graph Generation
Specifically, we design an efficient method for frame-level VidSGG, termed Target Adaptive Context Aggregation Network (TRACE), with a focus on capturing spatio-temporal context information for relation recognition.
Taking A Closer Look at Visual Relation: Unbiased Video Scene Graph Generation with Decoupled Label Learning
Specifically, DLL decouples the predicate labels and adopts separate classifiers to learn actional and spatial patterns respectively.
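The idea of decoupling predicate labels into groups with separate classifiers can be sketched in a few lines. This toy version is an assumption-laden illustration, not DLL's actual architecture: the predicate lists, linear heads, and score merging are all hypothetical, and it only shows the "one classifier per predicate group" pattern.

```python
import math

# Hypothetical predicate groups (illustrative, not the paper's taxonomy).
ACTIONAL = ["holding", "pushing"]
SPATIAL = ["in_front_of", "behind"]

def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def linear(weights, feat):
    # One logit per class: dot product of each weight row with the feature.
    return [sum(w * f for w, f in zip(row, feat)) for row in weights]

def classify(feat, w_actional, w_spatial):
    """Run two independent heads, then pick the highest-scoring predicate."""
    p_act = softmax(linear(w_actional, feat))
    p_spa = softmax(linear(w_spatial, feat))
    scores = dict(zip(ACTIONAL, p_act))
    scores.update(zip(SPATIAL, p_spa))
    return max(scores, key=scores.get)
```

Because each head normalizes only over its own group, an actional predicate never has to compete with a spatial one inside a single softmax — the separation DLL exploits to reduce label bias.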
Spatial-Temporal Knowledge-Embedded Transformer for Video Scene Graph Generation
In this work, we propose a spatial-temporal knowledge-embedded transformer (STKET) that incorporates the prior spatial-temporal knowledge into the multi-head cross-attention mechanism to learn more representative relationship representations.
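One common way to inject prior knowledge into attention is an additive bias on the attention logits. The single-head sketch below illustrates that general mechanism only; how STKET actually embeds spatial-temporal knowledge into its multi-head cross-attention is more involved, and the `prior_bias` argument here is a simplifying assumption.

```python
import math

def attend(query, keys, values, prior_bias):
    """Scaled dot-product cross-attention with an additive prior bias
    (one bias term per key) on the attention logits."""
    d = len(query)
    logits = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) + b
              for key, b in zip(keys, prior_bias)]
    m = max(logits)
    e = [math.exp(x - m) for x in logits]
    s = sum(e)
    weights = [v / s for v in e]
    # Weighted sum of the value vectors.
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]
```

With zero bias and identical keys the output is the plain average of the values; raising the bias on one key steers the output toward that key's value, which is how a knowledge prior can favor relationships that are plausible a priori.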