Social Fabric: Tubelet Compositions for Video Relation Detection

ICCV 2021 · Shuo Chen, Zenglin Shi, Pascal Mettes, Cees G. M. Snoek

This paper strives to classify and detect the relationship between object tubelets appearing within a video as a <subject-predicate-object> triplet. Where existing works treat object proposals or tubelets as single entities and model their relations a posteriori, we propose to classify and detect predicates for pairs of object tubelets a priori. We also propose Social Fabric: an encoding that represents a pair of object tubelets as a composition of interaction primitives. These primitives are learned over all relations, resulting in a compact representation able to localize and classify relations from the pool of co-occurring object tubelets across all timespans in a video. The encoding enables our two-stage network. In the first stage, we train Social Fabric to suggest proposals that are likely interacting. We use the Social Fabric in the second stage to simultaneously fine-tune and predict predicate labels for the tubelets. Experiments demonstrate the benefit of early video relation modeling, our encoding and the two-stage architecture, leading to a new state-of-the-art on two benchmarks. We also show how the encoding enables query-by-primitive-example to search for spatio-temporal video relations. Code: https://github.com/shanshuo/Social-Fabric.
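The abstract describes representing a pair of object tubelets as a composition of interaction primitives learned over all relations. A minimal sketch of that idea is a NetVLAD-style soft assignment of per-frame pair features to a bank of primitive embeddings, aggregated into one compact vector. All names, dimensions, and the exact aggregation below are illustrative assumptions, not the paper's formulation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): T frames of a tubelet pair,
# D-dim pair features, K learned interaction primitives.
T, D, K = 10, 64, 8

pair_features = rng.standard_normal((T, D))  # per-frame features of one tubelet pair
primitives = rng.standard_normal((K, D))     # primitive embeddings (random stand-ins)

def encode_pair(features, primitives):
    """Soft-assign each frame feature to the primitives and aggregate over
    time (a NetVLAD-style sketch of a primitive composition, not the
    paper's exact Social Fabric encoder)."""
    logits = features @ primitives.T               # (T, K) frame-primitive similarity
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over primitives
    # Weighted residuals summed over time, then flattened to one vector.
    residuals = features[:, None, :] - primitives[None, :, :]  # (T, K, D)
    encoding = (weights[:, :, None] * residuals).sum(axis=0)   # (K, D)
    return encoding.reshape(-1)                                # (K * D,)

code = encode_pair(pair_features, primitives)
print(code.shape)  # (512,)
```

The resulting fixed-length vector is what makes a two-stage design convenient: the same encoding can score whether a tubelet pair is likely interacting (stage one) and feed a predicate classifier (stage two).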

Task                             Dataset          Model          Metric        Value   Rank
Video Visual Relation Detection  ImageNet-VidVRD  Social Fabric  Recall@50     13.73   #1
                                                                 Recall@100    16.88   #1
                                                                 mAP           20.08   #1
Video Visual Relation Tagging    ImageNet-VidVRD  Social Fabric  Precision@1   62.5    #1
                                                                 Precision@5   49.2    #1
                                                                 Precision@10  38.45   #1
Video Visual Relation Detection  VidOR            Social Fabric  Recall@50     9.99    #1
                                                                 Recall@100    11.94   #1
                                                                 mAP           11.21   #1
Video Visual Relation Tagging    VidOR            Social Fabric  Precision@1   68.86   #1
                                                                 Precision@5   55.16   #1
