Paper

Learning To Generate Scene Graph from Head to Tail

Scene Graph Generation (SGG) represents objects and their interactions with a graph structure. Recently, many works are devoted to solving the imbalanced problem in SGG. However, underestimating the head predicates in the whole training process, they wreck the features of head predicates that provide general features for tail ones. Besides, assigning excessive attention to the tail predicates leads to semantic deviation. Based on this, we propose a novel SGG framework, learning to generate scene graphs from Head to Tail (SGG-HT), containing Curriculum Re-weight Mechanism (CRM) and Semantic Context Module (SCM). CRM learns head/easy samples firstly for robust features of head predicates and then gradually focuses on tail/hard ones. SCM is proposed to relieve semantic deviation by ensuring the semantic consistency between the generated scene graph and the ground truth in global and local representations. Experiments show that SGG-HT significantly alleviates the biased problem and chieves state-of-the-art performances on Visual Genome.

Results in Papers With Code
(↓ scroll down to see all results)