AAAI-2020 2020

Learning Cross-modal Context Graph for Visual Grounding

youngfly11/LCMCG-PyTorch

To address the limitations of prior approaches, this paper proposes a language-guided graph representation that captures the global context of grounding entities and their relations, and develops a cross-modal graph matching strategy for the multiple-phrase visual grounding task.
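As a rough illustration of what matching a phrase graph against a region graph can mean, the toy sketch below scores every assignment of phrases to image regions by node similarity plus relation compatibility and keeps the best one. It is not the authors' method: the function `match_graphs`, the hand-written feature vectors, and the exhaustive search are all assumptions made for illustration, whereas a real system would use learned features and a scalable matching procedure.

```python
from itertools import product


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def match_graphs(phrase_feats, region_feats, phrase_edges, region_rel):
    """Assign each phrase node to one region node by exhaustive search.

    All inputs are hypothetical toy structures:
      phrase_feats: list of phrase embedding vectors
      region_feats: list of region feature vectors in the same space
      phrase_edges: list of (i, j, rel_vec) edges between phrase nodes
      region_rel:   dict mapping (a, b) region pairs to relation vectors
    """
    best_score, best_assign = float("-inf"), None
    for assign in product(range(len(region_feats)), repeat=len(phrase_feats)):
        # Node term: how well each phrase matches its assigned region.
        score = sum(
            dot(phrase_feats[i], region_feats[r]) for i, r in enumerate(assign)
        )
        # Edge term: how well each phrase relation matches the relation
        # between the two assigned regions (if that pair has one).
        for i, j, rel in phrase_edges:
            pair = (assign[i], assign[j])
            if pair in region_rel:
                score += dot(rel, region_rel[pair])
        if score > best_score:
            best_score, best_assign = score, assign
    return list(best_assign), best_score


# Toy example: "a man" riding "a horse", with two horse-like regions.
phrases = [[1.0, 0.0], [0.0, 1.0]]              # man, horse
regions = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]  # man, horse A, horse B
edges = [(0, 1, [1.0])]                         # man --riding--> horse
relations = {(0, 2): [1.0]}                     # region 0 rides region 2

assignment, score = match_graphs(phrases, regions, edges, relations)
print(assignment, score)
```

In this toy case the two horse regions are indistinguishable by appearance alone, so only the relation term selects region 2 for "a horse". That is the intuition behind grounding multiple phrases jointly rather than one at a time.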

GRAPH MATCHING, LANGUAGE MODELLING, NATURAL LANGUAGE VISUAL GROUNDING, PHRASE GROUNDING