no code implementations • 28 Feb 2023 • Xianglong Lang, Zhuming Wang, Zun Li, Meng Tian, Ge Shi, Lifang Wu, Liang Wang
Specifically, the framework consists of a Visual Representation Module that extracts individual appearance features, a Knowledge Augmented Semantic Relation Module that explores semantic representations of individual actions, and a Knowledge-Semantic-Visual Interaction Module that integrates visual and semantic information using the knowledge.
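
To make the data flow between the three modules concrete, the following is a minimal sketch and not the authors' implementation (the listing reports no released code). Everything in it is an assumption chosen for illustration: the function names, the L2 normalisation standing in for visual feature extraction, the `knowledge` lookup table, the scalar `gate` used for fusion, and the mean pooling over individuals.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def visual_representation(appearances: Sequence[Vector]) -> List[Vector]:
    # Stand-in for the Visual Representation Module (assumption):
    # L2-normalise each individual's hypothetical appearance vector.
    features = []
    for vec in appearances:
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0  # avoid dividing by zero
        features.append([x / norm for x in vec])
    return features


def semantic_relation(actions: Sequence[str],
                      knowledge: Dict[str, Vector]) -> List[Vector]:
    # Stand-in for the Knowledge Augmented Semantic Relation Module (assumption):
    # look up a hypothetical knowledge embedding for each individual's action label.
    return [list(knowledge[action]) for action in actions]


def interaction(visual: Sequence[Vector],
                semantic: Sequence[Vector],
                gate: float) -> Vector:
    # Stand-in for the Knowledge-Semantic-Visual Interaction Module (assumption):
    # blend the two views per individual, then mean-pool over individuals.
    fused = [
        [gate * v + (1.0 - gate) * s for v, s in zip(vis, sem)]
        for vis, sem in zip(visual, semantic)
    ]
    count = len(fused)
    return [sum(column) / count for column in zip(*fused)]


def run_framework(appearances: Sequence[Vector],
                  actions: Sequence[str],
                  knowledge: Dict[str, Vector],
                  gate: float = 0.5) -> Vector:
    # Chain the three stand-in modules in the order the text describes.
    visual = visual_representation(appearances)
    semantic = semantic_relation(actions, knowledge)
    return interaction(visual, semantic, gate)


if __name__ == "__main__":
    toy_knowledge = {"walk": [1.0, 0.0], "talk": [0.0, 1.0]}  # hypothetical embeddings
    print(run_framework([[3.0, 4.0], [0.0, 2.0]], ["walk", "talk"], toy_knowledge))
```

The sketch only mirrors the order of the modules (visual features, knowledge-based semantic features, then fusion). In a learned model each stand-in would be replaced by a trained network, and the fusion would presumably be learned as well instead of being a fixed scalar gate.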