1 code implementation • 11 May 2024 • Jinglin Xu, Sibo Yin, Guohao Zhao, Zishuo Wang, Yuxin Peng
We argue that a fine-grained understanding of actions requires the model to perceive and parse actions in both time and space, which is also key to the credibility and interpretability of action quality assessment (AQA) techniques.