no code implementations • CVPR 2022 • Xun Jiang, Xing Xu, Jingran Zhang, Fumin Shen, Zuo Cao, Heng Tao Shen
Video event grounding aims to retrieve the most relevant moments from an untrimmed video given a natural language query.
no code implementations • 27 Aug 2019 • Jingran Zhang, Fumin Shen, Xing Xu, Heng Tao Shen
It extracts complementary information from the different modalities through a connection block, which explores correlations between the features of the different streams.
Ranked #15 on Action Recognition on HMDB-51 (using extra training data)
no code implementations • 27 Aug 2019 • Jingran Zhang, Fumin Shen, Xing Xu, Heng Tao Shen
In this paper, we propose an efficient temporal reasoning graph (TRG) to simultaneously capture appearance features and temporal relations between video sequences at multiple time scales.
Ranked #53 on Action Recognition on Something-Something V1