no code implementations • 16 Oct 2024 • Aozhu Chen, Hazel Doughty, Xirong Li, Cees G. M. Snoek
We perform comprehensive experiments using four state-of-the-art models across two standard benchmarks (MSR-VTT and VATEX) and two specially curated datasets enriched with detailed descriptions (VLN-UVO and VLN-OOPS), yielding several novel insights: 1) current evaluation benchmarks fall short in detecting a model's ability to perceive subtle single-word differences, and 2) our fine-grained evaluation highlights the difficulty models face in distinguishing such subtle variations.
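As a minimal illustration (not the paper's exact protocol), the probe below checks whether a retrieval model scores a caption above a variant that differs in a single word; `embed_text` and `embed_video` are hypothetical stand-ins for a real model's encoders.

```python
# Minimal sketch of a single-word-difference probe for text-to-video retrieval.
import numpy as np

rng = np.random.default_rng(0)

def embed_text(caption: str) -> np.ndarray:
    # Hypothetical stand-in: replace with the retrieval model's text encoder.
    return rng.normal(size=512)

def embed_video(video_id: str) -> np.ndarray:
    # Hypothetical stand-in: replace with the retrieval model's video encoder.
    return rng.normal(size=512)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def single_word_probe(video_id: str, caption: str, variant: str) -> bool:
    """True if the matching caption outscores its one-word-different variant."""
    v = embed_video(video_id)
    return cosine(embed_text(caption), v) > cosine(embed_text(variant), v)

# Toy probes; with real encoders, accuracy reveals single-word sensitivity.
probes = [
    ("vid_001", "a man opens a door", "a man closes a door"),
    ("vid_002", "a red car turns left", "a blue car turns left"),
]
accuracy = np.mean([single_word_probe(*p) for p in probes])
print(f"single-word probe accuracy: {accuracy:.2f}")
```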
no code implementations • 23 Aug 2024 • Jingyu Liu, Minquan Wang, Ye Ma, Bo Wang, Aozhu Chen, Quan Chen, Peng Jiang, Xirong Li
Previous studies on adding SFX to videos perform video-to-SFX matching at a holistic level and thus cannot add an SFX to a specific moment.
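A minimal sketch of the moment-level alternative, with made-up feature shapes and random inputs standing in for real video and SFX encoders: score each short temporal window against the SFX embedding and return the best-matching moment.

```python
# Minimal sketch of moment-level (rather than holistic) video-to-SFX matching.
import numpy as np

def best_sfx_moment(clip_feats: np.ndarray, sfx_feat: np.ndarray,
                    window: int = 8, fps: float = 1.0) -> tuple[float, float]:
    """clip_feats: (T, D) per-second video features; sfx_feat: (D,) SFX embedding.
    Returns (start_time_seconds, similarity) of the best-matching window."""
    # Normalize so dot products are cosine similarities.
    cf = clip_feats / (np.linalg.norm(clip_feats, axis=1, keepdims=True) + 1e-12)
    sf = sfx_feat / (np.linalg.norm(sfx_feat) + 1e-12)
    scores = []
    for t in range(max(1, cf.shape[0] - window + 1)):
        scores.append(cf[t:t + window].mean(axis=0) @ sf)  # window-level score
    best = int(np.argmax(scores))
    return best / fps, float(scores[best])

# Toy usage with random features in place of real encoders.
rng = np.random.default_rng(0)
start, sim = best_sfx_moment(rng.normal(size=(60, 256)), rng.normal(size=256))
print(f"best moment starts at {start:.0f}s (cosine {sim:.3f})")
```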
no code implementations • 28 Nov 2022 • Xirong Li, Aozhu Chen, Ziyue Wang, Fan Hu, Kaibin Tian, Xinru Chen, Chengbo Dong
The RUCMM team's participation in the 2022 edition of the TRECVID benchmark has again been fruitful.
1 code implementation • 30 Apr 2022 • Ziyue Wang, Aozhu Chen, Fan Hu, Xirong Li
We propose a learning-based method for training a negation-aware video retrieval model.
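One plausible ingredient of such training (not necessarily the paper's exact loss) is to treat a negated version of each caption as a hard negative, as in this sketch:

```python
# Minimal sketch: push a negated caption away from its video with a margin.
import torch
import torch.nn.functional as F

def negation_triplet_loss(video_emb: torch.Tensor,
                          caption_emb: torch.Tensor,
                          negated_emb: torch.Tensor,
                          margin: float = 0.2) -> torch.Tensor:
    """All inputs are (B, D). The negated caption (e.g., "a man NOT riding
    a horse") should score lower against the video than the original."""
    v = F.normalize(video_emb, dim=-1)
    c = F.normalize(caption_emb, dim=-1)
    n = F.normalize(negated_emb, dim=-1)
    pos = (v * c).sum(-1)   # similarity of video and matching caption
    neg = (v * n).sum(-1)   # similarity of video and negated caption
    return F.relu(margin - pos + neg).mean()

# Toy usage with random embeddings standing in for real encoders.
B, D = 4, 512
loss = negation_triplet_loss(torch.randn(B, D), torch.randn(B, D), torch.randn(B, D))
print(loss.item())
```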
1 code implementation • 3 Dec 2021 • Fan Hu, Aozhu Chen, Ziyue Wang, Fangming Zhou, Jianfeng Dong, Xirong Li
In this paper we revisit feature fusion, an old-fashioned topic, in the new context of text-to-video retrieval (a generic fusion sketch follows the leaderboard entry below).
Ranked #1 on Ad-hoc video search on TRECVID-AVS20 (V3C1) (using extra training data)
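A generic sketch of attention-weighted fusion (not necessarily this paper's exact module): project heterogeneous video features to a shared dimension, then combine them with learned attention weights.

```python
# Minimal sketch of attentional fusion over heterogeneous video features.
import torch
import torch.nn as nn

class AttentionalFusion(nn.Module):
    def __init__(self, in_dims: list[int], out_dim: int = 512):
        super().__init__()
        # One linear projection per feature type (e.g., CLIP, 3D-CNN, audio).
        self.proj = nn.ModuleList(nn.Linear(d, out_dim) for d in in_dims)
        self.att = nn.Linear(out_dim, 1)  # scores each projected feature

    def forward(self, feats: list[torch.Tensor]) -> torch.Tensor:
        # feats[i]: (B, in_dims[i]) -> stacked (B, K, out_dim)
        x = torch.stack([p(f) for p, f in zip(self.proj, feats)], dim=1)
        w = torch.softmax(self.att(torch.tanh(x)), dim=1)  # (B, K, 1) weights
        return (w * x).sum(dim=1)  # attention-weighted sum, (B, out_dim)

# Toy usage: fuse three feature types of different dimensionality.
fusion = AttentionalFusion([512, 2048, 1024])
fused = fusion([torch.randn(2, 512), torch.randn(2, 2048), torch.randn(2, 1024)])
print(fused.shape)  # torch.Size([2, 512])
```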
no code implementations • 9 Dec 2020 • Aozhu Chen, Xinyi Huang, Hailan Lin, Xirong Li
For the first scenario, where references are available, we propose two metrics, i.e., WMDRel and CLinRel.
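Assuming WMDRel scores a candidate caption against a reference via Word Mover's Distance (the paper's exact normalization may differ), here is a minimal sketch using gensim's `wmdistance`; CLinRel, which compares captions in a contextualized embedding space, is not sketched here.

```python
# Minimal sketch of a WMD-based relevance score in the spirit of WMDRel.
import gensim.downloader as api

wv = api.load("glove-wiki-gigaword-50")  # small pretrained word vectors

def wmd_relevance(candidate: str, reference: str) -> float:
    # gensim's wmdistance (requires the POT package) returns a value in [0, inf);
    # smaller distance means the candidate is closer to the reference.
    d = wv.wmdistance(candidate.lower().split(), reference.lower().split())
    return 1.0 / (1.0 + d)  # map distance to a (0, 1] relevance score

print(wmd_relevance("a man is riding a horse", "a person rides a horse"))
```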