1 code implementation • CVPR 2022 • Satya Krishna Gorti, Noel Vouitsis, Junwei Ma, Keyvan Golestan, Maksims Volkovs, Animesh Garg, Guangwei Yu
Instead, texts often capture sub-regions of entire videos and are most semantically similar to certain frames within videos.
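The observation above can be illustrated with a minimal sketch. It is a toy example under stated assumptions, not the method proposed in the paper: the embeddings are made-up 3-dimensional vectors, and the helper names (`cosine`, `mean_pool`, `text_weighted_pool`) and the `temperature` parameter are hypothetical. It compares a text embedding against individual frame embeddings, then contrasts a text-agnostic mean over frames with a pooling that weights frames by their similarity to the text.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_pool(frames):
    """Text-agnostic pooling: average all frame embeddings."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]

def text_weighted_pool(text, frames, temperature=0.1):
    """Pool frames with softmax weights over text-frame similarity.

    `temperature` is an illustrative assumption; lower values
    concentrate the weight on the most similar frames.
    """
    sims = [cosine(text, f) for f in frames]
    m = max(sims)  # subtract the max for numerical stability
    exps = [math.exp((s - m) / temperature) for s in sims]
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = [sum(w * x for w, x in zip(weights, col)) for col in zip(*frames)]
    return pooled, weights

# Made-up embeddings: only the second frame matches the text.
text = [1.0, 0.0, 0.0]
frames = [
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
]

frame_sims = [cosine(text, f) for f in frames]
best_frame = max(range(len(frames)), key=lambda i: frame_sims[i])

mean_sim = cosine(text, mean_pool(frames))
pooled, weights = text_weighted_pool(text, frames)
weighted_sim = cosine(text, pooled)

print("per-frame similarity:", frame_sims)
print("most similar frame index:", best_frame)
print("similarity after mean pooling:", mean_sim)
print("similarity after text-weighted pooling:", weighted_sim)
```

In this toy setup, averaging all frames dilutes the one relevant frame, while weighting frames by their similarity to the text keeps the pooled video embedding close to it.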
Ranked #15 on Video Retrieval on LSMDC (using extra training data)