1 code implementation • 9 Mar 2021 • Jayaprakash A, abhishek, Rishabh Dabral, Ganesh Ramakrishnan, Preethi Jyothi
Video retrieval using natural language queries requires learning semantically meaningful joint embeddings between the text and the audio-visual input.
Ranked #1 on Video Retrieval on Charades-STA