no code implementations • 23 Oct 2023 • Adeel Yousaf, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan, Mubarak Shah
Consistent improvements across multiple benchmarks and with various VLMs demonstrate the effectiveness of our proposed framework.
Ranked #2 on Video-Text Retrieval on Test-of-Time
no code implementations • 25 Aug 2023 • Tristan de Blegiers, Ishan Rajendrakumar Dave, Adeel Yousaf, Mubarak Shah
Recognizing and comprehending human actions and gestures is a crucial perception requirement for robots to interact with humans and carry out tasks in diverse domains, including service robotics, healthcare, and manufacturing.