no code implementations • 22 Nov 2024 • S. P. Sharan, Minkyu Choi, Sahil Shah, Harsh Goel, Mohammad Omama, Sandeep Chinchali
As these models become prevalent, various metrics and benchmarks have emerged to evaluate the quality of the generated videos.
no code implementations • 15 Nov 2024 • Po-han Li, Yunhao Yang, Mohammad Omama, Sandeep Chinchali, Ufuk Topcu
Autonomous agents perceive and interpret their surroundings by integrating multimodal inputs, such as vision, audio, and LiDAR.
no code implementations • 9 Oct 2024 • Mohammad Omama, Po-han Li, Sandeep P. Chinchali
To address efficiency, we introduce Single-shot Similarity Space Distillation ((SS)$_2$D), a novel approach for learning adaptive-size embeddings that offers a better trade-off between embedding size and performance.
2 code implementations • 16 Mar 2024 • Minkyu Choi, Harsh Goel, Mohammad Omama, Yunhao Yang, Sahil Shah, Sandeep Chinchali
The unprecedented surge in video data production in recent years necessitates efficient tools to extract meaningful frames from videos for downstream tasks.
no code implementations • 27 Dec 2023 • Sai Shubodh Puligilla, Mohammad Omama, Husain Zaidi, Udit Singh Parihar, Madhava Krishna
We apply this approach to two domains, 2D images and 3D LiDAR points, on the task of cross-modal localization.