1 code implementation • 16 Jun 2020 • Andrew Rouditchenko, Angie Boggust, David Harwath, Brian Chen, Dhiraj Joshi, Samuel Thomas, Kartik Audhkhasi, Hilde Kuehne, Rameswar Panda, Rogerio Feris, Brian Kingsbury, Michael Picheny, Antonio Torralba, James Glass
Further, we propose a tri-modal model that jointly processes raw audio, video, and text captions from videos to learn a multi-modal semantic embedding space useful for text-video retrieval.
no code implementations • 3 Oct 2019 • Sicheng Zhao, Shangfei Wang, Mohammad Soleymani, Dhiraj Joshi, Qiang Ji
Affective computing (AC) of these data can help in understanding human behavior and enable a wide range of applications.
2 code implementations • ICCV 2019 • Khoi-Nguyen C. Mac, Dhiraj Joshi, Raymond A. Yeh, JinJun Xiong, Rogerio S. Feris, Minh N. Do
Fine-grained action detection is an important task with numerous applications in robotics and human-computer interaction.
no code implementations • 22 Jul 2017 • Michele Merler, Dhiraj Joshi, Quoc-Bao Nguyen, Stephen Hammer, John Kent, John R. Smith, Rogerio S. Feris
The production of sports highlight packages summarizing a game's most exciting moments is an essential task for broadcast media.