no code implementations • 26 Oct 2022 • Huda Alamri, Anthony Bilic, Michael Hu, Apoorva Beedu, Irfan Essa
The video-based dialog task is a challenging multimodal learning task that has received increasing attention over the past few years, with state-of-the-art models setting new performance records.
1 code implementation • 24 Oct 2022 • Apoorva Beedu, Huda Alamri, Irfan Essa
We introduce VideoPose, a Transformer-based 6D object pose estimation framework comprising an end-to-end attention-based modelling architecture that attends to previous frames in order to estimate accurate 6D object poses in videos.
no code implementations • CVPR 2019 • Huda Alamri, Vincent Cartillier, Abhishek Das, Jue Wang, Anoop Cherian, Irfan Essa, Dhruv Batra, Tim K. Marks, Chiori Hori, Peter Anderson, Stefan Lee, Devi Parikh
We introduce the task of scene-aware dialog.
2 code implementations • 25 Jan 2019 • Huda Alamri, Vincent Cartillier, Abhishek Das, Jue Wang, Anoop Cherian, Irfan Essa, Dhruv Batra, Tim K. Marks, Chiori Hori, Peter Anderson, Stefan Lee, Devi Parikh
We introduce the task of scene-aware dialog.
2 code implementations • 21 Jun 2018 • Chiori Hori, Huda Alamri, Jue Wang, Gordon Wichern, Takaaki Hori, Anoop Cherian, Tim K. Marks, Vincent Cartillier, Raphael Gontijo Lopes, Abhishek Das, Irfan Essa, Dhruv Batra, Devi Parikh
We introduce a new dataset of dialogs about videos of human behaviors.
4 code implementations • 1 Jun 2018 • Huda Alamri, Vincent Cartillier, Raphael Gontijo Lopes, Abhishek Das, Jue Wang, Irfan Essa, Dhruv Batra, Devi Parikh, Anoop Cherian, Tim K. Marks, Chiori Hori
Scene-aware dialog systems will be able to have conversations with users about the objects and events around them.