no code implementations • 13 Sep 2023 • Zhihang Ren, Jefferson Ortega, Yifan Wang, Zhimin Chen, Yunhui Guo, Stella X. Yu, David Whitney
Along with the dataset, we propose a new computer vision task to infer the affect of the selected character via both context and character information in each video frame.
2 code implementations • COLING 2020 • Seungwhan Moon, Satwik Kottur, Paul A. Crook, Ankita De, Shivani Poddar, Theodore Levin, David Whitney, Daniel Difranco, Ahmad Beirami, Eunjoon Cho, Rajen Subba, Alborz Geramifard
Next generation virtual assistants are envisioned to handle multimodal inputs (e.g., vision, memories of previous interactions, in addition to the user's utterances), and perform multimodal actions (e.g., displaying a route in addition to generating the system's utterance).
no code implementations • 7 Nov 2019 • Paul A. Crook, Shivani Poddar, Ankita De, Semir Shafi, David Whitney, Alborz Geramifard, Rajen Subba
To this end, we introduce SIMMC, an extension to ParlAI for multi-modal conversational data collection and system evaluation.
1 code implementation • 24 Mar 2019 • Ye Xia, Jinkyu Kim, John Canny, Karl Zipser, David Whitney
Inspired by human vision, we propose a new periphery-fovea multi-resolution driving model that predicts vehicle speed from dash camera videos.
no code implementations • 11 Dec 2017 • Maryam Fazel-Zarandi, Shang-Wen Li, Jin Cao, Jared Casale, Peter Henderson, David Whitney, Alborz Geramifard
In this paper, we focus on learning robust dialog policies to recover from these errors.
Automatic Speech Recognition (ASR)
2 code implementations • 17 Nov 2017 • Ye Xia, Danqing Zhang, Jinkyu Kim, Ken Nakayama, Karl Zipser, David Whitney
Because critical driving moments are so rare, collecting enough data for these situations is difficult with the conventional in-car data collection protocol: tracking eye movements during driving.