no code implementations • CVPR 2023 • Chenhao Zheng, Ayush Shrivastava, Andrew Owens
We learn a visual representation that captures information about the camera that recorded a given photo.
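As a point of reference for what "information about the camera" looks like in practice, such metadata is typically stored as EXIF tags embedded in the photo file. The snippet below is a minimal sketch that only inspects those tags with Pillow; it is not the learned representation from the paper, and "photo.jpg" is a placeholder path.

```python
# Minimal sketch: read the camera metadata (EXIF) embedded in a photo file.
# This only inspects the tags; it is not the representation learned in the paper.
from PIL import Image
from PIL.ExifTags import TAGS

def read_camera_tags(path):
    """Return human-readable EXIF tag names -> values from a photo's primary IFD."""
    exif = Image.open(path).getexif()
    return {TAGS.get(tag_id, tag_id): value for tag_id, value in exif.items()}

# "photo.jpg" is a placeholder; typical fields include the camera Make and Model.
for name, value in read_camera_tags("photo.jpg").items():
    print(name, value)
```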
3 code implementations • 1 Oct 2021 • Aishwarya Padmakumar, Jesse Thomason, Ayush Shrivastava, Patrick Lange, Anjali Narayan-Chen, Spandana Gella, Robinson Piramuthu, Gokhan Tur, Dilek Hakkani-Tur
Robots operating in human spaces must be able to engage in natural language interaction with people, both understanding and executing instructions, and using conversation to resolve ambiguity and recover from mistakes.
1 code implementation • Findings (ACL) 2022 • Ayush Shrivastava, Karthik Gopalakrishnan, Yang Liu, Robinson Piramuthu, Gokhan Tür, Devi Parikh, Dilek Hakkani-Tür
Interactive robots navigating photo-realistic environments need to be trained to effectively leverage and handle the dynamic nature of dialogue in addition to the challenges underlying vision-and-language navigation (VLN).
1 code implementation • 7 Nov 2020 • Peter Anderson, Ayush Shrivastava, Joanne Truong, Arjun Majumdar, Devi Parikh, Dhruv Batra, Stefan Lee
We study the challenging problem of releasing a robot in a previously unseen environment, and having it follow unconstrained natural language navigation instructions.
no code implementations • ICML Workshop LaReL 2020 • Arjun Majumdar, Ayush Shrivastava, Stefan Lee, Peter Anderson, Devi Parikh, Dhruv Batra
Following a navigation instruction such as 'Walk down the stairs and stop near the sofa' requires an agent to ground scene elements referenced via language (e.g. 'stairs') to visual content in the environment (pixels corresponding to 'stairs').
1 code implementation • ECCV 2020 • Arjun Majumdar, Ayush Shrivastava, Stefan Lee, Peter Anderson, Devi Parikh, Dhruv Batra
Following a navigation instruction such as 'Walk down the stairs and stop at the brown sofa' requires embodied AI agents to ground scene elements referenced via language (e.g. 'stairs') to visual content in the environment (pixels corresponding to 'stairs').
Ranked #6 on Vision and Language Navigation on the VLN Challenge
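The grounding step described in this entry can be illustrated with a toy cross-modal matching sketch: given an embedding for a referenced phrase (e.g. 'stairs') and embeddings for candidate image regions, pick the region with the highest cosine similarity. The feature dimension and the random features below are placeholders standing in for learned encoders, not the model from the paper.

```python
import numpy as np

def ground_phrase(phrase_vec, region_vecs):
    """Return the index of the region whose embedding best matches the phrase.

    phrase_vec:  (d,) embedding of a referenced phrase such as 'stairs'
    region_vecs: (n, d) embeddings of n candidate image regions
    """
    phrase = phrase_vec / np.linalg.norm(phrase_vec)
    regions = region_vecs / np.linalg.norm(region_vecs, axis=1, keepdims=True)
    scores = regions @ phrase  # cosine similarity per region
    return int(np.argmax(scores)), scores

# Placeholder features standing in for learned text/vision encoders.
rng = np.random.default_rng(0)
stairs_vec = rng.normal(size=128)
region_feats = rng.normal(size=(10, 128))
best, scores = ground_phrase(stairs_vec, region_feats)
print("best-matching region:", best)
```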
1 code implementation • NeurIPS 2019 • Peter Anderson, Ayush Shrivastava, Devi Parikh, Dhruv Batra, Stefan Lee
Our experiments show that our approach outperforms a strong LingUNet baseline when predicting the goal location on the map.
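For intuition about the prediction task in this entry (locating the goal on a map), one minimal sketch is to normalize per-cell scores into a distribution over map cells and take the most likely cell as the predicted goal. The scores below are random placeholders, not the output of LingUNet or of the paper's model.

```python
import numpy as np

def predict_goal(cell_scores):
    """Softmax per-cell scores into a distribution over an HxW map and return
    the (row, col) of the most likely goal cell plus the full probability map."""
    flat = cell_scores.ravel()
    probs = np.exp(flat - flat.max())
    probs /= probs.sum()
    prob_map = probs.reshape(cell_scores.shape)
    goal = np.unravel_index(prob_map.argmax(), prob_map.shape)
    return goal, prob_map

# Placeholder scores standing in for a model conditioned on the instruction.
rng = np.random.default_rng(0)
scores = rng.normal(size=(32, 32))
goal, prob_map = predict_goal(scores)
print("predicted goal cell:", goal)
```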