Search Results for author: Ayush Shrivastava

Found 7 papers, 5 papers with code

TEACh: Task-driven Embodied Agents that Chat

3 code implementations • 1 Oct 2021 • Aishwarya Padmakumar, Jesse Thomason, Ayush Shrivastava, Patrick Lange, Anjali Narayan-Chen, Spandana Gella, Robinson Piramuthu, Gokhan Tur, Dilek Hakkani-Tur

Robots operating in human spaces must be able to engage in natural language interaction with people, both understanding and executing instructions, and using conversation to resolve ambiguity and recover from mistakes.

Dialogue Understanding

VISITRON: Visual Semantics-Aligned Interactively Trained Object-Navigator

1 code implementation • Findings (ACL) 2022 • Ayush Shrivastava, Karthik Gopalakrishnan, Yang Liu, Robinson Piramuthu, Gokhan Tür, Devi Parikh, Dilek Hakkani-Tür

Interactive robots navigating photo-realistic environments need to be trained to effectively leverage and handle the dynamic nature of dialogue in addition to the challenges underlying vision-and-language navigation (VLN).

Binary Classification • Imitation Learning • +3

Sim-to-Real Transfer for Vision-and-Language Navigation

1 code implementation • 7 Nov 2020 • Peter Anderson, Ayush Shrivastava, Joanne Truong, Arjun Majumdar, Devi Parikh, Dhruv Batra, Stefan Lee

We study the challenging problem of releasing a robot in a previously unseen environment, and having it follow unconstrained natural language navigation instructions.

Vision and Language Navigation

Extended Abstract: Improving Vision-and-Language Navigation with Image-Text Pairs from the Web

no code implementations • ICML Workshop LaReL 2020 • Arjun Majumdar, Ayush Shrivastava, Stefan Lee, Peter Anderson, Devi Parikh, Dhruv Batra

Following a navigation instruction such as 'Walk down the stairs and stop near the sofa' requires an agent to ground scene elements referenced via language (e.g. 'stairs') to visual content in the environment (pixels corresponding to 'stairs').

Vision and Language Navigation

Improving Vision-and-Language Navigation with Image-Text Pairs from the Web

1 code implementation • ECCV 2020 • Arjun Majumdar, Ayush Shrivastava, Stefan Lee, Peter Anderson, Devi Parikh, Dhruv Batra

Following a navigation instruction such as 'Walk down the stairs and stop at the brown sofa' requires embodied AI agents to ground scene elements referenced via language (e.g. 'stairs') to visual content in the environment (pixels corresponding to 'stairs').

Vision and Language Navigation
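The two VLN papers above both turn on grounding a language phrase such as 'stairs' to visual content in the scene. As a rough, hypothetical illustration only, the sketch below scores candidate image regions against a phrase by cosine similarity over placeholder embeddings; the function name ground_phrase and the random vectors are assumptions for illustration, not code or models from any of the listed papers or their repositories.

# Minimal grounding sketch, assuming precomputed phrase and region
# embeddings from some vision-and-language model (hypothetical here;
# the listed papers use their own transformer-based agents).
import numpy as np

def ground_phrase(phrase_embedding: np.ndarray,
                  region_embeddings: np.ndarray) -> int:
    """Return the index of the image region whose embedding is most
    similar (by cosine similarity) to the phrase embedding."""
    phrase = phrase_embedding / np.linalg.norm(phrase_embedding)
    regions = region_embeddings / np.linalg.norm(
        region_embeddings, axis=1, keepdims=True)
    scores = regions @ phrase  # one cosine-similarity score per region
    return int(np.argmax(scores))

# Toy usage: 4 candidate regions with 512-d embeddings (random
# placeholder values, not outputs of any real model).
rng = np.random.default_rng(0)
region_feats = rng.normal(size=(4, 512))
stairs_feat = rng.normal(size=512)
print(f"Phrase 'stairs' grounds to region {ground_phrase(stairs_feat, region_feats)}")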
