Search Results for author: Ayush Shrivastava

Found 7 papers, 5 papers with code

TEACh: Task-driven Embodied Agents that Chat

3 code implementations • 1 Oct 2021 • Aishwarya Padmakumar, Jesse Thomason, Ayush Shrivastava, Patrick Lange, Anjali Narayan-Chen, Spandana Gella, Robinson Piramuthu, Gokhan Tur, Dilek Hakkani-Tur

Robots operating in human spaces must be able to engage in natural language interaction with people, both understanding and executing instructions, and using conversation to resolve ambiguity and recover from mistakes.

Dialogue Understanding

VISITRON: Visual Semantics-Aligned Interactively Trained Object-Navigator

1 code implementation • Findings (ACL) 2022 • Ayush Shrivastava, Karthik Gopalakrishnan, Yang Liu, Robinson Piramuthu, Gokhan Tür, Devi Parikh, Dilek Hakkani-Tür

Interactive robots navigating photo-realistic environments need to be trained to effectively leverage and handle the dynamic nature of dialogue in addition to the challenges underlying vision-and-language navigation (VLN).

Binary Classification • Imitation Learning • +3

Sim-to-Real Transfer for Vision-and-Language Navigation

1 code implementation • 7 Nov 2020 • Peter Anderson, Ayush Shrivastava, Joanne Truong, Arjun Majumdar, Devi Parikh, Dhruv Batra, Stefan Lee

We study the challenging problem of releasing a robot in a previously unseen environment, and having it follow unconstrained natural language navigation instructions.

Vision and Language Navigation

Extended Abstract: Improving Vision-and-Language Navigation with Image-Text Pairs from the Web

no code implementations • ICML Workshop LaReL 2020 • Arjun Majumdar, Ayush Shrivastava, Stefan Lee, Peter Anderson, Devi Parikh, Dhruv Batra

Following a navigation instruction such as 'Walk down the stairs and stop near the sofa' requires an agent to ground scene elements referenced via language (e.g. 'stairs') to visual content in the environment (pixels corresponding to 'stairs').

Vision and Language Navigation

Improving Vision-and-Language Navigation with Image-Text Pairs from the Web

1 code implementation • ECCV 2020 • Arjun Majumdar, Ayush Shrivastava, Stefan Lee, Peter Anderson, Devi Parikh, Dhruv Batra

Following a navigation instruction such as 'Walk down the stairs and stop at the brown sofa' requires embodied AI agents to ground scene elements referenced via language (e.g. 'stairs') to visual content in the environment (pixels corresponding to 'stairs').

Vision and Language Navigation
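The two VLN papers above both turn on grounding a language phrase such as 'stairs' to visual content in the scene. As a rough, hypothetical illustration only, the sketch below scores candidate image regions against a phrase by cosine similarity over placeholder embeddings; the function name ground_phrase and the random vectors are assumptions for illustration, not code or models from any of the listed papers or their repositories.

# Minimal grounding sketch, assuming precomputed phrase and region
# embeddings from some vision-and-language model (hypothetical here;
# the listed papers use their own transformer-based agents).
import numpy as np

def ground_phrase(phrase_embedding: np.ndarray,
                  region_embeddings: np.ndarray) -> int:
    """Return the index of the image region whose embedding is most
    similar (by cosine similarity) to the phrase embedding."""
    phrase = phrase_embedding / np.linalg.norm(phrase_embedding)
    regions = region_embeddings / np.linalg.norm(
        region_embeddings, axis=1, keepdims=True)
    scores = regions @ phrase  # one cosine-similarity score per region
    return int(np.argmax(scores))

# Toy usage: 4 candidate regions with 512-d embeddings (random
# placeholder values, not outputs of any real model).
rng = np.random.default_rng(0)
region_feats = rng.normal(size=(4, 512))
stairs_feat = rng.normal(size=512)
print(f"Phrase 'stairs' grounds to region {ground_phrase(stairs_feat, region_feats)}")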
