Search Results for author: Unnat Jain

Found 16 papers, 4 papers with code

Language-Aligned Waypoint (LAW) Supervision for Vision-and-Language Navigation in Continuous Environments

no code implementations · EMNLP 2021 · Sonia Raychaudhuri, Saim Wani, Shivansh Patel, Unnat Jain, Angel X. Chang

Prior work supervises the agent with actions based on the shortest path from the agent's location to the goal, but such goal-oriented supervision is often not in alignment with the instruction.

Vision and Language Navigation
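
The idea in the snippet above, supervising toward the next instruction-aligned waypoint rather than always toward the final goal, can be sketched as follows. This is an illustrative simplification, not the paper's implementation; the function name, the waypoint representation, and the `reached_radius` threshold are assumptions.

```python
import math

def next_supervision_target(agent_pos, waypoints, next_idx, reached_radius=0.5):
    """Return the next language-aligned waypoint to supervise toward.

    Goal-oriented supervision would always point at waypoints[-1] (the goal);
    pointing at the next unreached waypoint instead keeps the agent on the
    path the instruction actually describes.
    """
    # Advance past waypoints the agent has already reached.
    while (next_idx < len(waypoints) - 1
           and math.dist(agent_pos, waypoints[next_idx]) <= reached_radius):
        next_idx += 1
    return waypoints[next_idx], next_idx
```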

Cooperative Exploration for Multi-Agent Deep Reinforcement Learning

no code implementations · 23 Jul 2021 · Iou-Jen Liu, Unnat Jain, Raymond A. Yeh, Alexander G. Schwing

To address this shortcoming, in this paper, we propose cooperative multi-agent exploration (CMAE): agents share a common goal while exploring.

reinforcement-learning · SMAC+ · +1
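
The "agents share a common goal while exploring" idea can be sketched with a simple count-based heuristic. This is a loose simplification under assumed names: CMAE selects goals from restricted state spaces, whereas here the shared goal is just the least-visited state in a pooled visit counter.

```python
from collections import Counter

def pick_shared_goal(visit_counts):
    """Pick one common exploration goal for the whole team: the state with
    the lowest visit count pooled across all agents (a count-based
    simplification of shared-goal selection)."""
    return min(visit_counts, key=visit_counts.get)

def exploration_bonus(state, shared_goal):
    """Reward an agent only for reaching the team's shared goal, so agents
    explore cooperatively instead of each wandering independently."""
    return 1.0 if state == shared_goal else 0.0
```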

GridToPix: Training Embodied Agents with Minimal Supervision

no code implementations · ICCV 2021 · Unnat Jain, Iou-Jen Liu, Svetlana Lazebnik, Aniruddha Kembhavi, Luca Weihs, Alexander Schwing

While deep reinforcement learning (RL) promises freedom from hand-labeled data, great successes, especially for Embodied AI, require significant work to create supervision via carefully shaped rewards.

PointGoal Navigation

Coordinated Multi-Agent Exploration Using Shared Goals

no code implementations · 1 Jan 2021 · Iou-Jen Liu, Unnat Jain, Alex Schwing

Exploration is critical to the performance of deep reinforcement learning algorithms and has drawn much attention.

reinforcement-learning · SMAC+ · +1

MultiON: Benchmarking Semantic Map Memory using Multi-Object Navigation

no code implementations · NeurIPS 2020 · Saim Wani, Shivansh Patel, Unnat Jain, Angel X. Chang, Manolis Savva

We propose the multiON task, which requires navigation to an episode-specific sequence of objects in a realistic environment.
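
A multiON-style episode, navigating to an episode-specific sequence of objects, suggests an ordered success check like the sketch below. The function name and the in-order success rule are illustrative assumptions, not the benchmark's exact evaluation protocol.

```python
def multion_success(visited_order, target_sequence):
    """Sketch of an ordered multi-object success check: the episode succeeds
    only if the episode-specific target objects are found in the prescribed
    order; out-of-order finds do not count as progress."""
    progress = 0
    for obj in visited_order:
        if progress < len(target_sequence) and obj == target_sequence[progress]:
            progress += 1
    return progress == len(target_sequence)
```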

AllenAct: A Framework for Embodied AI Research

1 code implementation · 28 Aug 2020 · Luca Weihs, Jordi Salvador, Klemen Kotar, Unnat Jain, Kuo-Hao Zeng, Roozbeh Mottaghi, Aniruddha Kembhavi

The domain of Embodied AI, in which agents learn to complete tasks through interaction with their environment from egocentric observations, has experienced substantial growth with the advent of deep reinforcement learning and increased interest from the computer vision, NLP, and robotics communities.

Embodied Question Answering · Question Answering

Bridging the Imitation Gap by Adaptive Insubordination

no code implementations · NeurIPS 2021 · Luca Weihs, Unnat Jain, Iou-Jen Liu, Jordi Salvador, Svetlana Lazebnik, Aniruddha Kembhavi, Alexander Schwing

However, we show that when the teaching agent makes decisions with access to privileged information that is unavailable to the student, this information is marginalized during imitation learning, resulting in an "imitation gap" and, potentially, poor results.

Imitation Learning · reinforcement-learning
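
One way to bridge such an imitation gap is to blend imitation and RL losses per step, trusting imitation only where a teacher's demonstrated action is actually recoverable from the student's observations. The sketch below is a loose ADVISOR-flavored illustration under assumed names: `aux_logits` come from a hypothetical auxiliary policy trained purely by imitation without privileged inputs, and its confidence in the teacher's action sets the mixing weight.

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

def adaptive_imitation_loss(student_logits, teacher_actions, rl_losses, aux_logits):
    """Per-step blend of imitation and RL losses.

    Where the imitation-only auxiliary policy is confident in the teacher's
    action (w near 1), the teacher is imitable and imitation dominates;
    elsewhere the student falls back on its RL objective.
    """
    n = np.arange(len(teacher_actions))
    w = softmax(aux_logits)[n, teacher_actions]                      # weight in (0, 1)
    imitation = -np.log(softmax(student_logits)[n, teacher_actions] + 1e-12)
    return float(np.mean(w * imitation + (1.0 - w) * rl_losses))
```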

SoundSpaces: Audio-Visual Navigation in 3D Environments

2 code implementations · ECCV 2020 · Changan Chen, Unnat Jain, Carl Schissler, Sebastia Vicenc Amengual Gari, Ziad Al-Halah, Vamsi Krishna Ithapu, Philip Robinson, Kristen Grauman

Moving around in the world is naturally a multisensory experience, but today's embodied agents are deaf: restricted solely to their visual perception of the environment.

Navigate · Visual Navigation

TAB-VCR: Tags and Attributes based VCR Baselines

1 code implementation · NeurIPS 2019 · Jingxiang Lin, Unnat Jain, Alexander Schwing

Despite impressive recent progress that has been reported on tasks that necessitate reasoning, such as visual question answering and visual dialog, models often exploit biases in datasets.

Question Answering · Visual Commonsense Reasoning · +2

Two can play this Game: Visual Dialog with Discriminative Question Generation and Answering

no code implementations · CVPR 2018 · Unnat Jain, Svetlana Lazebnik, Alexander Schwing

In addition, for the first time on the visual dialog dataset, we assess the performance of a system asking questions, and demonstrate how visual dialog can be generated from discriminative question generation and question answering.

Image Captioning · Question Answering · +3

Compact Environment-Invariant Codes for Robust Visual Place Recognition

no code implementations · 23 Sep 2017 · Unnat Jain, Vinay P. Namboodiri, Gaurav Pandey

The modified system learns (in a supervised setting) compact binary codes from image feature descriptors.

Visual Place Recognition
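
Learning compact binary codes from image descriptors can be illustrated with the minimal sketch below: threshold a linear projection of each descriptor and compare codes by Hamming distance. This is an assumption-laden simplification; the paper learns the projection in a supervised setting, whereas here the projection matrix is simply taken as given, and the function names are invented for illustration.

```python
import numpy as np

def binary_codes(descriptors, projection):
    """Compress real-valued image descriptors into compact binary codes by
    thresholding a (learned) linear projection at zero."""
    return (descriptors @ projection > 0).astype(np.uint8)

def hamming_distance(a, b):
    """Number of differing bits between two binary codes; place matching
    then reduces to nearest-neighbor search in Hamming space."""
    return int(np.count_nonzero(a != b))
```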

Creativity: Generating Diverse Questions using Variational Autoencoders

no code implementations · CVPR 2017 · Unnat Jain, Ziyu Zhang, Alexander Schwing

Generating diverse questions for given images is an important task for computational education, entertainment and AI assistants.

Question Generation
