Search Results for author: Unnat Jain

Found 26 papers, 8 papers with code

Exploitation-Guided Exploration for Semantic Embodied Navigation

no code implementations · 6 Nov 2023 · Justin Wasserman, Girish Chowdhary, Abhinav Gupta, Unnat Jain

Amid recent progress in embodied navigation and sim-to-robot transfer, modular policies have emerged as a de facto framework.


An Unbiased Look at Datasets for Visuo-Motor Pre-Training

no code implementations · 13 Oct 2023 · Sudeep Dasari, Mohan Kumar Srirama, Unnat Jain, Abhinav Gupta

Visual representation learning holds great promise for robotics, but is severely hampered by the scarcity and homogeneity of robotics datasets.

Representation Learning

Adaptive Coordination in Social Embodied Rearrangement

no code implementations · 31 May 2023 · Andrew Szot, Unnat Jain, Dhruv Batra, Zsolt Kira, Ruta Desai, Akshara Rai

We present the task of "Social Rearrangement", consisting of cooperative everyday tasks like setting up the dinner table, tidying a house or unpacking groceries in a simulated multi-agent environment.

Pretrained Language Models as Visual Planners for Human Assistance

1 code implementation · ICCV 2023 · Dhruvesh Patel, Hamid Eghbalzadeh, Nitin Kamra, Michael Louis Iuzzolino, Unnat Jain, Ruta Desai

Given a succinct natural language goal, e.g., "make a shelf", and a video of the user's progress so far, the aim of VPA is to devise a plan, i.e., a sequence of actions such as "sand shelf" and "paint shelf".

Action Segmentation · Language Modelling

Affordances from Human Videos as a Versatile Representation for Robotics

no code implementations · CVPR 2023 · Shikhar Bahl, Russell Mendonca, Lili Chen, Unnat Jain, Deepak Pathak

Utilizing internet videos of human behavior, we train a visual affordance model that estimates where and how in the scene a human is likely to interact.

Imitation Learning

MOPA: Modular Object Navigation with PointGoal Agents

no code implementations · 7 Apr 2023 · Sonia Raychaudhuri, Tommaso Campari, Unnat Jain, Manolis Savva, Angel X. Chang

We propose a simple but effective modular approach, MOPA (Modular ObjectNav with PointGoal agents), to systematically investigate the inherent modularity of the object navigation task in Embodied AI.

Navigate · Object +3

Last-Mile Embodied Visual Navigation

1 code implementation · 21 Nov 2022 · Justin Wasserman, Karmesh Yadav, Girish Chowdhary, Abhinav Gupta, Unnat Jain

Realistic long-horizon tasks like image-goal navigation involve exploratory and exploitative phases.

Visual Navigation
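The abstract above describes long-horizon navigation as an exploratory phase followed by an exploitative "last-mile" phase. A minimal toy sketch of that phase switch (not the paper's implementation; all names here are hypothetical):

```python
# Toy sketch: a navigation policy that explores until the goal is first
# detected, then hands control to an exploitative "last-mile" policy.

def navigate(steps, goal_visible_at, explore, exploit):
    """Run `explore` each step until the goal is seen, then `exploit`."""
    trajectory = []
    goal_seen = False
    for t in range(steps):
        if t >= goal_visible_at:  # stand-in for a goal-detection signal
            goal_seen = True
        policy = exploit if goal_seen else explore
        trajectory.append(policy(t))
    return trajectory

# Usage: the explorer emits "search" actions, the exploiter "approach".
actions = navigate(
    steps=5,
    goal_visible_at=3,
    explore=lambda t: "search",
    exploit=lambda t: "approach",
)
```

The switch is one-way by design: once the goal has been sighted, the agent commits to exploitation rather than resuming exploration.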

Language-Aligned Waypoint (LAW) Supervision for Vision-and-Language Navigation in Continuous Environments

no code implementations · EMNLP 2021 · Sonia Raychaudhuri, Saim Wani, Shivansh Patel, Unnat Jain, Angel X. Chang

Prior work supervises the agent with actions based on the shortest path from the agent's location to the goal, but such goal-oriented supervision is often not in alignment with the instruction.

Vision and Language Navigation

Cooperative Exploration for Multi-Agent Deep Reinforcement Learning

no code implementations · 23 Jul 2021 · Iou-Jen Liu, Unnat Jain, Raymond A. Yeh, Alexander G. Schwing

To address this shortcoming, in this paper, we propose cooperative multi-agent exploration (CMAE): agents share a common goal while exploring.

Reinforcement Learning (RL) +2
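The core idea in the CMAE abstract is that agents explore toward one shared goal rather than independently. A toy sketch of shared-goal selection (not the CMAE algorithm; the least-visited-state heuristic and all names are assumptions for illustration):

```python
# Toy sketch: all agents commit to a single shared exploration goal --
# here, the least-visited state -- instead of exploring independently.
from collections import Counter

def pick_shared_goal(visit_counts):
    """Return the least-visited state as the common exploration goal."""
    return min(visit_counts, key=visit_counts.get)

visits = Counter({"room_a": 10, "room_b": 2, "room_c": 7})
shared_goal = pick_shared_goal(visits)
# Every agent targets the same goal, coordinating their exploration.
agent_goals = {f"agent_{i}": shared_goal for i in range(3)}
```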

GridToPix: Training Embodied Agents with Minimal Supervision

no code implementations · ICCV 2021 · Unnat Jain, Iou-Jen Liu, Svetlana Lazebnik, Aniruddha Kembhavi, Luca Weihs, Alexander Schwing

While deep reinforcement learning (RL) promises freedom from hand-labeled data, great successes, especially for Embodied AI, require significant work to create supervision via carefully shaped rewards.

PointGoal Navigation · Reinforcement Learning (RL) +1

Coordinated Multi-Agent Exploration Using Shared Goals

no code implementations · 1 Jan 2021 · Iou-Jen Liu, Unnat Jain, Alex Schwing

Exploration is critical to the performance of deep reinforcement learning algorithms and has drawn much attention.

Reinforcement Learning (RL) +2

MultiON: Benchmarking Semantic Map Memory using Multi-Object Navigation

no code implementations · NeurIPS 2020 · Saim Wani, Shivansh Patel, Unnat Jain, Angel X. Chang, Manolis Savva

We propose the multiON task, which requires navigation to an episode-specific sequence of objects in a realistic environment.

Benchmarking · Object
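The multiON abstract describes episodes that require visiting an episode-specific ordered sequence of objects. A toy sketch of such ordered progress tracking (not the benchmark's code; object names and the helper are hypothetical):

```python
# Toy sketch: progress through an episode only advances when the agent
# finds the *next* object in the episode-specific target sequence.

def run_episode(target_sequence, found_events):
    """Count how many targets were reached in the required order."""
    idx = 0
    for obj in found_events:
        if idx < len(target_sequence) and obj == target_sequence[idx]:
            idx += 1  # out-of-order finds do not count
    return idx

progress = run_episode(
    target_sequence=["cylinder_red", "cylinder_blue"],
    found_events=["cylinder_blue", "cylinder_red", "cylinder_blue"],
)
```

Note the first `cylinder_blue` sighting does not count, since the episode requires `cylinder_red` first; this ordering constraint is what stresses semantic map memory.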

AllenAct: A Framework for Embodied AI Research

1 code implementation · 28 Aug 2020 · Luca Weihs, Jordi Salvador, Klemen Kotar, Unnat Jain, Kuo-Hao Zeng, Roozbeh Mottaghi, Aniruddha Kembhavi

The domain of Embodied AI, in which agents learn to complete tasks through interaction with their environment from egocentric observations, has experienced substantial growth with the advent of deep reinforcement learning and increased interest from the computer vision, NLP, and robotics communities.

Embodied Question Answering · Instruction Following +1

Bridging the Imitation Gap by Adaptive Insubordination

no code implementations · NeurIPS 2021 · Luca Weihs, Unnat Jain, Iou-Jen Liu, Jordi Salvador, Svetlana Lazebnik, Aniruddha Kembhavi, Alexander Schwing

However, we show that when the teaching agent makes decisions with access to privileged information that is unavailable to the student, this information is marginalized during imitation learning, resulting in an "imitation gap" and, potentially, poor results.

Imitation Learning · Memorization +2

SoundSpaces: Audio-Visual Navigation in 3D Environments

2 code implementations · ECCV 2020 · Changan Chen, Unnat Jain, Carl Schissler, Sebastia Vicenc Amengual Gari, Ziad Al-Halah, Vamsi Krishna Ithapu, Philip Robinson, Kristen Grauman

Moving around in the world is naturally a multisensory experience, but today's embodied agents are deaf, restricted solely to their visual perception of the environment.

Navigate · Visual Navigation

TAB-VCR: Tags and Attributes based VCR Baselines

1 code implementation · NeurIPS 2019 · Jingxiang Lin, Unnat Jain, Alexander Schwing

Despite impressive recent progress on tasks that necessitate reasoning, such as visual question answering and visual dialog, models often exploit dataset biases.

Attribute · Question Answering +3

Two can play this Game: Visual Dialog with Discriminative Question Generation and Answering

no code implementations · CVPR 2018 · Unnat Jain, Svetlana Lazebnik, Alexander Schwing

In addition, for the first time on the visual dialog dataset, we assess the performance of a system asking questions, and demonstrate how visual dialog can be generated from discriminative question generation and question answering.

Image Captioning · Question Answering +4

Compact Environment-Invariant Codes for Robust Visual Place Recognition

no code implementations · 23 Sep 2017 · Unnat Jain, Vinay P. Namboodiri, Gaurav Pandey

The modified system learns (in a supervised setting) compact binary codes from image feature descriptors.

Visual Place Recognition

Creativity: Generating Diverse Questions using Variational Autoencoders

no code implementations · CVPR 2017 · Unnat Jain, Ziyu Zhang, Alexander Schwing

Generating diverse questions for given images is an important task for computational education, entertainment and AI assistants.

Question Generation
