Search Results for author: Unnat Jain

We present the task of "Social Rearrangement", consisting of cooperative everyday tasks like setting up the dinner table, tidying a house or unpacking groceries in a simulated multi-agent environment.

Affordances from Human Videos as a Versatile Representation for Robotics

no code implementations • CVPR 2023 • Shikhar Bahl, Russell Mendonca, Lili Chen, Unnat Jain, Deepak Pathak

Utilizing internet videos of human behavior, we train a visual affordance model that estimates where and how in the scene a human is likely to interact.

Imitation Learning

Pretrained Language Models as Visual Planners for Human Assistance

1 code implementation • ICCV 2023 • Dhruvesh Patel, Hamid Eghbalzadeh, Nitin Kamra, Michael Louis Iuzzolino, Unnat Jain, Ruta Desai

Given a succinct natural language goal, e.g., "make a shelf", and a video of the user's progress so far, the aim of VPA is to devise a plan, i.e., a sequence of actions such as "sand shelf", "paint shelf", etc.

Action Segmentation, Language Modelling

MOPA: Modular Object Navigation with PointGoal Agents

no code implementations • 7 Apr 2023 • Sonia Raychaudhuri, Tommaso Campari, Unnat Jain, Manolis Savva, Angel X. Chang

We propose a simple but effective modular approach MOPA (Modular ObjectNav with PointGoal agents) to systematically investigate the inherent modularity of the object navigation task in Embodied AI.

Navigate, Object, +3

Last-Mile Embodied Visual Navigation

1 code implementation • 21 Nov 2022 • Justin Wasserman, Karmesh Yadav, Girish Chowdhary, Abhinav Gupta, Unnat Jain

Realistic long-horizon tasks like image-goal navigation involve exploratory and exploitative phases.

Visual Navigation

Retrospectives on the Embodied AI Workshop

no code implementations • 13 Oct 2022 • Matt Deitke, Dhruv Batra, Yonatan Bisk, Tommaso Campari, Angel X. Chang, Devendra Singh Chaplot, Changan Chen, Claudia Pérez D'Arpino, Kiana Ehsani, Ali Farhadi, Li Fei-Fei, Anthony Francis, Chuang Gan, Kristen Grauman, David Hall, Winson Han, Unnat Jain, Aniruddha Kembhavi, Jacob Krantz, Stefan Lee, Chengshu Li, Sagnik Majumder, Oleksandr Maksymets, Roberto Martín-Martín, Roozbeh Mottaghi, Sonia Raychaudhuri, Mike Roberts, Silvio Savarese, Manolis Savva, Mohit Shridhar, Niko Sünderhauf, Andrew Szot, Ben Talbot, Joshua B. Tenenbaum, Jesse Thomason, Alexander Toshev, Joanne Truong, Luca Weihs, Jiajun Wu

We present a retrospective on the state of Embodied AI research.

Visual Navigation

Learning State-Aware Visual Representations from Audible Interactions

1 code implementation • 27 Sep 2022 • Himangi Mittal, Pedro Morgado, Unnat Jain, Abhinav Gupta

However, learning representations from videos can be challenging.

Ranked #1 on Long Term Action Anticipation on Ego4D (ED@20 Noun metric)

Action Anticipation, Action Recognition, +3

Interpretation of Emergent Communication in Heterogeneous Collaborative Embodied Agents

no code implementations • ICCV 2021 • Shivansh Patel, Saim Wani, Unnat Jain, Alexander Schwing, Svetlana Lazebnik, Manolis Savva, Angel X. Chang

We show that the emergent communication can be grounded to the agent observations and the spatial structure of the 3D environment.

Language-Aligned Waypoint (LAW) Supervision for Vision-and-Language Navigation in Continuous Environments

no code implementations • EMNLP 2021 • Sonia Raychaudhuri, Saim Wani, Shivansh Patel, Unnat Jain, Angel X. Chang

Prior work supervises the agent with actions based on the shortest path from the agent's location to the goal, but such goal-oriented supervision is often not in alignment with the instruction.

Vision and Language Navigation

Cooperative Exploration for Multi-Agent Deep Reinforcement Learning

no code implementations • 23 Jul 2021 • Iou-Jen Liu, Unnat Jain, Raymond A. Yeh, Alexander G. Schwing

To address this shortcoming, in this paper, we propose cooperative multi-agent exploration (CMAE): agents share a common goal while exploring.

reinforcement-learning, Reinforcement Learning (RL), +2

GridToPix: Training Embodied Agents with Minimal Supervision

no code implementations • ICCV 2021 • Unnat Jain, Iou-Jen Liu, Svetlana Lazebnik, Aniruddha Kembhavi, Luca Weihs, Alexander Schwing

While deep reinforcement learning (RL) promises freedom from hand-labeled data, great successes, especially for Embodied AI, require significant work to create supervision via carefully shaped rewards.

PointGoal Navigation, Reinforcement Learning (RL), +1

Coordinated Multi-Agent Exploration Using Shared Goals

no code implementations • 1 Jan 2021 • Iou-Jen Liu, Unnat Jain, Alex Schwing

Exploration is critical for good results of deep reinforcement learning algorithms and has drawn much attention.

reinforcement-learning, Reinforcement Learning (RL), +2

MultiON: Benchmarking Semantic Map Memory using Multi-Object Navigation

no code implementations • NeurIPS 2020 • Saim Wani, Shivansh Patel, Unnat Jain, Angel X. Chang, Manolis Savva

We propose the multiON task, which requires navigation to an episode-specific sequence of objects in a realistic environment.

Benchmarking, Object

AllenAct: A Framework for Embodied AI Research

1 code implementation • 28 Aug 2020 • Luca Weihs, Jordi Salvador, Klemen Kotar, Unnat Jain, Kuo-Hao Zeng, Roozbeh Mottaghi, Aniruddha Kembhavi

The domain of Embodied AI, in which agents learn to complete tasks through interaction with their environment from egocentric observations, has experienced substantial growth with the advent of deep reinforcement learning and increased interest from the computer vision, NLP, and robotics communities.

Embodied Question Answering, Instruction Following, +1

Bridging the Imitation Gap by Adaptive Insubordination

no code implementations • NeurIPS 2021 • Luca Weihs, Unnat Jain, Iou-Jen Liu, Jordi Salvador, Svetlana Lazebnik, Aniruddha Kembhavi, Alexander Schwing

However, we show that when the teaching agent makes decisions with access to privileged information that is unavailable to the student, this information is marginalized during imitation learning, resulting in an "imitation gap" and, potentially, poor results.

Imitation Learning, Memorization, +2

A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks

no code implementations • ECCV 2020 • Unnat Jain, Luca Weihs, Eric Kolve, Ali Farhadi, Svetlana Lazebnik, Aniruddha Kembhavi, Alexander Schwing

Autonomous agents must learn to collaborate.

SoundSpaces: Audio-Visual Navigation in 3D Environments

2 code implementations • ECCV 2020 • Changan Chen, Unnat Jain, Carl Schissler, Sebastia Vicenc Amengual Gari, Ziad Al-Halah, Vamsi Krishna Ithapu, Philip Robinson, Kristen Grauman

Moving around in the world is naturally a multisensory experience, but today's embodied agents are deaf, restricted solely to their visual perception of the environment.

Navigate, Visual Navigation

TAB-VCR: Tags and Attributes based VCR Baselines

1 code implementation • NeurIPS 2019 • Jingxiang Lin, Unnat Jain, Alexander Schwing

Despite impressive recent progress that has been reported on tasks that necessitate reasoning, such as visual question answering and visual dialog, models often exploit biases in datasets.

Attribute, Question Answering, +3

TAB-VCR: Tags and Attributes based Visual Commonsense Reasoning Baselines

1 code implementation • NeurIPS 2019 • Jingxiang Lin, Unnat Jain, Alexander G. Schwing

Despite impressive recent progress that has been reported on tasks that necessitate reasoning, such as visual question answering and visual dialog, models often exploit biases in datasets.

Attribute, Question Answering, +3

Two Body Problem: Collaborative Visual Task Completion

no code implementations • CVPR 2019 • Unnat Jain, Luca Weihs, Eric Kolve, Mohammad Rastegari, Svetlana Lazebnik, Ali Farhadi, Alexander Schwing, Aniruddha Kembhavi

Collaboration is a necessary skill to perform tasks that are beyond one agent's capabilities.

Task 2, Vocal Bursts Valence Prediction

Two can play this Game: Visual Dialog with Discriminative Question Generation and Answering

no code implementations • CVPR 2018 • Unnat Jain, Svetlana Lazebnik, Alexander Schwing

In addition, for the first time on the visual dialog dataset, we assess the performance of a system asking questions, and demonstrate how visual dialog can be generated from discriminative question generation and question answering.

Ranked #7 on Visual Dialog on VisDial v0.9 val