no code implementations • 13 Jun 2024 • Duy-Kien Nguyen, Mahmoud Assran, Unnat Jain, Martin R. Oswald, Cees G. M. Snoek, Xinlei Chen
This work does not introduce a new method.
no code implementations • 6 Nov 2023 • Justin Wasserman, Girish Chowdhary, Abhinav Gupta, Unnat Jain
Amid recent progress in embodied navigation and sim-to-robot transfer, modular policies have emerged as a de facto framework.
4 code implementations • 19 Oct 2023 • Xavier Puig, Eric Undersander, Andrew Szot, Mikael Dallaire Cote, Tsung-Yen Yang, Ruslan Partsey, Ruta Desai, Alexander William Clegg, Michal Hlavac, So Yeon Min, Vladimír Vondruš, Theophile Gervet, Vincent-Pierre Berges, John M. Turner, Oleksandr Maksymets, Zsolt Kira, Mrinal Kalakrishnan, Jitendra Malik, Devendra Singh Chaplot, Unnat Jain, Dhruv Batra, Akshara Rai, Roozbeh Mottaghi
We present Habitat 3.0: a simulation platform for studying collaborative human-robot tasks in home environments.
no code implementations • 13 Oct 2023 • Sudeep Dasari, Mohan Kumar Srirama, Unnat Jain, Abhinav Gupta
Visual representation learning holds great promise for robotics, but is severely hampered by the scarcity and homogeneity of robotics datasets.
no code implementations • 31 May 2023 • Andrew Szot, Unnat Jain, Dhruv Batra, Zsolt Kira, Ruta Desai, Akshara Rai
We present the task of "Social Rearrangement", consisting of cooperative everyday tasks like setting up the dinner table, tidying a house or unpacking groceries in a simulated multi-agent environment.
no code implementations • CVPR 2023 • Shikhar Bahl, Russell Mendonca, Lili Chen, Unnat Jain, Deepak Pathak
Utilizing internet videos of human behavior, we train a visual affordance model that estimates where and how in the scene a human is likely to interact.
1 code implementation • ICCV 2023 • Dhruvesh Patel, Hamid Eghbalzadeh, Nitin Kamra, Michael Louis Iuzzolino, Unnat Jain, Ruta Desai
Given a succinct natural language goal, e.g., "make a shelf", and a video of the user's progress so far, the aim of VPA is to devise a plan, i.e., a sequence of actions such as "sand shelf", "paint shelf", etc.
no code implementations • 7 Apr 2023 • Sonia Raychaudhuri, Tommaso Campari, Unnat Jain, Manolis Savva, Angel X. Chang
We propose a simple but effective modular approach, MOPA (Modular ObjectNav with PointGoal agents), to systematically investigate the inherent modularity of the object navigation task in Embodied AI.
1 code implementation • 21 Nov 2022 • Justin Wasserman, Karmesh Yadav, Girish Chowdhary, Abhinav Gupta, Unnat Jain
Realistic long-horizon tasks like image-goal navigation involve exploratory and exploitative phases.
no code implementations • 13 Oct 2022 • Matt Deitke, Dhruv Batra, Yonatan Bisk, Tommaso Campari, Angel X. Chang, Devendra Singh Chaplot, Changan Chen, Claudia Pérez D'Arpino, Kiana Ehsani, Ali Farhadi, Li Fei-Fei, Anthony Francis, Chuang Gan, Kristen Grauman, David Hall, Winson Han, Unnat Jain, Aniruddha Kembhavi, Jacob Krantz, Stefan Lee, Chengshu Li, Sagnik Majumder, Oleksandr Maksymets, Roberto Martín-Martín, Roozbeh Mottaghi, Sonia Raychaudhuri, Mike Roberts, Silvio Savarese, Manolis Savva, Mohit Shridhar, Niko Sünderhauf, Andrew Szot, Ben Talbot, Joshua B. Tenenbaum, Jesse Thomason, Alexander Toshev, Joanne Truong, Luca Weihs, Jiajun Wu
We present a retrospective on the state of Embodied AI research.
1 code implementation • 27 Sep 2022 • Himangi Mittal, Pedro Morgado, Unnat Jain, Abhinav Gupta
However, learning representations from videos can be challenging.
Ranked #3 on Object State Change Classification on Ego4D
no code implementations • ICCV 2021 • Shivansh Patel, Saim Wani, Unnat Jain, Alexander Schwing, Svetlana Lazebnik, Manolis Savva, Angel X. Chang
We show that the emergent communication can be grounded to the agent observations and the spatial structure of the 3D environment.
no code implementations • EMNLP 2021 • Sonia Raychaudhuri, Saim Wani, Shivansh Patel, Unnat Jain, Angel X. Chang
Prior work supervises the agent with actions based on the shortest path from the agent's location to the goal, but such goal-oriented supervision is often not in alignment with the instruction.
no code implementations • 23 Jul 2021 • Iou-Jen Liu, Unnat Jain, Raymond A. Yeh, Alexander G. Schwing
To address this shortcoming, in this paper, we propose cooperative multi-agent exploration (CMAE): agents share a common goal while exploring.
no code implementations • ICCV 2021 • Unnat Jain, Iou-Jen Liu, Svetlana Lazebnik, Aniruddha Kembhavi, Luca Weihs, Alexander Schwing
While deep reinforcement learning (RL) promises freedom from hand-labeled data, great successes, especially for Embodied AI, require significant work to create supervision via carefully shaped rewards.
no code implementations • 1 Jan 2021 • Iou-Jen Liu, Unnat Jain, Alex Schwing
Exploration is critical to the success of deep reinforcement learning algorithms and has drawn much attention.
no code implementations • NeurIPS 2020 • Saim Wani, Shivansh Patel, Unnat Jain, Angel X. Chang, Manolis Savva
We propose the multiON task, which requires navigation to an episode-specific sequence of objects in a realistic environment.
1 code implementation • 28 Aug 2020 • Luca Weihs, Jordi Salvador, Klemen Kotar, Unnat Jain, Kuo-Hao Zeng, Roozbeh Mottaghi, Aniruddha Kembhavi
The domain of Embodied AI, in which agents learn to complete tasks through interaction with their environment from egocentric observations, has experienced substantial growth with the advent of deep reinforcement learning and increased interest from the computer vision, NLP, and robotics communities.
no code implementations • NeurIPS 2021 • Luca Weihs, Unnat Jain, Iou-Jen Liu, Jordi Salvador, Svetlana Lazebnik, Aniruddha Kembhavi, Alexander Schwing
However, we show that when the teaching agent makes decisions with access to privileged information that is unavailable to the student, this information is marginalized during imitation learning, resulting in an "imitation gap" and, potentially, poor results.
no code implementations • ECCV 2020 • Unnat Jain, Luca Weihs, Eric Kolve, Ali Farhadi, Svetlana Lazebnik, Aniruddha Kembhavi, Alexander Schwing
Autonomous agents must learn to collaborate.
2 code implementations • ECCV 2020 • Changan Chen, Unnat Jain, Carl Schissler, Sebastia Vicenc Amengual Gari, Ziad Al-Halah, Vamsi Krishna Ithapu, Philip Robinson, Kristen Grauman
Moving around in the world is naturally a multisensory experience, but today's embodied agents are deaf, restricted solely to their visual perception of the environment.
1 code implementation • NeurIPS 2019 • Jingxiang Lin, Unnat Jain, Alexander Schwing
Despite impressive recent progress that has been reported on tasks that necessitate reasoning, such as visual question answering and visual dialog, models often exploit biases in datasets.
no code implementations • CVPR 2019 • Unnat Jain, Luca Weihs, Eric Kolve, Mohammad Rastegari, Svetlana Lazebnik, Ali Farhadi, Alexander Schwing, Aniruddha Kembhavi
Collaboration is a necessary skill to perform tasks that are beyond one agent's capabilities.
no code implementations • CVPR 2018 • Unnat Jain, Svetlana Lazebnik, Alexander Schwing
In addition, for the first time on the visual dialog dataset, we assess the performance of a system asking questions, and demonstrate how visual dialog can be generated from discriminative question generation and question answering.
Ranked #7 on Visual Dialog on VisDial v0.9 val
no code implementations • 23 Sep 2017 • Unnat Jain, Vinay P. Namboodiri, Gaurav Pandey
The modified system learns (in a supervised setting) compact binary codes from image feature descriptors.
no code implementations • CVPR 2017 • Unnat Jain, Ziyu Zhang, Alexander Schwing
Generating diverse questions for given images is an important task for computational education, entertainment and AI assistants.