Search Results for author: Vidhi Jain

Found 9 papers, 1 paper with code

FlexCap: Generating Rich, Localized, and Flexible Captions in Images

no code implementations • 18 Mar 2024 • Debidatta Dwibedi, Vidhi Jain, Jonathan Tompson, Andrew Zisserman, Yusuf Aytar

The model, FlexCap, is trained to produce length-conditioned captions for input bounding boxes, and this allows control over the information density of its output, with descriptions ranging from concise object labels to detailed captions.

Tasks: Attribute, Dense Captioning, +8 more
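The description above mentions length-conditioned, box-level captioning. As a purely illustrative sketch of that idea (the interface, names, and parameters below are hypothetical and are not FlexCap's released API), the length conditioning might be exposed like this:

```python
# Hypothetical sketch of length-conditioned, box-level captioning in the
# spirit of the FlexCap description above. None of these names come from
# the paper or an official release; they only illustrate the idea that a
# target caption length is an explicit input alongside the bounding box.

from dataclasses import dataclass
from typing import List


@dataclass
class BoxCaptionRequest:
    image_path: str
    box_xyxy: List[float]   # [x_min, y_min, x_max, y_max] in pixels
    target_length: int      # desired caption length in tokens


def caption_box(model, request: BoxCaptionRequest) -> str:
    """Ask a (hypothetical) length-conditioned captioner for a caption of the
    requested length for the given region: short lengths behave like object
    labels, longer lengths yield richer descriptions."""
    return model.generate(
        image=request.image_path,
        box=request.box_xyxy,
        num_tokens=request.target_length,
    )


# Example: the same box captioned at two information densities
# (commented out because `model` is a placeholder, not a real checkpoint).
# short = caption_box(model, BoxCaptionRequest("kitchen.jpg", [40, 60, 210, 300], target_length=2))
# long  = caption_box(model, BoxCaptionRequest("kitchen.jpg", [40, 60, 210, 300], target_length=12))
```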

Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis

no code implementations • 14 Dec 2023 • Yafei Hu, Quanting Xie, Vidhi Jain, Jonathan Francis, Jay Patrikar, Nikhil Keetha, Seungchan Kim, Yaqi Xie, Tianyi Zhang, Shibo Zhao, Yu Quan Chong, Chen Wang, Katia Sycara, Matthew Johnson-Roberson, Dhruv Batra, Xiaolong Wang, Sebastian Scherer, Zsolt Kira, Fei Xia, Yonatan Bisk

Motivated by the impressive open-set performance and content generation capabilities of web-scale, large-capacity pre-trained models (i.e., foundation models) in research fields such as Natural Language Processing (NLP) and Computer Vision (CV), we devote this survey to exploring (i) how these existing foundation models from NLP and CV can be applied to robotics, and (ii) what a robotics-specific foundation model would look like.

MAEA: Multimodal Attribution for Embodied AI

no code implementations • 25 Jul 2023 • Vidhi Jain, Jayant Sravan Tamarapalli, Sahiti Yerramilli, Yonatan Bisk

Understanding multimodal perception for embodied AI is an open question because such inputs may contain highly complementary as well as redundant information for the task.

Transformers are Adaptable Task Planners

no code implementations • 6 Jul 2022 • Vidhi Jain, Yixin Lin, Eric Undersander, Yonatan Bisk, Akshara Rai

Every home is different, and every person likes things done in their particular way.

Tasks: Attribute

Learning Embeddings that Capture Spatial Semantics for Indoor Navigation

1 code implementation • 31 Jul 2021 • Vidhi Jain, Prakhar Agarwal, Shishir Patil, Katia Sycara

We know that humans can search for an object such as a book or a plate in an unseen house based on the spatial semantics of the larger objects they detect.

Tasks: Object

Predicting Human Strategies in Simulated Search and Rescue Task

no code implementations • 15 Nov 2020 • Vidhi Jain, Rohit Jena, Huao Li, Tejus Gupta, Dana Hughes, Michael Lewis, Katia Sycara

In our efforts to model the rescuer's mind, we begin with a simple simulated search and rescue task in Minecraft with human participants.
