Search Results for author: Vidhi Jain

Found 9 papers, 1 paper with code

FlexCap: Generating Rich, Localized, and Flexible Captions in Images

no code implementations • 18 Mar 2024 • Debidatta Dwibedi, Vidhi Jain, Jonathan Tompson, Andrew Zisserman, Yusuf Aytar

The model, FlexCap, is trained to produce length-conditioned captions for input bounding boxes, and this allows control over the information density of its output, with descriptions ranging from concise object labels to detailed captions.

Tasks: Attribute, Dense Captioning, +8 more
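The description above mentions length-conditioned, box-level captioning. As a purely illustrative sketch of that idea (the interface, names, and parameters below are hypothetical and are not FlexCap's released API), the length conditioning might be exposed like this:

```python
# Hypothetical sketch of length-conditioned, box-level captioning in the
# spirit of the FlexCap description above. None of these names come from
# the paper or an official release; they only illustrate the idea that a
# target caption length is an explicit input alongside the bounding box.

from dataclasses import dataclass
from typing import List


@dataclass
class BoxCaptionRequest:
    image_path: str
    box_xyxy: List[float]   # [x_min, y_min, x_max, y_max] in pixels
    target_length: int      # desired caption length in tokens


def caption_box(model, request: BoxCaptionRequest) -> str:
    """Ask a (hypothetical) length-conditioned captioner for a caption of the
    requested length for the given region: short lengths behave like object
    labels, longer lengths yield richer descriptions."""
    return model.generate(
        image=request.image_path,
        box=request.box_xyxy,
        num_tokens=request.target_length,
    )


# Example: the same box captioned at two information densities
# (commented out because `model` is a placeholder, not a real checkpoint).
# short = caption_box(model, BoxCaptionRequest("kitchen.jpg", [40, 60, 210, 300], target_length=2))
# long  = caption_box(model, BoxCaptionRequest("kitchen.jpg", [40, 60, 210, 300], target_length=12))
```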

Toward General-Purpose Robots via Foundation Models: A Survey and Meta-Analysis

no code implementations • 14 Dec 2023 • Yafei Hu, Quanting Xie, Vidhi Jain, Jonathan Francis, Jay Patrikar, Nikhil Keetha, Seungchan Kim, Yaqi Xie, Tianyi Zhang, Shibo Zhao, Yu Quan Chong, Chen Wang, Katia Sycara, Matthew Johnson-Roberson, Dhruv Batra, Xiaolong Wang, Sebastian Scherer, Zsolt Kira, Fei Xia, Yonatan Bisk

Motivated by the impressive open-set performance and content generation capabilities of web-scale, large-capacity pre-trained models (i.e., foundation models) in research fields such as Natural Language Processing (NLP) and Computer Vision (CV), we devote this survey to exploring (i) how these existing foundation models from NLP and CV can be applied to robotics, and (ii) what a robotics-specific foundation model would look like.

MAEA: Multimodal Attribution for Embodied AI

no code implementations • 25 Jul 2023 • Vidhi Jain, Jayant Sravan Tamarapalli, Sahiti Yerramilli, Yonatan Bisk

Understanding multimodal perception for embodied AI is an open question because such inputs may contain highly complementary as well as redundant information for the task.

Transformers are Adaptable Task Planners

no code implementations • 6 Jul 2022 • Vidhi Jain, Yixin Lin, Eric Undersander, Yonatan Bisk, Akshara Rai

Every home is different, and every person likes things done in their particular way.

Tasks: Attribute

Learning Embeddings that Capture Spatial Semantics for Indoor Navigation

1 code implementation • 31 Jul 2021 • Vidhi Jain, Prakhar Agarwal, Shishir Patil, Katia Sycara

We know that humans can search for an object such as a book or a plate in an unseen house based on the spatial semantics of the larger objects they detect.

Tasks: Object

Predicting Human Strategies in Simulated Search and Rescue Task

no code implementations • 15 Nov 2020 • Vidhi Jain, Rohit Jena, Huao Li, Tejus Gupta, Dana Hughes, Michael Lewis, Katia Sycara

In our efforts to model the rescuer's mind, we begin with a simple simulated search and rescue task in Minecraft with human participants.
