no code implementations • 17 Jun 2022 • Kaizhi Zheng, Xiaotong Chen, Odest Chadwicke Jenkins, Xin Eric Wang
We hope the new simulator and benchmark will facilitate future research on language-guided robotic manipulation.
no code implementations • 28 Aug 2022 • Kaizhi Zheng, Kaiwen Zhou, Jing Gu, Yue Fan, Jialu Wang, Zonglin Di, Xuehai He, Xin Eric Wang
Building a conversational embodied agent to execute real-life tasks has been a long-standing yet challenging research goal, as it requires effective human-agent communication, multi-modal understanding, and long-range sequential decision making.
no code implementations • 30 Jan 2023 • Kaiwen Zhou, Kaizhi Zheng, Connor Pryor, Yilin Shen, Hongxia Jin, Lise Getoor, Xin Eric Wang
Such object navigation tasks usually require large-scale training in visual environments with labeled objects, an approach that generalizes poorly to novel objects in unknown environments.
no code implementations • 23 May 2023 • Yue Fan, Jing Gu, Kaizhi Zheng, Xin Eric Wang
Intelligent navigation-helper agents are critical as they can navigate users in unknown areas through environmental awareness and conversational ability, serving as potential accessibility tools for individuals with disabilities.
1 code implementation • 16 Oct 2020 • Xiaotong Chen, Kaizhi Zheng, Zhen Zeng, Cameron Kisailus, Shreshtha Basu, James Cooney, Jana Pavlasek, Odest Chadwicke Jenkins
In this work, we combine the notions of affordance and category-level pose, and introduce the Affordance Coordinate Frame (ACF).
1 code implementation • 3 Oct 2023 • Kaizhi Zheng, Xuehai He, Xin Eric Wang
Multimodal Large Language Models (MLLMs) demonstrate a profound capability in multimodal understanding.