Search Results for author: Gengze Zhou

Found 3 papers, 2 papers with code

NaVid: Video-based VLM Plans the Next Step for Vision-and-Language Navigation

no code implementations • 24 Feb 2024 • Jiazhao Zhang, Kunyu Wang, Rongtao Xu, Gengze Zhou, Yicong Hong, Xiaomeng Fang, Qi Wu, Zhizheng Zhang, He Wang

Vision-and-Language Navigation (VLN) stands as a key research problem in Embodied AI, aiming to enable agents to navigate unseen environments by following linguistic instructions.

Decision Making · Instruction Following · +3

WebVLN: Vision-and-Language Navigation on Websites

1 code implementation • 25 Dec 2023 • Qi Chen, Dileepa Pitawela, Chongyang Zhao, Gengze Zhou, Hsiang-Ting Chen, Qi Wu

The Vision-and-Language Navigation (VLN) task aims to enable AI agents to accurately understand and follow natural language instructions to navigate through real-world environments, ultimately reaching specific target locations.

Navigate · Vision and Language Navigation

NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models

1 code implementation • 26 May 2023 • Gengze Zhou, Yicong Hong, Qi Wu

Trained with an unprecedented scale of data, large language models (LLMs) like ChatGPT and GPT-4 exhibit emergent reasoning abilities that arise from model scaling.

Instruction Following · Vision and Language Navigation · +1
