Search Results for author: Binhao Wu

Found 4 papers, 4 papers with code

EmbSpatial-Bench: Benchmarking Spatial Understanding for Embodied Tasks with Large Vision-Language Models

1 code implementation • 9 Jun 2024 • Mengfei Du, Binhao Wu, Zejun Li, Xuanjing Huang, Zhongyu Wei

The recent rapid development of Large Vision-Language Models (LVLMs) has indicated their potential for embodied tasks. However, the critical skill of spatial understanding in embodied environments has not been thoroughly evaluated, leaving the gap between current LVLMs and qualified embodied intelligence unknown.

Benchmarking

DELAN: Dual-Level Alignment for Vision-and-Language Navigation by Cross-Modal Contrastive Learning

1 code implementation • 2 Apr 2024 • Mengfei Du, Binhao Wu, Jiwen Zhang, Zhihao Fan, Zejun Li, Ruipu Luo, Xuanjing Huang, Zhongyu Wei

For task completion, the agent needs to align and integrate various navigation modalities, including instruction, observation and navigation history.

Contrastive Learning, Decision Making
