no code implementations • 2 Feb 2025 • Ye Mao, Weixun Luo, Junpeng Jing, Anlan Qiu, Krystian Mikolajczyk
The rise of vision-language foundation models marks an advancement in bridging the gap between human and machine capabilities in 3D scene reasoning.
no code implementations • 30 Sep 2024 • Junpeng Jing, Ye Mao, Anlan Qiu, Krystian Mikolajczyk
Regarding datasets, training and benchmarking commonly rely on synthetic object-based and indoor datasets, while natural outdoor scenarios are lacking.