no code implementations • 4 Sep 2024 • Xiongkun Linghu, Jiangyong Huang, Xuesong Niu, Xiaojian Ma, Baoxiong Jia, Siyuan Huang
Comprehensive evaluations on MSQA and MSNN highlight the limitations of existing vision-language models and underscore the importance of handling multi-modal interleaved inputs and situation modeling.
1 code implementation • 18 Nov 2023 • Jiangyong Huang, Silong Yong, Xiaojian Ma, Xiongkun Linghu, Puhao Li, Yan Wang, Qing Li, Song-Chun Zhu, Baoxiong Jia, Siyuan Huang
However, several significant challenges remain: (i) most of these models rely on 2D images yet exhibit a limited capacity for 3D input; (ii) these models rarely explore the tasks inherently defined in the 3D world, e.g., 3D grounding, embodied reasoning and acting.
no code implementations • 19 Jul 2022 • Xiongkun Linghu, Yan Bai, Yihang Lou, Shengsen Wu, Jinze Li, Jianzhong He, Tao Bai
Few-Shot Classification (FSC) aims to generalize from base classes to novel classes given very limited labeled samples, which is an important step on the path toward human-like machine learning.
no code implementations • 3 Jul 2022 • Jinze Li, Yan Bai, Yihang Lou, Xiongkun Linghu, Jianzhong He, Shaoyun Xu, Tao Bai
The difficulties are that training on a sequence of limited data from new tasks leads to severe overfitting issues and causes the well-known catastrophic forgetting problem.
class-incremental learning • Few-Shot Class-Incremental Learning
no code implementations • CVPR 2023 • Shengsen Wu, Yan Bai, Yihang Lou, Xiongkun Linghu, Jianzhong He, Ling-Yu Duan
Existing research mainly focuses on the one-to-one compatible paradigm, which is limited in learning compatibility among multiple models.