2 code implementations • 10 Oct 2024 • Linhui Xiao, Xiaoshan Yang, Fang Peng, YaoWei Wang, Changsheng Xu
Simultaneously, the current mask visual language modeling (MVLM) fails to capture the nuanced referential relationship between image-text in referring tasks.
1 code implementation • 19 Jun 2024 • Qiang Hu, Zhenyu Yi, Ying Zhou, Fang Peng, Mei Liu, Qiang Li, Zhiwei Wang
In this context, we focus on robust video polyp segmentation by enhancing the adjacent feature consistency and rebuilding the reliable polyp representation.
Ranked #3 on
Video Polyp Segmentation
on SUN-SEG-Easy (Unseen)
1 code implementation • 20 Apr 2024 • Linhui Xiao, Xiaoshan Yang, Fang Peng, YaoWei Wang, Changsheng Xu
The cross-modal bridge can address the inconsistency between visual features and those required for grounding, and establish a connection between multi-level visual and text features.
2 code implementations • 15 May 2023 • Linhui Xiao, Xiaoshan Yang, Fang Peng, Ming Yan, YaoWei Wang, Changsheng Xu
In order to utilize vision and language pre-trained models to address the grounding problem, and reasonably take advantage of pseudo-labels, we propose CLIP-VG, a novel method that can conduct self-paced curriculum adapting of CLIP with pseudo-language labels.
1 code implementation • 28 Nov 2022 • Fang Peng, Xiaoshan Yang, Linhui Xiao, YaoWei Wang, Changsheng Xu
Although significant progress has been made in few-shot learning, most of existing few-shot image classification methods require supervised pre-training on a large amount of samples of base classes, which limits their generalization ability in real world application.
4 code implementations • 26 Apr 2021 • Wei Zeng, Xiaozhe Ren, Teng Su, Hui Wang, Yi Liao, Zhiwei Wang, Xin Jiang, ZhenZhang Yang, Kaisheng Wang, Xiaoda Zhang, Chen Li, Ziyan Gong, Yifan Yao, Xinjing Huang, Jun Wang, Jianfeng Yu, Qi Guo, Yue Yu, Yan Zhang, Jin Wang, Hengtao Tao, Dasen Yan, Zexuan Yi, Fang Peng, Fangqing Jiang, Han Zhang, Lingfeng Deng, Yehong Zhang, Zhe Lin, Chao Zhang, Shaojie Zhang, Mingyue Guo, Shanzhi Gu, Gaojun Fan, YaoWei Wang, Xuefeng Jin, Qun Liu, Yonghong Tian
To enhance the generalization ability of PanGu-$\alpha$, we collect 1. 1TB high-quality Chinese data from a wide range of domains to pretrain the model.
Ranked #1 on
Reading Comprehension (One-Shot)
on DuReader
Cloze (multi-choices) (Few-Shot)
Cloze (multi-choices) (One-Shot)
+19