no code implementations • 9 Sep 2024 • Xingyun Hong, Yan Shao, Zhilin Wang, Manni Duan, Jin Xiongnan
The development of LLMs has greatly enhanced the intelligence and fluency of question answering, while the emergence of retrieval augmentation has enabled models to better utilize external information.
1 code implementation • 8 Jul 2024 • Weiming Li, Manni Duan, Dong An, Yan Shao
The experimental results reveal that the layout understanding ability of LLMs is mainly introduced by the coding data for pretraining, which is further enhanced at the instruction-tuning stage.
no code implementations • 4 Feb 2024 • Yuzhu Wang, Lechao Cheng, Chaowei Fang, Dingwen Zhang, Manni Duan, Meng Wang
Inspired by the observation that the prompt tokens tend to share high mutual information with patch tokens, we propose initializing prompts with downstream token prototypes.
Ranked #1 on Visual Prompt Tuning on VTAB-1k (Structured <8>)
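The prototype-based prompt initialization above can be illustrated with a minimal sketch. This is an assumed reading, not the paper's implementation: it treats "downstream token prototypes" as k-means centroids over patch-token embeddings from the target dataset, and the function name and parameters are hypothetical.

```python
import numpy as np

# Hedged sketch: initialize visual prompt tokens from prototypes of
# downstream patch tokens, here computed as k-means centroids over patch
# embeddings (assumed detail; the paper may derive prototypes differently).
def prototype_prompt_init(patch_tokens, num_prompts, iters=10, seed=0):
    """patch_tokens: (N, D) patch embeddings; returns (num_prompts, D)."""
    rng = np.random.default_rng(seed)
    # Seed centroids from randomly chosen patch tokens.
    idx = rng.choice(len(patch_tokens), num_prompts, replace=False)
    centroids = patch_tokens[idx].astype(float).copy()
    for _ in range(iters):
        # Assign each patch token to its nearest centroid.
        dists = np.linalg.norm(patch_tokens[:, None] - centroids[None], axis=2)
        assign = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned tokens.
        for k in range(num_prompts):
            members = patch_tokens[assign == k]
            if len(members):
                centroids[k] = members.mean(axis=0)
    return centroids
```

The returned centroids would then be used as the initial values of the learnable prompt tokens instead of a random initialization.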
1 code implementation • 26 May 2023 • Yuzhu Wang, Lechao Cheng, Manni Duan, Yongheng Wang, Zunlei Feng, Shu Kong
Finally, we propose a rather simple loss term (dubbed ND loss) to simultaneously (1) encourage the student to produce large-norm features, and (2) align the direction of student features with teacher class-means.
Ranked #1 on Knowledge Distillation on ImageNet
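The two-part objective described above can be sketched as follows. This is a minimal illustration of the stated idea, not the authors' released code: the exact weighting, normalization, and sign conventions of the ND loss are assumptions, and `nd_loss` is a hypothetical name.

```python
import numpy as np

# Hedged sketch of an ND-style distillation loss: (1) a norm term that
# rewards large-norm student features, and (2) a direction term that aligns
# each student feature with the teacher class-mean of its label.
def nd_loss(student_feats, teacher_class_means, labels, norm_weight=1.0):
    """student_feats: (N, D); teacher_class_means: (C, D); labels: (N,)."""
    means = teacher_class_means[labels]  # (N, D) class-mean per sample
    # Direction term: 1 - cosine similarity to the teacher class-mean.
    cos = np.sum(student_feats * means, axis=1) / (
        np.linalg.norm(student_feats, axis=1)
        * np.linalg.norm(means, axis=1) + 1e-8)
    direction_term = np.mean(1.0 - cos)
    # Norm term: negative mean norm, so minimizing the loss grows the norms.
    norm_term = -norm_weight * np.mean(np.linalg.norm(student_feats, axis=1))
    return direction_term + norm_term
```

With this form, a student whose features point along the correct teacher class-means incurs a lower loss than one with the same norms but rotated directions.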
1 code implementation • CVPR 2022 • Xi Chen, Zhiyan Zhao, Yilei Zhang, Manni Duan, Donglian Qi, Hengshuang Zhao
To make the model work with preexisting masks, we formulate a sub-task termed Interactive Mask Correction, and propose Progressive Merge as the solution.
Ranked #2 on Interactive Segmentation on DAVIS (using extra training data)
no code implementations • ICCV 2021 • Xi Chen, Zhiyan Zhao, Feiwu Yu, Yilei Zhang, Manni Duan
In click-based interactive segmentation, the mask extraction process is dictated by positive/negative user clicks; however, most existing methods do not fully exploit the user cues, requiring excessive numbers of clicks for satisfactory results.