1 code implementation • 13 Jun 2025 • Yue Yao, Zelin Wen, Yan Tong, Xinyu Tian, Xuqing Li, Xiao Ma, Dongliang Xu, Tom Gedeon
Test-time scaling offers a promising way to improve the reasoning performance of vision-language large models (VLLMs) without additional training.
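As a rough illustration of one common test-time scaling strategy (best-of-N sampling with self-consistency voting, not necessarily the method studied in this paper), the sketch below samples several completions and majority-votes on the extracted answers; `generate` and `extract_answer` are hypothetical helpers.

```python
from collections import Counter

def extract_answer(completion: str) -> str:
    # Toy answer extraction: take the last whitespace-separated token.
    return completion.strip().split()[-1]

def best_of_n_answer(generate, prompt: str, n: int = 8) -> str:
    """Self-consistency style test-time scaling: sample n completions
    and majority-vote on the extracted final answers.
    `generate` is any callable mapping a prompt to a completion."""
    answers = [extract_answer(generate(prompt)) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]
```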
no code implementations • 16 Feb 2025 • Yang Zhao, Li Du, Xiao Ding, Yangou Ouyang, Hepeng Wang, Kai Xiong, Jinglong Gao, Zhouhao Sun, Dongliang Xu, Yang Qing, Dongchen Li, Bing Qin, Ting Liu
Large language models (LLMs) have shown great potential across various industries due to their remarkable ability to generalize through instruction tuning.
no code implementations • 17 Oct 2024 • Lei Huang, Xiaocheng Feng, Weitao Ma, Liang Zhao, Yuchun Fan, Weihong Zhong, Dongliang Xu, Qing Yang, Hongtao Liu, Bing Qin
Teaching large language models (LLMs) to generate text with citations to evidence sources can mitigate hallucinations and enhance verifiability in information-seeking systems.
1 code implementation • 5 Oct 2024 • Yangfan Ye, Xiachong Feng, Xiaocheng Feng, Weitao Ma, Libo Qin, Dongliang Xu, Qing Yang, Hongtao Liu, Bing Qin
News summarization in today's global media landscape can be daunting, given the flood of multilingual content and the varied viewpoints of different sources.
1 code implementation • 2 Oct 2024 • Yingsheng Wu, Yuxuan Gu, Xiaocheng Feng, Weihong Zhong, Dongliang Xu, Qing Yang, Hongtao Liu, Bing Qin
However, existing scaling methods often rely on empirical heuristics and lack a deeper understanding of the internal distribution within RoPE, resulting in suboptimal performance when extending the context window length.
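For context, the sketch below shows the standard RoPE angle computation with a simple position-interpolation-style scaling factor, one common empirical way to stretch the context window; it is not the scaling method proposed in this paper.

```python
import numpy as np

def rope_angles(positions, dim, base=10000.0, scale=1.0):
    """Rotary position embedding angles with a simple scaling factor.
    scale > 1 compresses positions (position interpolation), a common
    empirical trick for extending the context window."""
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))   # (dim/2,)
    pos = np.asarray(positions, dtype=np.float64) / scale      # rescaled positions
    return np.outer(pos, inv_freq)                             # (len(pos), dim/2)

def apply_rope(x, angles):
    """Rotate channel pairs of a float array x of shape (seq, dim)."""
    x1, x2 = x[:, 0::2], x[:, 1::2]
    cos, sin = np.cos(angles), np.sin(angles)
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```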
1 code implementation • 20 Sep 2024 • Yuxin Wang, Minghua Ma, Zekun Wang, Jingchang Chen, Huiming Fan, Liping Shan, Qing Yang, Dongliang Xu, Ming Liu, Bing Qin
To this end, we introduce an efficient structured pruning framework named CFSP, which leverages both Coarse (inter-block) and Fine-grained (intra-block) activation information as an importance criterion to guide pruning.
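A minimal sketch of what activation-based importance at the two granularities could look like, assuming block outputs are collected on a small calibration set; the exact CFSP criterion differs and this is illustrative only.

```python
import torch

def activation_importance(block_outputs):
    """Toy activation-based importance scores for structured pruning.
    block_outputs: dict {block_name: tensor of shape (tokens, hidden)}
    collected on a small calibration set."""
    coarse, fine = {}, {}
    for name, acts in block_outputs.items():
        # Coarse (inter-block): one score per block, its mean activation norm.
        coarse[name] = acts.norm(dim=-1).mean().item()
        # Fine (intra-block): one score per channel, its mean absolute activation.
        fine[name] = acts.abs().mean(dim=0)
    return coarse, fine
```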
4 code implementations • 16 Aug 2024 • Xianzhen Luo, YiXuan Wang, Qingfu Zhu, Zhiming Zhang, Xuanyu Zhang, Qing Yang, Dongliang Xu
New candidate tokens from the decoding process are then used to update the matrix.
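A rough sketch of maintaining such a candidate matrix: each token keeps a short list of recently observed successor candidates, which can be chained into a draft for speculative decoding. Tree drafting and verification are omitted, and the class below is an assumption-laden illustration rather than the paper's implementation.

```python
from collections import defaultdict, deque

class CandidateMatrix:
    """For each token id, keep the k most recent candidate successors
    observed during decoding; a short draft is built by chaining lookups."""
    def __init__(self, k: int = 8):
        self.k = k
        self.table = defaultdict(lambda: deque(maxlen=self.k))

    def update(self, prev_token: int, candidates: list[int]) -> None:
        # Record new candidate continuations observed at this decoding step.
        self.table[prev_token].extend(candidates)

    def draft(self, last_token: int, length: int = 4) -> list[int]:
        # Greedily chain the most recent candidate to form a short draft.
        out, tok = [], last_token
        for _ in range(length):
            if not self.table[tok]:
                break
            tok = self.table[tok][-1]
            out.append(tok)
        return out
```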
1 code implementation • 25 Jun 2024 • YiXuan Wang, Xianzhen Luo, Fuxuan Wei, Yijun Liu, Qingfu Zhu, Xuanyu Zhang, Qing Yang, Dongliang Xu, Wanxiang Che
To address this problem, we propose the Make Some Noise (MSN) training framework as a replacement for the supervised fine-tuning stage of the large language model.
1 code implementation • 23 May 2024 • Yanrui Du, Sendong Zhao, Danyang Zhao, Ming Ma, Yuhan Chen, Liangyu Huo, Qing Yang, Dongliang Xu, Bing Qin
When encountering malicious instructions, the router will assign a higher weight to the safe LLM to ensure that responses are harmless.
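A minimal sketch of this kind of routing, assuming the router produces a maliciousness score in [0, 1] and the two models' next-token logits are mixed accordingly; the paper's actual routing mechanism may differ.

```python
def routed_next_token_logits(router_score, base_logits, safe_logits):
    """Illustrative routing between a base LLM and a safety-aligned LLM.
    router_score is the router's estimate (in [0, 1]) that the instruction
    is malicious; higher scores shift weight toward the safe model.
    Logits can be numpy arrays or torch tensors of the same shape."""
    w_safe = router_score          # weight on the safe LLM
    w_base = 1.0 - router_score    # weight on the base LLM
    return w_base * base_logits + w_safe * safe_logits
```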
1 code implementation • 14 Mar 2024 • Kai Xiong, Xiao Ding, Ting Liu, Bing Qin, Dongliang Xu, Qing Yang, Hongtao Liu, Yixin Cao
The results show that our approach not only boosts the general reasoning performance of LLMs but also considerably improves their capacity for abstract reasoning, moving beyond simple memorization or imitation to a more nuanced understanding and application of generic facts.
no code implementations • 4 Mar 2024 • Xin Lu, Yanyan Zhao, Bing Qin, Liangyu Huo, Qing Yang, Dongliang Xu
Through analysis, we found that the contribution ratio of Multi-Head Attention (a combination function) to pre-trained language modeling is a key factor affecting base capabilities.
no code implementations • 1 Mar 2024 • Xianzhen Luo, Qingfu Zhu, Zhiming Zhang, Xu Wang, Qing Yang, Dongliang Xu, Wanxiang Che
Presently, two dominant paradigms for collecting tuning data are natural-instruct (human-written) and self-instruct (automatically generated).
1 code implementation • 16 Feb 2024 • Xianzhen Luo, Qingfu Zhu, Zhiming Zhang, Libo Qin, Xuanyu Zhang, Qing Yang, Dongliang Xu, Wanxiang Che
In this paper, we conduct comprehensive experiments on the programming languages used in PoT and find that no single language consistently delivers optimal performance across all tasks and models.
no code implementations • 16 Jan 2024 • Weixiang Zhao, Shilong Wang, Yulin Hu, Yanyan Zhao, Bing Qin, Xuanyu Zhang, Qing Yang, Dongliang Xu, Wanxiang Che
Existing methods devise a learning module that acquires task-specific knowledge with a parameter-efficient tuning (PET) block and a selection module that picks out the corresponding block for the test input, aiming to handle the challenges of catastrophic forgetting and knowledge transfer in CL.
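A minimal sketch of a prototype-based selection module over per-task PET blocks, assuming each task stores a mean embedding of its training inputs; the paper's learning and selection modules may differ in detail.

```python
import numpy as np

class AdapterSelector:
    """Each task registers a PET block (e.g., LoRA weights) plus a prototype
    embedding; at test time, the adapter whose prototype is closest to the
    input embedding is selected."""
    def __init__(self):
        self.prototypes = {}   # task_id -> mean training-input embedding
        self.adapters = {}     # task_id -> PET block

    def register_task(self, task_id, train_embeddings, adapter):
        self.prototypes[task_id] = np.mean(train_embeddings, axis=0)
        self.adapters[task_id] = adapter

    def select(self, input_embedding):
        # Pick the task whose prototype has the highest cosine similarity.
        def cos(a, b):
            return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))
        best = max(self.prototypes, key=lambda t: cos(input_embedding, self.prototypes[t]))
        return self.adapters[best]
```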
no code implementations • 28 Dec 2023 • Liang Zhao, Xiachong Feng, Xiaocheng Feng, Weihong Zhong, Dongliang Xu, Qing Yang, Hongtao Liu, Bing Qin, Ting Liu
Built upon the Transformer, large language models (LLMs) have captured worldwide attention due to their remarkable abilities.
no code implementations • 24 May 2023 • Zekun Wang, Jingchang Chen, Wangchunshu Zhou, Haichao Zhu, Jiafeng Liang, Liping Shan, Ming Liu, Dongliang Xu, Qing Yang, Bing Qin
Despite achieving remarkable performance on various vision-language tasks, Transformer-based Vision-Language Models (VLMs) suffer from redundancy in inputs and parameters, significantly hampering their efficiency in real-world applications.
1 code implementation • 19 May 2023 • Xuanyu Zhang, Qing Yang, Dongliang Xu
In recent years, pre-trained language models have undergone rapid development with the emergence of large-scale models.
no code implementations • 18 Apr 2022 • Xuanyu Zhang, Qing Yang, Dongliang Xu
Knowledge graph embedding (KGE) aims to learn continuous vector representations of the relations and entities in a knowledge graph (a standard scoring example is sketched after this entry).
Ranked #8 on Link Property Prediction on ogbl-wikikg2
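For reference, the classic TransE scoring function below illustrates what such continuous vectors are used for; it is a standard baseline, not the model proposed in this paper.

```python
import numpy as np

def transe_score(head, relation, tail):
    """TransE, a classic KGE baseline: a triple (h, r, t) is plausible
    when h + r is close to t, so the score is the negative distance."""
    return -np.linalg.norm(head + relation - tail)

# Toy usage with random 50-dimensional embeddings.
rng = np.random.default_rng(0)
h, r, t = rng.normal(size=(3, 50))
print(transe_score(h, r, t))
```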
no code implementations • 12 Feb 2019 • Dongliang Xu, Bailing Wang, XiaoJiang Du, Xiaoyan Zhu, Zhitao Guan, Xiaoyan Yu, Jingyu Liu
However, the advantages of convolutional neural networks depend on the data used to train the classifier, particularly the size of the training set.