Search Results for author: Dongliang Xu

Found 19 papers, 10 papers with code

Simple Radiology VLLM Test-time Scaling with Thought Graph Traversal

1 code implementation • 13 Jun 2025 • Yue Yao, Zelin Wen, Yan Tong, Xinyu Tian, Xuqing Li, Xiao Ma, Dongliang Xu, Tom Gedeon

Test-time scaling offers a promising way to improve the reasoning performance of vision-language large models (VLLMs) without additional training.

Beyond Similarity: A Gradient-based Graph Method for Instruction Tuning Data Selection

no code implementations • 16 Feb 2025 • Yang Zhao, Li Du, Xiao Ding, Yangou Ouyang, Hepeng Wang, Kai Xiong, Jinglong Gao, Zhouhao Sun, Dongliang Xu, Yang Qing, Dongchen Li, Bing Qin, Ting Liu

Large language models (LLMs) have shown great potential across various industries due to their remarkable ability to generalize through instruction tuning.

Domain Adaptation, Transfer Learning

Advancing Large Language Model Attribution through Self-Improving

no code implementations • 17 Oct 2024 • Lei Huang, Xiaocheng Feng, Weitao Ma, Liang Zhao, Yuchun Fan, Weihong Zhong, Dongliang Xu, Qing Yang, Hongtao Liu, Bing Qin

Teaching large language models (LLMs) to generate text with citations to evidence sources can mitigate hallucinations and enhance verifiability in information-seeking systems.

Language Modeling, Language Modelling, +2

Extending Context Window of Large Language Models from a Distributional Perspective

1 code implementation • 2 Oct 2024 • Yingsheng Wu, Yuxuan Gu, Xiaocheng Feng, Weihong Zhong, Dongliang Xu, Qing Yang, Hongtao Liu, Bing Qin

However, existing scaling methods often rely on empirical approaches and lack a profound understanding of the internal distribution within RoPE, resulting in suboptimal performance in extending the context window length.

16k, 8k
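
The snippet above concerns extending a RoPE-based context window. Purely as a loosely related illustration (not the paper's distributional method), the sketch below applies rotary position embeddings with a simple position-interpolation scale factor; the function names and the 8k-to-16k numbers are assumptions for the example.

```python
import numpy as np

def rope_angles(positions, dim, base=10000.0, scale=1.0):
    """Rotary-embedding angles; `scale` > 1 squeezes positions back into the
    range seen during pre-training (simple position interpolation, shown only
    as a generic illustration of context-window scaling)."""
    inv_freq = 1.0 / (base ** (np.arange(0, dim, 2) / dim))
    return np.outer(positions / scale, inv_freq)

def apply_rope(x, angles):
    """Rotate channel pairs of `x` (seq_len, dim) by the given angles."""
    x1, x2 = x[:, 0::2], x[:, 1::2]
    cos, sin = np.cos(angles), np.sin(angles)
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

# Example: map 16k positions into an 8k-trained range by interpolation.
q = np.random.randn(16384, 64)
q_rot = apply_rope(q, rope_angles(np.arange(16384), 64, scale=16384 / 8192))
```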

CFSP: An Efficient Structured Pruning Framework for LLMs with Coarse-to-Fine Activation Information

1 code implementation • 20 Sep 2024 • Yuxin Wang, Minghua Ma, Zekun Wang, Jingchang Chen, Huiming Fan, Liping Shan, Qing Yang, Dongliang Xu, Ming Liu, Bing Qin

To this end, we introduce an efficient structured pruning framework named CFSP, which leverages both Coarse (interblock) and Fine-grained (intrablock) activation information as an importance criterion to guide pruning.

Network Pruning
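
As a rough illustration of using activation statistics as a pruning importance criterion (CFSP's coarse-to-fine inter-block/intra-block formulation differs from this), the sketch below scores output channels by weight norm times mean absolute activation and drops the lowest-scoring ones; the names and the 30% sparsity are assumptions.

```python
import numpy as np

def channel_importance(weight, activations):
    """Score each output channel of a linear layer by combining its weight
    norm with the mean absolute activation it produces on calibration data
    (a generic activation-aware criterion, not CFSP's exact formula)."""
    w_norm = np.linalg.norm(weight, axis=1)      # (out_features,)
    act_mag = np.abs(activations).mean(axis=0)   # (out_features,)
    return w_norm * act_mag

def prune_channels(weight, importance, sparsity=0.3):
    """Drop the lowest-scoring fraction of output channels."""
    keep = int(round(len(importance) * (1 - sparsity)))
    kept_idx = np.sort(np.argsort(importance)[-keep:])
    return weight[kept_idx], kept_idx

# Toy example: a 256 -> 512 linear layer and its calibration activations.
W = np.random.randn(512, 256)
acts = np.random.randn(1024, 512)
W_pruned, kept = prune_channels(W, channel_importance(W, acts))
print(W_pruned.shape)  # (358, 256) at 30% sparsity
```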

Make Some Noise: Unlocking Language Model Parallel Inference Capability through Noisy Training

1 code implementation • 25 Jun 2024 • YiXuan Wang, Xianzhen Luo, Fuxuan Wei, Yijun Liu, Qingfu Zhu, Xuanyu Zhang, Qing Yang, Dongliang Xu, Wanxiang Che

To address this problem, we propose the Make Some Noise (MSN) training framework as a replacement for the supervised fine-tuning stage of the large language model.

Denoising, Language Modeling, +2
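
The snippet gives no training details; purely as a generic illustration of noise-injection training (not the MSN recipe itself), the sketch below corrupts a fraction of input token ids while keeping the clean sequence as the target, so the model learns to predict correct tokens from noisy context. The corruption rate and helper names are assumptions.

```python
import torch

def add_token_noise(input_ids, vocab_size, noise_ratio=0.15):
    """Replace a random fraction of input tokens with random vocabulary ids.
    Generic noise-injection illustration only; the actual MSN objective is
    described in the paper."""
    noisy = input_ids.clone()
    mask = torch.rand_like(input_ids, dtype=torch.float) < noise_ratio
    random_ids = torch.randint_like(input_ids, high=vocab_size)
    noisy[mask] = random_ids[mask]
    return noisy, mask

# Usage: corrupt the inputs but compute the standard next-token loss
# against the clean sequence.
input_ids = torch.randint(0, 32000, (4, 128))
noisy_ids, noise_mask = add_token_noise(input_ids, vocab_size=32000)
labels = input_ids
```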

MoGU: A Framework for Enhancing Safety of Open-Sourced LLMs While Preserving Their Usability

1 code implementation • 23 May 2024 • Yanrui Du, Sendong Zhao, Danyang Zhao, Ming Ma, Yuhan Chen, Liangyu Huo, Qing Yang, Dongliang Xu, Bing Qin

When encountering malicious instructions, the router will assign a higher weight to the safe LLM to ensure that responses are harmless.
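
As a minimal sketch of the routing idea in the snippet above, assuming the router emits a scalar weight per instruction and the two models' logits are mixed linearly (the module names are illustrative, not MoGU's actual implementation):

```python
import torch
import torch.nn as nn

class SafetyRouter(nn.Module):
    """Illustrative router that mixes a 'usable' and a 'safe' variant of an
    LLM with a learned, instruction-dependent weight."""
    def __init__(self, hidden_size):
        super().__init__()
        self.gate = nn.Linear(hidden_size, 1)

    def forward(self, instruction_repr, usable_logits, safe_logits):
        # A higher weight shifts the response toward the safety-tuned model.
        w = torch.sigmoid(self.gate(instruction_repr)).unsqueeze(-1)  # (batch, 1, 1)
        return w * safe_logits + (1.0 - w) * usable_logits

# Toy usage with random tensors standing in for real model outputs.
router = SafetyRouter(hidden_size=768)
instruction_repr = torch.randn(2, 768)
mixed = router(instruction_repr, torch.randn(2, 16, 32000), torch.randn(2, 16, 32000))
```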

Meaningful Learning: Enhancing Abstract Reasoning in Large Language Models via Generic Fact Guidance

1 code implementation • 14 Mar 2024 • Kai Xiong, Xiao Ding, Ting Liu, Bing Qin, Dongliang Xu, Qing Yang, Hongtao Liu, Yixin Cao

The results show that our approach not only boosts the general reasoning performance of LLMs but also makes considerable strides towards their capacity for abstract reasoning, moving beyond simple memorization or imitation to a more nuanced understanding and application of generic facts.

Memorization

How does Architecture Influence the Base Capabilities of Pre-trained Language Models? A Case Study Based on FFN-Wider and MoE Transformers

no code implementations • 4 Mar 2024 • Xin Lu, Yanyan Zhao, Bing Qin, Liangyu Huo, Qing Yang, Dongliang Xu

Through analysis, we found that the contribution ratio of Multi-Head Attention (a combination function) to pre-trained language modeling is a key factor affecting base capabilities.

Few-Shot Learning, Language Modeling, +3

Semi-Instruct: Bridging Natural-Instruct and Self-Instruct for Code Large Language Models

no code implementations • 1 Mar 2024 • Xianzhen Luo, Qingfu Zhu, Zhiming Zhang, Xu Wang, Qing Yang, Dongliang Xu, Wanxiang Che

Presently, two dominant paradigms for collecting tuning data are natural-instruct (human-written) and self-instruct (automatically generated).

Diversity, Program Synthesis

Python is Not Always the Best Choice: Embracing Multilingual Program of Thoughts

1 code implementation • 16 Feb 2024 • Xianzhen Luo, Qingfu Zhu, Zhiming Zhang, Libo Qin, Xuanyu Zhang, Qing Yang, Dongliang Xu, Wanxiang Che

In this paper, we conduct comprehensive experiments on the programming languages used in PoT and find that no single language consistently delivers optimal performance across all tasks and models.

All, Diversity

SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language Models

no code implementations • 16 Jan 2024 • Weixiang Zhao, Shilong Wang, Yulin Hu, Yanyan Zhao, Bing Qin, Xuanyu Zhang, Qing Yang, Dongliang Xu, Wanxiang Che

Existing methods devise a learning module to acquire task-specific knowledge with a parameter-efficient tuning (PET) block and a selection module to pick out the corresponding block for the test input, aiming to handle the challenges of catastrophic forgetting and knowledge transfer in CL.

Continual Learning, Transfer Learning
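
As a generic illustration of the learn-then-select pattern the snippet describes, the sketch below softly attends over a pool of small bottleneck adapters; the pool size, bottleneck width, and names are assumptions, and this is not SAPT's shared-attention design.

```python
import torch
import torch.nn as nn

class AdapterPool(nn.Module):
    """Soft selection over a pool of task-specific bottleneck adapters:
    a selector scores the adapters per input and their outputs are combined
    as a residual update."""
    def __init__(self, hidden_size, num_tasks, bottleneck=16):
        super().__init__()
        self.adapters = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden_size, bottleneck),
                          nn.ReLU(),
                          nn.Linear(bottleneck, hidden_size))
            for _ in range(num_tasks)
        )
        self.selector = nn.Linear(hidden_size, num_tasks)

    def forward(self, hidden):  # hidden: (batch, hidden_size)
        weights = torch.softmax(self.selector(hidden), dim=-1)          # (batch, num_tasks)
        outputs = torch.stack([a(hidden) for a in self.adapters], dim=1)  # (batch, num_tasks, hidden)
        return hidden + (weights.unsqueeze(-1) * outputs).sum(dim=1)

pool = AdapterPool(hidden_size=768, num_tasks=4)
out = pool(torch.randn(8, 768))  # (8, 768)
```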

SmartTrim: Adaptive Tokens and Attention Pruning for Efficient Vision-Language Models

no code implementations • 24 May 2023 • Zekun Wang, Jingchang Chen, Wangchunshu Zhou, Haichao Zhu, Jiafeng Liang, Liping Shan, Ming Liu, Dongliang Xu, Qing Yang, Bing Qin

Despite achieving remarkable performance on various vision-language tasks, Transformer-based Vision-Language Models (VLMs) suffer from redundancy in inputs and parameters, significantly hampering their efficiency in real-world applications.

Data Augmentation

XuanYuan 2.0: A Large Chinese Financial Chat Model with Hundreds of Billions Parameters

1 code implementation • 19 May 2023 • Xuanyu Zhang, Qing Yang, Dongliang Xu

In recent years, pre-trained language models have undergone rapid development with the emergence of large-scale models.

Verification Code Recognition Based on Active and Deep Learning

no code implementations • 12 Feb 2019 • Dongliang Xu, Bailing Wang, XiaoJiang Du, Xiaoyan Zhu, Zhitao Guan, Xiaoyan Yu, Jingyu Liu

However, the advantages of convolutional neural networks depend on the data used by the training classifier, particularly the size of the training set.

Deep Learning
