Search Results for author: Daning Cheng

Found 7 papers, 0 papers with code

Can the capability of Large Language Models be described by human ability? A Meta Study

no code implementations13 Apr 2025 Mingrui Zan, Yunquan Zhang, Boyang Zhang, Fangming Liu, Daning Cheng

The evaluation benchmarks are categorized into 6 primary abilities and 11 sub-abilities from a human-ability perspective.

A General Error-Theoretical Analysis Framework for Constructing Compression Strategies

no code implementations19 Feb 2025 Boyang Zhang, Daning Cheng, Yunquan Zhang, Meiqi Tu, Fangmin Liu, Jiake Tian

The exponential growth in parameter size and computational complexity of deep models poses significant challenges for efficient deployment.

Quantization

Compression for Better: A General and Stable Lossless Compression Framework

no code implementations9 Dec 2024 Boyang Zhang, Daning Cheng, Yunquan Zhang, Fangmin Liu, WenGuang Chen

A key challenge is effectively leveraging compression errors and defining the boundaries for lossless compression to minimize model loss.

Computational Efficiency · Model Compression +1
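The entry above describes accepting compression only when the resulting model loss stays within a defined boundary. The sketch below is a hedged reading of that general idea, not the paper's framework: `compress_within_boundary`, the uniform quantizer used as a stand-in compressor, and the tolerance value are all illustrative assumptions, and `evaluate_loss` is assumed to be a user-supplied callable returning a scalar validation loss.

```python
import numpy as np

def quantize_weights(w: np.ndarray, n_bits: int = 8) -> np.ndarray:
    """Uniform symmetric quantization, used here only as a stand-in compressor."""
    scale = np.abs(w).max() / (2 ** (n_bits - 1) - 1) + 1e-12
    return np.round(w / scale) * scale

def compress_within_boundary(weights, evaluate_loss, tolerance=1e-3):
    """Greedily accept a compressed layer only if the loss increase stays inside `tolerance`.

    `weights` is a dict of layer-name -> ndarray; `evaluate_loss(weights)` returns a scalar.
    This loss-change threshold plays the role of a "lossless" boundary in this sketch.
    """
    baseline = evaluate_loss(weights)
    accepted, compressed = dict(weights), []
    for name, w in weights.items():
        trial = dict(accepted)
        trial[name] = quantize_weights(w)
        if evaluate_loss(trial) - baseline <= tolerance:  # still within the boundary
            accepted, compressed = trial, compressed + [name]
    return accepted, compressed
```

The greedy, layer-by-layer acceptance here is a simplification chosen to keep the sketch short; it only illustrates the idea of gating compression on a loss-change boundary.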

Lossless Model Compression via Joint Low-Rank Factorization Optimization

no code implementations9 Dec 2024 Boyang Zhang, Daning Cheng, Yunquan Zhang, Fangmin Liu, Jiake Tian

Low-rank factorization is a popular model compression technique that minimizes the error $\delta$ between approximated and original weight matrices.

Model Compression · Model Optimization
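As a hedged illustration of the technique named in the abstract above (plain truncated SVD, not the paper's joint optimization), the sketch below approximates a weight matrix at a fixed rank and reports the reconstruction error $\delta$. The matrix sizes and rank are arbitrary examples.

```python
import numpy as np

def low_rank_approx(W: np.ndarray, rank: int):
    """Rank-`rank` approximation of W via truncated SVD."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    W_hat = U[:, :rank] @ np.diag(S[:rank]) @ Vt[:rank, :]
    delta = np.linalg.norm(W - W_hat, ord="fro")  # reconstruction error delta
    return W_hat, delta

W = np.random.randn(256, 512)            # toy weight matrix
W_hat, delta = low_rank_approx(W, rank=32)
print(f"rank-32 reconstruction error delta = {delta:.3f}")
```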

FP=xINT: A Low-Bit Series Expansion Algorithm for Post-Training Quantization

no code implementations9 Dec 2024 Boyang Zhang, Daning Cheng, Yunquan Zhang, Fangmin Liu

We introduce a deep model series expansion framework to address this issue, enabling rapid and accurate approximation of unquantized models without calibration sets or fine-tuning.

Quantization
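The abstract above describes expanding a full-precision model into a series of low-bit terms without calibration or fine-tuning. A minimal sketch of one way to read that idea follows, assuming a simple residual expansion in which each INT term quantizes the error left by the previous terms; this is an illustrative interpretation, not the paper's algorithm.

```python
import numpy as np

def quantize_int(x: np.ndarray, n_bits: int = 4) -> np.ndarray:
    """Uniform symmetric quantizer; returns the dequantized low-bit approximation."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.abs(x).max() / qmax + 1e-12
    return np.clip(np.round(x / scale), -qmax, qmax) * scale

def series_expand(w_fp: np.ndarray, n_terms: int = 3, n_bits: int = 4):
    """Approximate an FP tensor as a sum of low-bit terms: FP ~ sum of x*INT."""
    terms, residual = [], w_fp.copy()
    for _ in range(n_terms):
        t = quantize_int(residual, n_bits)
        terms.append(t)
        residual = residual - t  # the next term quantizes what is still missing
    return terms

w = np.random.randn(128, 128).astype(np.float32)
terms = series_expand(w)
rel_err = np.linalg.norm(w - sum(terms)) / np.linalg.norm(w)
print(f"relative error after {len(terms)} low-bit terms: {rel_err:.4f}")
```

Adding terms shrinks the residual, which is the sense in which a series of low-bit tensors can approach the unquantized weights in this sketch.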

Mixed-Precision Inference Quantization: Radically Towards Faster Inference Speed, Lower Storage Requirement, and Lower Loss

no code implementations20 Jul 2022 Daning Cheng, WenGuang Chen

Model quantization, which exploits a model's resilience to computational noise, is important for compressing models and improving computing speed.

Quantization
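As a hedged illustration of turning per-layer noise resilience into a mixed-precision assignment (not the paper's method), the sketch below gives the least sensitive layers the lowest bit-widths. The layer names, sensitivity values, and bit budgets are synthetic placeholders.

```python
import numpy as np

def assign_bits(sensitivities: dict, budgets=(8, 6, 4)) -> dict:
    """Give the most noise-resilient (least sensitive) layers the lowest bit-width."""
    ordered = sorted(sensitivities, key=sensitivities.get)  # most resilient first
    bits = {}
    for i, name in enumerate(ordered):
        # Split layers into len(budgets) groups: resilient -> low bits, sensitive -> high bits.
        group = min(i * len(budgets) // len(ordered), len(budgets) - 1)
        bits[name] = budgets[::-1][group]
    return bits

# Synthetic per-layer loss sensitivity to quantization noise (hypothetical values).
sensitivity = {"conv1": 0.9, "conv2": 0.2, "fc": 0.05}
print(assign_bits(sensitivity))  # {'fc': 4, 'conv2': 6, 'conv1': 8}
```

In practice the sensitivities would come from measuring how the loss reacts to simulated quantization noise in each layer; here they are hard-coded only to keep the sketch self-contained.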

Quantization in Layer's Input is Matter

no code implementations10 Feb 2022 Daning Cheng, WenGuang Chen

In this paper, we show that quantization of a layer's input matters more to the loss function than quantization of its parameters.

Quantization
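A minimal experiment in the spirit of the claim above: compare the output error of a single linear layer when only its input is quantized versus when only its weights are quantized. The layer shape, scaling, and 4-bit quantizer are illustrative assumptions, not the paper's setup, and which error comes out larger depends on the data and weight distributions.

```python
import numpy as np

def quantize(x: np.ndarray, n_bits: int = 4) -> np.ndarray:
    """Uniform symmetric quantizer returning the dequantized tensor."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = np.abs(x).max() / qmax + 1e-12
    return np.clip(np.round(x / scale), -qmax, qmax) * scale

rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512)) * 0.02   # toy weight matrix
x = rng.standard_normal((64, 512))           # toy layer input (batch of activations)

y_ref = x @ W.T                              # full-precision layer output
err_input_q = np.linalg.norm(quantize(x) @ W.T - y_ref)   # quantize the input only
err_weight_q = np.linalg.norm(x @ quantize(W).T - y_ref)  # quantize the weights only

print(f"input-quantization output error : {err_input_q:.3f}")
print(f"weight-quantization output error: {err_weight_q:.3f}")
```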
