no code implementations • 13 Apr 2025 • Mingrui Zan, Yunquan Zhang, Boyang Zhang, Fangming Liu, Daning Cheng
The evaluation benchmarks are categorized into 6 primary abilities and 11 sub-abilities, modeled on human abilities.
no code implementations • 19 Feb 2025 • Boyang Zhang, Daning Cheng, Yunquan Zhang, Meiqi Tu, Fangmin Liu, Jiake Tian
The exponential growth in parameter size and computational complexity of deep models poses significant challenges for efficient deployment.
no code implementations • 9 Dec 2024 • Boyang Zhang, Daning Cheng, Yunquan Zhang, Fangmin Liu, WenGuang Chen
A key challenge is effectively leveraging compression errors and defining the boundaries for lossless compression to minimize model loss.
no code implementations • 9 Dec 2024 • Boyang Zhang, Daning Cheng, Yunquan Zhang, Fangmin Liu, Jiake Tian
Low-rank factorization is a popular model compression technique that minimizes the error $\delta$ between approximated and original weight matrices.
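For intuition, here is a minimal NumPy sketch of the generic technique (not this paper's specific method): truncating an SVD yields the best rank-$k$ approximation $W_k$ of a weight matrix $W$ in Frobenius norm, and $\delta = \|W - W_k\|_F$ is the error being minimized. The matrix shape and target rank are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of rank-k weight factorization via truncated SVD.
# W_k = U_k @ diag(S_k) @ V_k^T approximates W; delta measures the error.
# Shapes and rank are illustrative assumptions, not the paper's setup.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 512))          # hypothetical weight matrix

k = 64                                       # target rank (assumption)
U, S, Vt = np.linalg.svd(W, full_matrices=False)
W_k = (U[:, :k] * S[:k]) @ Vt[:k]            # best rank-k approximation (Eckart-Young)

delta = np.linalg.norm(W - W_k, "fro")       # approximation error ||W - W_k||_F
print(f"rank-{k} error delta = {delta:.4f}")
```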
no code implementations • 9 Dec 2024 • Boyang Zhang, Daning Cheng, Yunquan Zhang, Fangmin Liu
We introduce a deep model series expansion framework to address this issue, enabling rapid and accurate approximation of unquantized models without calibration sets or fine-tuning.
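One hedged reading of a series expansion over quantized terms: quantize the weights, then quantize the residual, and so on, so the partial sums converge toward the unquantized original without calibration data. The uniform quantizer, bit width, and term count below are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

# Sketch of a residual series expansion: W ~= Q1 + Q2 + Q3, where each
# term quantizes what the previous partial sum missed. The error shrinks
# with each added term. Quantizer and term count are assumptions.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))            # full-precision weights

def quantize(a, bits=4):
    """Uniform symmetric quantization to the given bit width."""
    scale = np.abs(a).max() / (2 ** (bits - 1) - 1)
    return np.round(a / scale) * scale

terms, residual = [], W
for _ in range(3):                           # three expansion terms (assumption)
    q = quantize(residual)
    terms.append(q)
    residual = residual - q
    approx = sum(terms)
    print(f"terms={len(terms)}  error={np.linalg.norm(W - approx):.5f}")
```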
no code implementations • 20 Jul 2022 • Daning Cheng, WenGuang Chen
Model quantization, which exploits a model's resilience to computational noise, is important for compressing models and improving computing speed.
no code implementations • 10 Feb 2022 • Daning Cheng, WenGuang Chen
In this paper, we show that quantizing a layer's inputs affects the loss function more than quantizing its parameters.
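As a toy harness for this comparison (not the paper's experiment), one can quantize only the inputs or only the parameters of a linear layer and measure the resulting loss; the data, shapes, and uniform quantizer here are assumptions.

```python
import numpy as np

# Compare how quantizing a layer's inputs vs. its parameters perturbs
# a simple MSE loss. Purely illustrative: random data, a single linear
# layer, and a uniform symmetric quantizer (all assumptions).
rng = np.random.default_rng(0)
X = rng.standard_normal((1024, 64))          # layer inputs
W = rng.standard_normal((64, 32))            # layer parameters
Y = X @ W                                    # reference (unquantized) outputs

def quantize(a, bits=4):
    """Uniform symmetric quantization to the given bit width."""
    scale = np.abs(a).max() / (2 ** (bits - 1) - 1)
    return np.round(a / scale) * scale

loss_input  = np.mean((quantize(X) @ W - Y) ** 2)   # quantize inputs only
loss_weight = np.mean((X @ quantize(W) - Y) ** 2)   # quantize weights only
print(f"input-quantization loss:  {loss_input:.6f}")
print(f"weight-quantization loss: {loss_weight:.6f}")
```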