no code implementations • 19 Feb 2025 • Boyang Zhang, Daning Cheng, Yunquan Zhang, Meiqi Tu, Fangmin Liu, Jiake Tian
The exponential growth in parameter size and computational complexity of deep models poses significant challenges for efficient deployment.
no code implementations • 9 Dec 2024 • Boyang Zhang, Daning Cheng, Yunquan Zhang, Fangmin Liu, Jiake Tian
Low-rank factorization is a popular model compression technique that minimizes the error $\delta$ between approximated and original weight matrices.
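As a rough illustration of the idea in this abstract, the sketch below factorizes a weight matrix with a truncated SVD and measures the resulting error. It assumes $\delta$ is the Frobenius norm of the difference between the original and approximated matrices, which the snippet does not specify. The helper `low_rank_factorize`, the matrix shape, and the chosen rank are hypothetical, and truncated SVD is only one standard way to obtain a low-rank approximation, not necessarily the method used in the paper.

```python
import numpy as np


def low_rank_factorize(W, rank):
    """Hypothetical helper: split W (m x n) into A (m x rank) and B (rank x n).

    Uses a truncated SVD, which gives the best rank-`rank` approximation
    of W in the Frobenius norm (Eckart-Young theorem).
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # scale the kept left singular vectors
    B = Vt[:rank, :]            # kept right singular vectors
    return A, B


# Illustrative random "weight matrix"; real layer weights would go here.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))

A, B = low_rank_factorize(W, rank=8)

# Assumption: delta is the Frobenius norm of the approximation error.
delta = np.linalg.norm(W - A @ B)

# Parameter count of the two factors versus the original matrix.
stored = A.size + B.size
print(f"delta = {delta:.4f}, parameters: {W.size} -> {stored}")
```

Storing the two factors takes `rank * (m + n)` values instead of `m * n`, so a smaller rank gives more compression at the cost of a larger $\delta$.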