1 code implementation • 13 Jul 2024 • Xiuying Wei, Skander Moalla, Razvan Pascanu, Caglar Gulcehre
State-of-the-art LLMs often rely on scale with high computational costs, which has sparked a research agenda to reduce parameter counts and costs without significantly impacting performance.
1 code implementation • 24 Jun 2024 • Xiuying Wei, Skander Moalla, Razvan Pascanu, Caglar Gulcehre
Additionally, we propose a novel training regime, called self-guided training, aimed at improving the poor training dynamics that these approximations exhibit when used from initialization.
no code implementations • 10 May 2024 • Yunqian Fan, Xiuying Wei, Ruihao Gong, Yuqing Ma, Xiangguo Zhang, Qi Zhang, Xianglong Liu
In this paper, we are the first to investigate semantic sensitivity to post-processing for lane detection, with a novel Lane Distortion Score.
1 code implementation • 9 May 2024 • Ruihao Gong, Yang Yong, Zining Wang, Jinyang Guo, Xiuying Wei, Yuqing Ma, Xianglong Liu
Previous methods for finding sparsity rates mainly focus on the training-aware scenario and usually fail to converge stably under the post-training sparsity (PTS) setting, with its limited data and much lower training cost.
2 code implementations • 12 Oct 2023 • Jing Liu, Ruihao Gong, Xiuying Wei, Zhiwei Dong, Jianfei Cai, Bohan Zhuang
Additionally, an adaptive strategy is designed to autonomously determine the optimal number of sub-channels for channel disassembly.
1 code implementation • 8 Aug 2023 • Yumeng Shi, Shihao Bai, Xiuying Wei, Ruihao Gong, Jianlei Yang
Then, a dedicated differentiable counter is introduced to guide the optimization of lossy compression to arrive at a more suitable point for later lossless compression.
1 code implementation • 18 Apr 2023 • Xiuying Wei, Yunchen Zhang, Yuhang Li, Xiangguo Zhang, Ruihao Gong, Jinyang Guo, Xianglong Liu
The channel-wise shifting aligns the center of each channel for removal of outlier asymmetry.
1 code implementation • ICCV 2023 • Yumeng Shi, Shihao Bai, Xiuying Wei, Ruihao Gong, Jianlei Yang
Then, a dedicated differentiable counter is introduced to guide the optimization of lossy compression to arrive at a more suitable point for later lossless compression.
1 code implementation • 27 Sep 2022 • Xiuying Wei, Yunchen Zhang, Xiangguo Zhang, Ruihao Gong, Shanghang Zhang, Qi Zhang, Fengwei Yu, Xianglong Liu
With the trend toward large NLP models, increasing memory and computation costs hinder their efficient deployment on resource-limited devices.
2 code implementations • 11 Mar 2022 • Xiuying Wei, Ruihao Gong, Yuhang Li, Xianglong Liu, Fengwei Yu
With QDROP, the limit of PTQ is pushed to the 2-bit activation for the first time and the accuracy boost can be up to 51.49%.