no code implementations • 3 Jan 2024 • Chen Tang, Yuan Meng, Jiacheng Jiang, Shuzhao Xie, Rongwei Lu, Xinzhu Ma, Zhi Wang, Wenwu Zhu
Conversely, mixed-precision quantization (MPQ) is advocated to compress the model effectively by allocating heterogeneous bit-width for layers.
no code implementations • 13 Nov 2023 • Rongwei Lu, Yutong Jiang, Yinan Mao, Chen Tang, Bin Chen, Laizhong Cui, Zhi Wang
Assigning varying compression ratios to workers with distinct data distributions and volumes is thus a promising solution.