Search Results for author: Xiuying Wei

Found 6 papers, 6 papers with code

QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models

2 code implementations · 12 Oct 2023 · Jing Liu, Ruihao Gong, Xiuying Wei, Zhiwei Dong, Jianfei Cai, Bohan Zhuang

Additionally, an adaptive strategy is designed to autonomously determine the optimal number of sub-channels for channel disassembly.

Quantization
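The channel disassembly mentioned in the QLLM snippet above can be sketched roughly: an outlier input channel is split into T sub-channels, each carrying 1/T of the activation, and the matching weight row is duplicated, so the layer output is unchanged while the per-channel activation range shrinks. This is a minimal NumPy illustration of that idea; the function name, shapes, and channel choice are my own assumptions, not the paper's released code:

```python
import numpy as np

def disassemble_channel(X, W, ch, T):
    """Split input channel `ch` into T sub-channels carrying x/T each,
    duplicating the matching weight row so X2 @ W2 == X @ W exactly,
    while the activation magnitude of that channel shrinks by a factor of T.
    (Hypothetical helper for illustration, not QLLM's implementation.)"""
    # activations: each sub-channel gets 1/T of the original values
    sub = np.repeat(X[:, ch:ch + 1] / T, T, axis=1)
    X2 = np.concatenate([np.delete(X, ch, axis=1), sub], axis=1)
    # weights: duplicate the row belonging to the disassembled channel
    dup = np.repeat(W[ch:ch + 1, :], T, axis=0)
    W2 = np.concatenate([np.delete(W, ch, axis=0), dup], axis=0)
    return X2, W2

X = np.array([[1.0, 100.0], [2.0, -80.0]])   # channel 1 is an outlier
W = np.random.randn(2, 3)
X2, W2 = disassemble_channel(X, W, ch=1, T=4)
assert np.allclose(X2 @ W2, X @ W)           # output is preserved exactly
print(np.abs(X2).max())                      # 25.0 instead of 100.0
```

The paper's adaptive strategy then chooses T per channel automatically; the sketch fixes T by hand only to show the equivalence.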

Lossy and Lossless (L$^2$) Post-training Model Size Compression

1 code implementation · ICCV 2023 (8 Aug 2023) · Yumeng Shi, Shihao Bai, Xiuying Wei, Ruihao Gong, Jianlei Yang

Then, a dedicated differentiable counter is introduced to guide the optimization of lossy compression to arrive at a more suitable point for later lossless compression.

Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models

1 code implementation · 27 Sep 2022 · Xiuying Wei, Yunchen Zhang, Xiangguo Zhang, Ruihao Gong, Shanghang Zhang, Qi Zhang, Fengwei Yu, Xianglong Liu

With the trend toward large NLP models, increasing memory and computation costs hinder their efficient deployment on resource-limited devices.

Quantization

QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization

2 code implementations · 11 Mar 2022 · Xiuying Wei, Ruihao Gong, Yuhang Li, Xianglong Liu, Fengwei Yu

With QDrop, the limit of PTQ is pushed to 2-bit activations for the first time, and the accuracy boost can be up to 51.49%.

Image Classification · Object Detection · +5
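The mechanism named in the QDrop snippet above — randomly letting some activations bypass quantization during post-training calibration, so the optimization sees a mix of quantized and full-precision values — can be sketched as follows. This is a simplified illustration with uniform fake quantization; the function names and the per-element drop granularity are assumptions, not the released code:

```python
import numpy as np

def fake_quant(x, n_bits=2):
    """Uniform asymmetric fake quantization: quantize, then dequantize."""
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / (2 ** n_bits - 1) or 1.0  # guard against a flat tensor
    q = np.round((x - lo) / scale)
    return q * scale + lo

def qdrop(x, p=0.5, n_bits=2, rng=None):
    """During calibration, each element bypasses quantization with
    probability p, yielding a mix of quantized and full-precision values.
    (Hypothetical sketch of the dropping idea, not QDrop's implementation.)"""
    rng = rng or np.random.default_rng(0)
    keep_fp = rng.random(x.shape) < p
    return np.where(keep_fp, x, fake_quant(x, n_bits))

x = np.linspace(-1.0, 1.0, 16).reshape(4, 4)
y = qdrop(x, p=0.5, n_bits=2)   # every element is either x or its 2-bit quantized value
```

At inference time all activations are quantized as usual; the random dropping happens only while calibrating, which is what flattens the loss landscape the paper exploits.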
