Search Results for author: Xiuying Wei

Found 6 papers, 6 papers with code

QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models

2 code implementations · 12 Oct 2023 · Jing Liu, Ruihao Gong, Xiuying Wei, Zhiwei Dong, Jianfei Cai, Bohan Zhuang

Additionally, an adaptive strategy is designed to autonomously determine the optimal number of sub-channels for channel disassembly.

Quantization
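The channel disassembly mentioned in the QLLM snippet above can be sketched roughly: an outlier input channel is split into T sub-channels, each carrying 1/T of the activation, and the matching weight row is duplicated, so the layer output is unchanged while the per-channel activation range shrinks. This is a minimal NumPy illustration of that idea; the function name, shapes, and channel choice are my own assumptions, not the paper's released code:

```python
import numpy as np

def disassemble_channel(X, W, ch, T):
    """Split input channel `ch` into T sub-channels carrying x/T each,
    duplicating the matching weight row so X2 @ W2 == X @ W exactly,
    while the activation magnitude of that channel shrinks by a factor of T.
    (Hypothetical helper for illustration, not QLLM's implementation.)"""
    # activations: each sub-channel gets 1/T of the original values
    sub = np.repeat(X[:, ch:ch + 1] / T, T, axis=1)
    X2 = np.concatenate([np.delete(X, ch, axis=1), sub], axis=1)
    # weights: duplicate the row belonging to the disassembled channel
    dup = np.repeat(W[ch:ch + 1, :], T, axis=0)
    W2 = np.concatenate([np.delete(W, ch, axis=0), dup], axis=0)
    return X2, W2

X = np.array([[1.0, 100.0], [2.0, -80.0]])   # channel 1 is an outlier
W = np.random.randn(2, 3)
X2, W2 = disassemble_channel(X, W, ch=1, T=4)
assert np.allclose(X2 @ W2, X @ W)           # output is preserved exactly
print(np.abs(X2).max())                      # 25.0 instead of 100.0
```

The paper's adaptive strategy then chooses T per channel automatically; the sketch fixes T by hand only to show the equivalence.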

Lossy and Lossless (L$^2$) Post-training Model Size Compression

1 code implementation · ICCV 2023 (8 Aug 2023) · Yumeng Shi, Shihao Bai, Xiuying Wei, Ruihao Gong, Jianlei Yang

Then, a dedicated differentiable counter is introduced to guide the optimization of lossy compression to arrive at a more suitable point for later lossless compression.

Outlier Suppression: Pushing the Limit of Low-bit Transformer Language Models

1 code implementation · 27 Sep 2022 · Xiuying Wei, Yunchen Zhang, Xiangguo Zhang, Ruihao Gong, Shanghang Zhang, Qi Zhang, Fengwei Yu, Xianglong Liu

With the trend toward large NLP models, increasing memory and computation costs hinder their efficient deployment on resource-limited devices.

Quantization

QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization

2 code implementations · 11 Mar 2022 · Xiuying Wei, Ruihao Gong, Yuhang Li, Xianglong Liu, Fengwei Yu

With QDrop, the limit of PTQ is pushed to 2-bit activations for the first time, and the accuracy boost can be up to 51.49%.

Image Classification · Object Detection · +5
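The mechanism named in the QDrop snippet above — randomly letting some activations bypass quantization during post-training calibration, so the optimization sees a mix of quantized and full-precision values — can be sketched as follows. This is a simplified illustration with uniform fake quantization; the function names and the per-element drop granularity are assumptions, not the released code:

```python
import numpy as np

def fake_quant(x, n_bits=2):
    """Uniform asymmetric fake quantization: quantize, then dequantize."""
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / (2 ** n_bits - 1) or 1.0  # guard against a flat tensor
    q = np.round((x - lo) / scale)
    return q * scale + lo

def qdrop(x, p=0.5, n_bits=2, rng=None):
    """During calibration, each element bypasses quantization with
    probability p, yielding a mix of quantized and full-precision values.
    (Hypothetical sketch of the dropping idea, not QDrop's implementation.)"""
    rng = rng or np.random.default_rng(0)
    keep_fp = rng.random(x.shape) < p
    return np.where(keep_fp, x, fake_quant(x, n_bits))

x = np.linspace(-1.0, 1.0, 16).reshape(4, 4)
y = qdrop(x, p=0.5, n_bits=2)   # every element is either x or its 2-bit quantized value
```

At inference time all activations are quantized as usual; the random dropping happens only while calibrating, which is what flattens the loss landscape the paper exploits.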
