Search Results for author: Lin Niu

Found 4 papers, 2 papers with code

E-Sparse: Boosting the Large Language Model Inference through Entropy-based N:M Sparsity

no code implementations • 24 Oct 2023 • Yun Li, Lin Niu, Xipeng Zhang, Kai Liu, Jianchen Zhu, Zhanhui Kang

Traditional pruning methods are known to be challenging to apply to Large Language Models (LLMs) for Generative AI because of their unaffordable training process and large computational demands.

Language Modelling • Large Language Model
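
The excerpt above only states the motivation; for orientation, below is a minimal sketch of what N:M structured sparsity (here 2:4) looks like on a weight matrix. The saliency score is plain weight magnitude, a stand-in assumption; E-Sparse's actual entropy-based metric is not reproduced here.

```python
# Minimal sketch of N:M structured sparsity (here 2:4), illustrative only.
# E-Sparse scores weights with an entropy-based metric; this sketch uses
# plain weight magnitude as a stand-in saliency score (an assumption).
import torch

def nm_prune(weight: torch.Tensor, n: int = 2, m: int = 4) -> torch.Tensor:
    """Keep the n largest-saliency weights in every group of m along dim 1."""
    out_features, in_features = weight.shape
    assert in_features % m == 0, "input dim must be divisible by m"
    w = weight.reshape(out_features, in_features // m, m)
    saliency = w.abs()  # stand-in for the paper's entropy-based score
    # indices of the (m - n) smallest entries in each group -> zero them out
    _, drop_idx = saliency.topk(m - n, dim=-1, largest=False)
    mask = torch.ones_like(w)
    mask.scatter_(-1, drop_idx, 0.0)
    return (w * mask).reshape(out_features, in_features)

# Usage: prune a random linear-layer weight to 2:4 sparsity.
w = torch.randn(8, 16)
w_sparse = nm_prune(w)  # exactly 2 non-zeros in every group of 4
```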

RPTQ: Reorder-based Post-training Quantization for Large Language Models

1 code implementation • 3 Apr 2023 • Zhihang Yuan, Lin Niu, Jiawei Liu, Wenyu Liu, Xinggang Wang, Yuzhang Shang, Guangyu Sun, Qiang Wu, Jiaxiang Wu, Bingzhe Wu

In this paper, we identify that the challenge in quantizing activations in LLMs arises from varying ranges across channels, rather than solely the presence of outliers.

Quantization
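
As a rough illustration of the reorder-based idea (group activation channels with similar value ranges so each group gets its own quantization parameters, rather than one set for the whole tensor), here is a hedged sketch; the sort-and-split grouping rule, group count, and 8-bit setting are assumptions, not RPTQ's exact procedure.

```python
# Hedged sketch of reorder-based activation quantization: channels with
# similar ranges share quantization parameters. Grouping rule, group count,
# and bit-width are illustrative assumptions.
import torch

def reorder_group_quantize(x: torch.Tensor, num_groups: int = 4, n_bits: int = 8):
    """x: (tokens, channels). Returns a dequantized tensor using per-group params."""
    ch_min = x.min(dim=0).values
    ch_max = x.max(dim=0).values
    ch_range = ch_max - ch_min
    # "Reorder": sort channels by range and split into contiguous groups,
    # so channels inside a group have similar ranges.
    order = torch.argsort(ch_range)
    qmax = 2 ** n_bits - 1
    x_hat = torch.empty_like(x)
    for group in order.chunk(num_groups):
        g = x[:, group]
        lo, hi = g.min(), g.max()
        scale = (hi - lo).clamp(min=1e-8) / qmax
        zero_point = torch.round(-lo / scale)
        q = torch.clamp(torch.round(g / scale) + zero_point, 0, qmax)
        x_hat[:, group] = (q - zero_point) * scale
    return x_hat

# Usage: activations whose channels span very different ranges.
x = torch.randn(32, 64) * torch.logspace(-2, 2, 64)
x_hat = reorder_group_quantize(x)
```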

PD-Quant: Post-Training Quantization based on Prediction Difference Metric

1 code implementation • CVPR 2023 • Jiawei Liu, Lin Niu, Zhihang Yuan, Dawei Yang, Xinggang Wang, Wenyu Liu

It determines the quantization parameters using the difference between network predictions before and after quantization.

Neural Network Compression • Quantization
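
A minimal sketch of the prediction-difference idea: pick a weight-quantization scale that minimizes the gap between the full-precision and quantized predictions, rather than a local weight-reconstruction error. The single-layer setup, candidate grid, and KL metric below are illustrative assumptions, not PD-Quant's implementation.

```python
# Hedged sketch of a prediction-difference metric for post-training
# quantization: search clipping scales and keep the one whose quantized
# prediction is closest (in KL) to the full-precision prediction.
import torch
import torch.nn.functional as F

def quantize_weight(w: torch.Tensor, scale: torch.Tensor, n_bits: int = 4):
    qmax = 2 ** (n_bits - 1) - 1
    return torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale

def search_scale_by_prediction_diff(w, x, n_bits: int = 4, steps: int = 20):
    """w: (classes, features) classifier weight; x: calibration inputs."""
    p_fp = F.softmax(x @ w.t(), dim=-1)  # full-precision prediction
    best_scale, best_loss = None, float("inf")
    max_scale = w.abs().max() / (2 ** (n_bits - 1) - 1)
    for ratio in torch.linspace(0.5, 1.0, steps):  # candidate clipping ratios
        scale = max_scale * ratio
        logits_q = x @ quantize_weight(w, scale, n_bits).t()
        # prediction difference: KL(quantized prediction || FP prediction target)
        loss = F.kl_div(F.log_softmax(logits_q, dim=-1), p_fp, reduction="batchmean")
        if loss < best_loss:
            best_scale, best_loss = scale, loss
    return best_scale

# Usage with random calibration data.
w = torch.randn(10, 128)
x = torch.randn(256, 128)
scale = search_scale_by_prediction_diff(w, x)
```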
