Search Results for author: Xipeng Zhang

Found 2 papers, 0 papers with code

E-Sparse: Boosting the Large Language Model Inference through Entropy-based N:M Sparsity

no code implementations • 24 Oct 2023 • Yun Li, Lin Niu, Xipeng Zhang, Kai Liu, Jianchen Zhu, Zhanhui Kang

Traditional pruning methods are known to be difficult to apply to Large Language Models (LLMs) for Generative AI because of their unaffordable training process and large computational demands.

Language Modelling, Large Language Model

MKQ-BERT: Quantized BERT with 4-bits Weights and Activations

no code implementations • 25 Mar 2022 • Hanlin Tang, Xipeng Zhang, Kai Liu, Jianchen Zhu, Zhanhui Kang

In this work, we propose MKQ-BERT, which further improves the compression level and uses 4 bits for quantization.

Quantization
