no code implementations • 5 Mar 2024 • Hanlin Tang, Yifu Sun, Decheng Wu, Kai Liu, Jianchen Zhu, Zhanhui Kang
To the best of our knowledge, this is the first work to achieve almost lossless quantization performance for LLMs in a data-independent setting, and our algorithm runs over 10 times faster than data-dependent methods.
no code implementations • 24 Oct 2023 • Yun Li, Lin Niu, Xipeng Zhang, Kai Liu, Jianchen Zhu, Zhanhui Kang
Traditional pruning methods are known to be difficult to apply to Large Language Models (LLMs) for Generative AI because of their unaffordable training process and large computational demands.
no code implementations • 25 Mar 2022 • Hanlin Tang, Xipeng Zhang, Kai Liu, Jianchen Zhu, Zhanhui Kang
In this work, we propose MKQ-BERT, which further improves the compression level and uses 4 bits for quantization.
no code implementations • 31 Dec 2020 • Haisong Zhang, Lemao Liu, Haiyun Jiang, Yangming Li, Enbo Zhao, Kun Xu, Linfeng Song, Suncong Zheng, Botong Zhou, Jianchen Zhu, Xiao Feng, Tao Chen, Tao Yang, Dong Yu, Feng Zhang, Zhanhui Kang, Shuming Shi
This technical report introduces TexSmart, a text understanding system that supports fine-grained named entity recognition (NER) and enhanced semantic analysis functionalities.
no code implementations • 5 Nov 2019 • Jianchen Zhu, Tong Zhang, Shengjie Zhao, Carlos Hinojosa, Zengli Liu, Gonzalo R. Arce
This paper aims to develop a clustering approach for spectral images that works directly from CASSI compressive measurements.