Search Results for author: Jiangang Kong

Found 1 paper, 0 papers with code

CQIL: Inference Latency Optimization with Concurrent Computation of Quasi-Independent Layers

No code implementations • 10 Apr 2024 • Longwei Zou, Qingyang Wang, Han Zhao, Jiangang Kong, Yi Yang, Yangdong Deng

The fast-growing large-scale language models are delivering unprecedented performance on almost all natural language processing tasks.

Quantization
