Search Results for author: Qianshi Pang

Found 1 paper, 1 paper with code

Chimera: A Lossless Decoding Method for Accelerating Large Language Models Inference by Fusing all Tokens

1 code implementation • 24 Feb 2024 • Ziqian Zeng, Jiahong Yu, Qianshi Pang, ZiHao Wang, Huiping Zhuang, HongEn Shao, Xiaofeng Zou

Within this framework, we introduce a lightweight draft model that effectively utilizes previously generated tokens to predict subsequent words.
