23 Apr 2024 • Chen Zhang, Zhuorui Liu, Dawei Song
The bottleneck mainly stems from the autoregressive nature of LLMs: tokens can only be generated sequentially during decoding, since each new token is conditioned on all previously generated ones.