Search Results for author: Shengyu Liu

Found 2 papers, 1 papers with code

LoongServe: Efficiently Serving Long-context Large Language Models with Elastic Sequence Parallelism

no code implementations • 15 Apr 2024 • Bingyang Wu, Shengyu Liu, Yinmin Zhong, Peng Sun, Xuanzhe Liu, Xin Jin

The context window of large language models (LLMs) is rapidly increasing, leading to a huge variance in resource usage between different requests as well as between different phases of the same request.

Paper
Add Code

Learning Accurate Performance Predictors for Ultrafast Automated Model Compression

1 code implementation • 13 Apr 2023 • Ziwei Wang, Jiwen Lu, Han Xiao, Shengyu Liu, Jie zhou

On the contrary, we obtain the optimal efficient networks by directly optimizing the compression policy with an accurate performance predictor, where the ultrafast automated model compression for various computational cost constraint is achieved without complex compression policy search and evaluation.

Image Classification Model Compression +3

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.