Search Results for author: Changxu Shao

Found 1 paper, 1 paper with code

vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving

1 code implementation • 22 Jul 2024 • Jiale Xu, Rui Zhang, Cong Guo, Weiming Hu, Zihan Liu, Feiyang Wu, Yu Feng, Shixuan Sun, Changxu Shao, Yuhong Guo, Junping Zhao, Ke Zhang, Minyi Guo, Jingwen Leng

This study introduces vTensor, a tensor structure for LLM inference built on GPU virtual memory management (VMM).
