1 code implementation • 22 Jul 2024 • Jiale Xu, Rui Zhang, Cong Guo, Weiming Hu, Zihan Liu, Feiyang Wu, Yu Feng, Shixuan Sun, Changxu Shao, Yuhong Guo, Junping Zhao, Ke Zhang, Minyi Guo, Jingwen Leng
This study introduces the vTensor, an innovative tensor structure for LLM inference based on GPU virtual memory management (VMM).
1 code implementation • 28 Jun 2024 • Xianzhi Zeng, Zhuoyan Wu, Xinjing Hu, Xuanhua Shi, Shixuan Sun, Shuhao Zhang
Although numerous approximate k-nearest neighbor (AKNN) algorithms, and benchmarks to evaluate their effectiveness, have been developed recently, the dynamic nature of real-world data presents significant challenges that existing benchmarks fail to address.
no code implementations • 23 Mar 2021 • Johan Kok Zhi Kang, Gaurav, Sien Yi Tan, Feng Cheng, Shixuan Sun, Bingsheng He
The use of deep learning models for forecasting the resource consumption patterns of SQL queries has recently been a popular area of study.