Search Results for author: Haotong Xie

Found 1 paper, 1 paper with code

PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU

2 code implementations · 16 Dec 2023 · Yixin Song, Zeyu Mi, Haotong Xie, Haibo Chen

This paper introduces PowerInfer, a high-speed Large Language Model (LLM) inference engine on a personal computer (PC) equipped with a single consumer-grade GPU.

Language Modelling, Large Language Model
