Search Results for author: Kevin Song

Found 1 paper, 0 papers with code

Seesaw: High-throughput LLM Inference via Model Re-sharding

no code implementations • 9 Mar 2025 • Qidong Su, Wei Zhao, Xin Li, Muralidhar Andoorveedu, Chenhao Jiang, Zhanda Zhu, Kevin Song, Christina Giannoula, Gennady Pekhimenko

To improve the efficiency of distributed large language model (LLM) inference, various parallelization strategies, such as tensor and pipeline parallelism, have been proposed.
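As background for the parallelization strategies the abstract mentions, the sketch below illustrates plain column-wise tensor parallelism with a toy NumPy example. It is not the paper's Seesaw re-sharding scheme; the function name, shapes, and device count are illustrative assumptions.

```python
import numpy as np

# Minimal sketch of column-wise tensor parallelism (illustrative only,
# not the Seesaw method): a linear layer's weight matrix is split across
# "devices", each device computes its slice of the output, and the
# slices are concatenated (an all-gather in a real distributed setup).

def tensor_parallel_linear(x, weight, num_devices):
    """Compute x @ weight by splitting weight column-wise across devices."""
    shards = np.split(weight, num_devices, axis=1)      # one column shard per device
    partial_outputs = [x @ shard for shard in shards]   # computed in parallel in practice
    return np.concatenate(partial_outputs, axis=-1)     # gather the output slices

if __name__ == "__main__":
    x = np.random.randn(4, 512)           # batch of token activations (toy shapes)
    weight = np.random.randn(512, 2048)   # linear-layer weight
    out = tensor_parallel_linear(x, weight, num_devices=4)
    assert np.allclose(out, x @ weight)   # matches the unsharded computation
```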

Tasks: Computational Efficiency, Language Modeling (+3)
