Search Results for author: Xinchen Jin

Found 1 paper, 1 paper with code

Computron: Serving Distributed Deep Learning Models with Model Parallel Swapping

1 code implementation · 24 Jun 2023 · Daniel Zou, Xinchen Jin, Xueyang Yu, Hao Zhang, James Demmel

In anticipation of workloads that involve serving many such large models to handle different tasks, we develop Computron, a system that uses memory swapping to serve multiple distributed models on a shared GPU cluster.
