Search Results for author: Sukrit Kalra

Found 2 papers, 1 paper with code

SuperServe: Fine-Grained Inference Serving for Unpredictable Workloads

no code implementations · 27 Dec 2023 · Alind Khare, Dhruv Garg, Sukrit Kalra, Snigdha Grandhi, Ion Stoica, Alexey Tumanov

Serving models under such conditions requires these systems to strike a careful balance between the latency and accuracy requirements of the application and the overall efficiency of utilization of scarce resources.

Scheduling
