Search Results for author: Anirudh Canumalla

Found 1 paper, 0 papers with code

Symphony: Optimized DNN Model Serving using Deferred Batch Scheduling

no code implementations • 14 Aug 2023 • Lequn Chen, Weixin Deng, Anirudh Canumalla, Yu Xin, Danyang Zhuo, Matthai Philipose, Arvind Krishnamurthy

However, existing model serving systems cannot achieve adequate batch sizes while meeting latency objectives, because they eagerly dispatch requests to accelerators to minimize accelerator idle time.

Scheduling
