Search Results for author: Sudarsanan Rajasekaran

Found 2 papers, 0 papers with code

MLTCP: Congestion Control for DNN Training

no code implementations • 14 Feb 2024 • Sudarsanan Rajasekaran, Sanjoli Narang, Anton A. Zabreyko, Manya Ghobadi

We present MLTCP, a technique to augment today's congestion control algorithms to accelerate DNN training jobs in shared GPU clusters.

CASSINI: Network-Aware Job Scheduling in Machine Learning Clusters

no code implementations • 1 Aug 2023 • Sudarsanan Rajasekaran, Manya Ghobadi, Aditya Akella

We present CASSINI, a network-aware job scheduler for machine learning (ML) clusters.

Scheduling
