Search Results for author: Manya Ghobadi

Found 5 papers, 1 paper with code

MLTCP: Congestion Control for DNN Training

no code implementations • 14 Feb 2024 • Sudarsanan Rajasekaran, Sanjoli Narang, Anton A. Zabreyko, Manya Ghobadi

We present MLTCP, a technique to augment today's congestion control algorithms to accelerate DNN training jobs in shared GPU clusters.

CASSINI: Network-Aware Job Scheduling in Machine Learning Clusters

no code implementations • 1 Aug 2023 • Sudarsanan Rajasekaran, Manya Ghobadi, Aditya Akella

We present CASSINI, a network-aware job scheduler for machine learning (ML) clusters.

Scheduling

How to Build Low-cost Networks for Large Language Models (without Sacrificing Performance)?

no code implementations • 22 Jul 2023 • Weiyang Wang, Manya Ghobadi, Kayvon Shakeri, Ying Zhang, Naader Hasani

We show that LLMs exhibit a unique communication pattern where only small groups of GPUs require high-bandwidth communication to achieve near-optimal training performance.

Blocking, Language Modelling, +1 more

PEOPL: Characterizing Privately Encoded Open Datasets with Public Labels

no code implementations • 31 Mar 2023 • Homa Esfahanizadeh, Adam Yala, Rafael G. L. D'Oliveira, Andrea J. D. Jaba, Victor Quach, Ken R. Duffy, Tommi S. Jaakkola, Vinod Vaikuntanathan, Manya Ghobadi, Regina Barzilay, Muriel Médard

Allowing organizations to share their data for training of machine learning (ML) models without unintended information leakage is an open problem in practice.
