Search Results for author: Mark Zhao

Found 4 papers, 0 papers with code

SlipStream: Adapting Pipelines for Distributed Training of Large DNNs Amid Failures

no code implementations | 22 May 2024 | Swapnil Gandhi, Mark Zhao, Athinagoras Skiadopoulos, Christos Kozyrakis

SlipStream is a system for efficient DNN training in the presence of failures, without using spare servers.

cedar: Composable and Optimized Machine Learning Input Data Pipelines

no code implementations | 17 Jan 2024 | Mark Zhao, Emanuel Adamiak, Christos Kozyrakis

The input data pipeline is an essential component of each machine learning (ML) training job.

RecD: Deduplication for End-to-End Deep Learning Recommendation Model Training Infrastructure

no code implementations | 9 Nov 2022 | Mark Zhao, Dhruv Choudhary, Devashish Tyagi, Ajay Somani, Max Kaplan, Sung-Han Lin, Sarunya Pumma, Jongsoo Park, Aarti Basant, Niket Agarwal, Carole-Jean Wu, Christos Kozyrakis

RecD addresses immense storage, preprocessing, and training overheads caused by feature duplication inherent in industry-scale DLRM training datasets.
