Search Results for author: Indranil Gupta

Found 12 papers, 6 papers with code

Baechi: Fast Device Placement of Machine Learning Graphs

no code implementations 20 Jan 2023 Beomyeol Jeon, Linda Cai, Chirag Shetty, Pallavi Srivastava, Jintao Jiang, Xiaolan Ke, Yitao Meng, Cong Xie, Indranil Gupta

While learning-based approaches produce model placements that train fast (i.e., achieve low step times), learning-based model-parallelism is time-consuming, taking many hours or days to create a placement plan of operators on devices.
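
The sketch below is only a minimal illustration of what algorithmic (non-learning-based) device placement can look like: operators of a topologically sorted graph are greedily assigned to the device with the most remaining memory. It is not Baechi's algorithm, and all names and memory budgets are hypothetical.

```python
# Illustrative sketch only: greedy, memory-constrained operator placement.
# NOT Baechi's algorithm; names and sizes are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Operator:
    name: str
    mem_bytes: int                               # memory this op needs
    deps: list = field(default_factory=list)     # upstream op names

def place_greedy(ops, device_mem):
    """Assign each op (assumed topologically ordered) to the device
    with the most remaining memory that can still fit it."""
    free = dict(device_mem)                      # device -> remaining bytes
    placement = {}
    for op in ops:
        dev = max(free, key=free.get)
        if free[dev] < op.mem_bytes:
            raise MemoryError(f"no device can fit {op.name}")
        placement[op.name] = dev
        free[dev] -= op.mem_bytes
    return placement

ops = [Operator("embed", 4 << 20),
       Operator("dense1", 2 << 20, ["embed"]),
       Operator("dense2", 2 << 20, ["dense1"])]
print(place_greedy(ops, {"gpu:0": 6 << 20, "gpu:1": 6 << 20}))
```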

ZenoPS: A Distributed Learning System Integrating Communication Efficiency and Security

1 code implementation Algorithms 2022 Cong Xie, Oluwasanmi Koyejo, Indranil Gupta

Distributed machine learning is primarily motivated by the promise of increased computation power for accelerating training and mitigating privacy concerns.

BIG-bench Machine Learning

CSER: Communication-efficient SGD with Error Reset

no code implementations NeurIPS 2020 Cong Xie, Shuai Zheng, Oluwasanmi Koyejo, Indranil Gupta, Mu Li, Haibin Lin

The scalability of Distributed Stochastic Gradient Descent (SGD) is today limited by communication bottlenecks.
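
The title points to compressed communication with an "error reset". As a rough illustration of the general error-feedback idea such methods build on, the sketch below compresses each gradient with a local residual buffer and periodically zeroes that buffer. It is not CSER's exact algorithm; the helper names, top-k compressor, and hyperparameters are assumptions.

```python
# Illustrative sketch: gradient compression with an error buffer that is
# periodically reset. Not CSER's exact algorithm; values are assumptions.
import numpy as np

def topk_compress(v, k):
    """Keep the k largest-magnitude entries of v, zero the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def worker_step(grad, error, k, step, reset_every):
    """One communication round on a worker (hypothetical helper)."""
    corrected = grad + error                 # add back previously dropped mass
    sent = topk_compress(corrected, k)       # what actually gets communicated
    error = corrected - sent                 # residual kept locally
    if step % reset_every == 0:
        error = np.zeros_like(error)         # the "error reset"
    return sent, error

rng = np.random.default_rng(0)
error = np.zeros(8)
for step in range(1, 5):
    sent, error = worker_step(rng.normal(size=8), error,
                              k=2, step=step, reset_every=3)
    print(step, np.count_nonzero(sent), np.linalg.norm(error).round(3))
```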

Zeno++: Robust Fully Asynchronous SGD

1 code implementation ICML 2020 Cong Xie, Sanmi Koyejo, Indranil Gupta

We propose Zeno++, a new robust asynchronous Stochastic Gradient Descent (SGD) procedure that tolerates Byzantine failures of the workers.
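
One way to tolerate Byzantine workers asynchronously is to score each incoming gradient by the loss decrease it would cause on a small trusted validation batch and discard updates that score poorly. The sketch below is a minimal version of that idea, loosely in the spirit of Zeno++; the toy loss, score form, and threshold are assumptions, not the paper's exact procedure.

```python
# Illustrative sketch: accept an asynchronously received gradient only if it
# appears to decrease a validation loss. Score, threshold, and loss are
# assumptions for the demo, not the paper's exact rule.
import numpy as np

def quad_loss(x):                          # stand-in validation loss
    return 0.5 * np.dot(x, x)

def score(x, g, lr=0.1, rho=0.01):
    """Estimated descent minus a penalty on the gradient's magnitude."""
    return quad_loss(x) - quad_loss(x - lr * g) - rho * np.dot(g, g)

def server_apply(x, g, lr=0.1, eps=0.0):
    """Apply the update only when the score clears a threshold."""
    if score(x, g, lr) >= eps:
        return x - lr * g, True
    return x, False                        # reject suspicious update

x = np.array([1.0, -2.0, 0.5])
honest = x.copy()                          # true gradient of quad_loss is x
byzantine = -10.0 * x                      # pushes the model uphill
for g in (honest, byzantine):
    x_new, ok = server_apply(x, g)
    print(ok, x_new.round(3))
```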

Fall of Empires: Breaking Byzantine-tolerant SGD by Inner Product Manipulation

4 code implementations 10 Mar 2019 Cong Xie, Sanmi Koyejo, Indranil Gupta

Recently, new defense techniques have been developed to tolerate Byzantine failures for distributed machine learning.

BIG-bench Machine Learning
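
The attack named in the title manipulates the inner product between the aggregated update and the true gradient. The sketch below demonstrates the basic idea against a plain mean aggregator for brevity (the paper targets Byzantine-tolerant aggregation rules); the scaling factor and worker counts are assumptions.

```python
# Illustrative sketch of an inner-product manipulation attack: Byzantine
# workers send negatively scaled copies of (an estimate of) the honest
# gradient so the aggregate points uphill. Scale and counts are assumptions.
import numpy as np

rng = np.random.default_rng(1)
true_grad = np.array([1.0, 2.0, -1.0])
honest = [true_grad + 0.1 * rng.normal(size=3) for _ in range(8)]
byzantine = [-6.0 * true_grad for _ in range(2)]   # negatively scaled copies

agg = np.mean(honest + byzantine, axis=0)          # naive mean aggregation
print("inner product with true gradient:", round(float(np.dot(agg, true_grad)), 3))
# A negative value means the aggregated "descent" direction actually
# increases the loss along the true gradient.
```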

Asynchronous Federated Optimization

1 code implementation 10 Mar 2019 Cong Xie, Sanmi Koyejo, Indranil Gupta

Federated learning enables training on a massive number of edge devices.

Federated Learning

Zeno: Distributed Stochastic Gradient Descent with Suspicion-based Fault-tolerance

1 code implementation 25 May 2018 Cong Xie, Oluwasanmi Koyejo, Indranil Gupta

We present Zeno, a technique to make distributed machine learning, particularly Stochastic Gradient Descent (SGD), tolerant to an arbitrary number of faulty workers.

BIG-bench Machine Learning
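
A minimal sketch of suspicion-based aggregation in the spirit described here: score each candidate gradient by estimated descent on a validation batch, drop the b most suspicious, and average the rest. The toy loss, score form, and choice of b are illustrative assumptions, not the paper's exact settings.

```python
# Illustrative sketch of suspicion-based aggregation: keep only the
# (n - b) candidate gradients with the highest estimated descent.
import numpy as np

def descent_score(x, g, loss, lr=0.1):
    return loss(x) - loss(x - lr * g)

def zeno_style_aggregate(x, grads, loss, b):
    scores = [descent_score(x, g, loss) for g in grads]
    keep = np.argsort(scores)[::-1][: len(grads) - b]   # drop b most suspicious
    return np.mean([grads[i] for i in keep], axis=0)

loss = lambda x: 0.5 * np.dot(x, x)
x = np.array([1.0, -1.0])
grads = [x, 1.1 * x, 0.9 * x, np.array([50.0, 50.0]), -5.0 * x]
print(zeno_style_aggregate(x, grads, loss, b=2).round(3))
```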

Phocas: dimensional Byzantine-resilient stochastic gradient descent

no code implementations 23 May 2018 Cong Xie, Oluwasanmi Koyejo, Indranil Gupta

We propose a novel robust aggregation rule for distributed synchronous Stochastic Gradient Descent (SGD) under a general Byzantine failure model.
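
"Dimensional" here refers to coordinate-wise robustness. As a representative example of that family (not claimed to be Phocas's exact rule), the sketch below computes a coordinate-wise trimmed mean that drops the b extreme values in every coordinate before averaging.

```python
# Illustrative sketch of a coordinate-wise ("dimensional") robust aggregator:
# a trimmed mean per coordinate. Representative of the family, not Phocas.
import numpy as np

def coordwise_trimmed_mean(grads, b):
    G = np.sort(np.stack(grads), axis=0)     # sort each coordinate independently
    return G[b:len(grads) - b].mean(axis=0)  # drop b extremes on each side

grads = [np.array([1.0, 2.0]), np.array([1.1, 1.9]),
         np.array([0.9, 2.1]), np.array([100.0, -100.0])]  # one Byzantine worker
print(coordwise_trimmed_mean(grads, b=1).round(3))
```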

Generalized Byzantine-tolerant SGD

no code implementations 27 Feb 2018 Cong Xie, Oluwasanmi Koyejo, Indranil Gupta

We propose three new robust aggregation rules for distributed synchronous Stochastic Gradient Descent (SGD) under a general Byzantine failure model.
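
The snippet does not name the three rules. As a generic example of a robust aggregator for this setting (not claimed to be one of the paper's three), the sketch below computes a geometric median of worker gradients with Weiszfeld iterations.

```python
# Illustrative sketch: geometric median of worker gradients via Weiszfeld
# iterations, a classic robust aggregator. Generic example only.
import numpy as np

def geometric_median(points, iters=100, eps=1e-8):
    pts = np.stack(points)
    z = pts.mean(axis=0)                      # start from the plain mean
    for _ in range(iters):
        d = np.linalg.norm(pts - z, axis=1)
        w = 1.0 / np.maximum(d, eps)          # Weiszfeld reweighting
        z_new = (w[:, None] * pts).sum(axis=0) / w.sum()
        if np.linalg.norm(z_new - z) < eps:
            break
        z = z_new
    return z

grads = [np.array([1.0, 1.0]), np.array([1.2, 0.8]),
         np.array([0.8, 1.2]), np.array([50.0, -50.0])]   # one outlier
print(geometric_median(grads).round(3))       # stays near the honest cluster
```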

Faster Distributed Synchronous SGD with Weak Synchronization

no code implementations ICLR 2018 Cong Xie, Oluwasanmi O. Koyejo, Indranil Gupta

Distributed training of deep learning is widely conducted with large neural networks and large datasets.
