Search Results for author: Sai Praneeth Karimireddy

Found 27 papers, 16 papers with code

Optimization with access to auxiliary information

1 code implementation 1 Jun 2022 El Mahdi Chayti, Sai Praneeth Karimireddy

We investigate the fundamental optimization question of minimizing a target function $f(x)$ whose gradients are expensive to compute or have limited availability, given access to some auxiliary side function $h(x)$ whose gradients are cheap or more available.

Federated Learning, Transfer Learning

Agree to Disagree: Diversity through Disagreement for Better Transferability

1 code implementation 9 Feb 2022 Matteo Pagliardini, Martin Jaggi, François Fleuret, Sai Praneeth Karimireddy

This behavior can hinder the transferability of trained models by (i) favoring the learning of simpler but spurious features -- present in the training data but absent from the test data -- and (ii) by only leveraging a small subset of predictive features.

OOD Detection

Byzantine-Robust Decentralized Learning via Self-Centered Clipping

1 code implementation 3 Feb 2022 Lie He, Sai Praneeth Karimireddy, Martin Jaggi

In this paper, we study the challenging task of Byzantine-robust decentralized training on arbitrary communication graphs.

Federated Learning

Breaking the centralized barrier for cross-device federated learning

no code implementations NeurIPS 2021 Sai Praneeth Karimireddy, Martin Jaggi, Satyen Kale, Mehryar Mohri, Sashank Reddi, Sebastian U. Stich, Ananda Theertha Suresh

Federated learning (FL) is a challenging setting for optimization due to the heterogeneity of the data across different clients, which gives rise to the client drift phenomenon.

Federated Learning

Linear Speedup in Personalized Collaborative Learning

1 code implementation 10 Nov 2021 El Mahdi Chayti, Sai Praneeth Karimireddy, Sebastian U. Stich, Nicolas Flammarion, Martin Jaggi

Collaborative training can improve the accuracy of a model for a user by trading off the model's bias (introduced by using data from other users who are potentially different) against its variance (due to the limited amount of data on any single user).

Federated Learning, Stochastic Optimization

Towards Model Agnostic Federated Learning Using Knowledge Distillation

no code implementations ICLR 2022 Andrei Afonin, Sai Praneeth Karimireddy

Is it possible to design a universal API for federated learning through which an ad-hoc group of data-holders (agents) can collaborate with each other and perform federated learning?

Federated Learning, Knowledge Distillation

RelaySum for Decentralized Deep Learning on Heterogeneous Data

1 code implementation NeurIPS 2021 Thijs Vogels, Lie He, Anastasia Koloskova, Tao Lin, Sai Praneeth Karimireddy, Sebastian U. Stich, Martin Jaggi

A key challenge, primarily in decentralized deep learning, remains the handling of differences between the workers' local data distributions.

Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data

1 code implementation 9 Feb 2021 Tao Lin, Sai Praneeth Karimireddy, Sebastian U. Stich, Martin Jaggi

In this paper, we investigate and identify the limitation of several decentralized optimization algorithms for different degrees of data heterogeneity.

Learning from History for Byzantine Robust Optimization

1 code implementation 18 Dec 2020 Sai Praneeth Karimireddy, Lie He, Martin Jaggi

Secondly, we prove that even if the aggregation rules may succeed in limiting the influence of the attackers in a single round, the attackers can couple their attacks across time eventually leading to divergence.

Federated Learning, Stochastic Optimization

Practical Low-Rank Communication Compression in Decentralized Deep Learning

1 code implementation NeurIPS 2020 Thijs Vogels, Sai Praneeth Karimireddy, Martin Jaggi

Lossy gradient compression has become a practical tool to overcome the communication bottleneck in centrally coordinated distributed training of machine learning models.

Byzantine-Robust Learning on Heterogeneous Datasets via Resampling

no code implementations 28 Sep 2020 Lie He, Sai Praneeth Karimireddy, Martin Jaggi

In Byzantine-robust distributed optimization, a central server wants to train a machine learning model over data distributed across multiple workers.

Distributed Optimization

Mime: Mimicking Centralized Stochastic Algorithms in Federated Learning

1 code implementation 8 Aug 2020 Sai Praneeth Karimireddy, Martin Jaggi, Satyen Kale, Mehryar Mohri, Sashank J. Reddi, Sebastian U. Stich, Ananda Theertha Suresh

Federated learning (FL) is a challenging setting for optimization due to the heterogeneity of the data across different clients, which gives rise to the client drift phenomenon.

Federated Learning

PowerGossip: Practical Low-Rank Communication Compression in Decentralized Deep Learning

2 code implementations 4 Aug 2020 Thijs Vogels, Sai Praneeth Karimireddy, Martin Jaggi

Lossy gradient compression has become a practical tool to overcome the communication bottleneck in centrally coordinated distributed training of machine learning models.

Byzantine-Robust Learning on Heterogeneous Datasets via Bucketing

1 code implementation ICLR 2022 Sai Praneeth Karimireddy, Lie He, Martin Jaggi

In Byzantine robust distributed or federated learning, a central server wants to train a machine learning model over data distributed across multiple workers.

Distributed Optimization, Federated Learning

Secure Byzantine-Robust Machine Learning

no code implementations 8 Jun 2020 Lie He, Sai Praneeth Karimireddy, Martin Jaggi

Increasingly, machine learning systems are being deployed to edge servers and devices (e.g. mobile phones) and trained in a collaborative manner.

Why are Adaptive Methods Good for Attention Models?

no code implementations NeurIPS 2020 Jingzhao Zhang, Sai Praneeth Karimireddy, Andreas Veit, Seungyeon Kim, Sashank J. Reddi, Sanjiv Kumar, Suvrit Sra

While stochastic gradient descent (SGD) is still the de facto algorithm in deep learning, adaptive methods like Clipped SGD/Adam have been observed to outperform SGD across important tasks, such as attention models.

SCAFFOLD: Stochastic Controlled Averaging for Federated Learning

5 code implementations ICML 2020 Sai Praneeth Karimireddy, Satyen Kale, Mehryar Mohri, Sashank J. Reddi, Sebastian U. Stich, Ananda Theertha Suresh

We obtain tight convergence rates for FedAvg and prove that it suffers from 'client-drift' when the data is heterogeneous (non-iid), resulting in unstable and slow convergence.

Distributed Optimization, Federated Learning

Why ADAM Beats SGD for Attention Models

no code implementations 25 Sep 2019 Jingzhao Zhang, Sai Praneeth Karimireddy, Andreas Veit, Seungyeon Kim, Sashank J Reddi, Sanjiv Kumar, Suvrit Sra

While stochastic gradient descent (SGD) is still the de facto algorithm in deep learning, adaptive methods like Adam have been observed to outperform SGD across important tasks, such as attention models.

The Error-Feedback Framework: Better Rates for SGD with Delayed Gradients and Compressed Communication

no code implementations 11 Sep 2019 Sebastian U. Stich, Sai Praneeth Karimireddy

We analyze (stochastic) gradient descent (SGD) with delayed updates on smooth quasi-convex and non-convex functions and derive concise, non-asymptotic, convergence rates.

Amplifying Rényi Differential Privacy via Shuffling

no code implementations 11 Jul 2019 Eloïse Berthier, Sai Praneeth Karimireddy

Differential privacy is a useful tool to build machine learning models which do not release too much information about the training data.

PowerSGD: Practical Low-Rank Gradient Compression for Distributed Optimization

1 code implementation NeurIPS 2019 Thijs Vogels, Sai Praneeth Karimireddy, Martin Jaggi

We study gradient compression methods to alleviate the communication bottleneck in data-parallel distributed optimization.

Distributed Optimization

Accelerating Gradient Boosting Machine

1 code implementation 20 Mar 2019 Haihao Lu, Sai Praneeth Karimireddy, Natalia Ponomareva, Vahab Mirrokni

This is the first GBM type of algorithm with theoretically-justified accelerated convergence rate.

Efficient Greedy Coordinate Descent for Composite Problems

no code implementations 16 Oct 2018 Sai Praneeth Karimireddy, Anastasia Koloskova, Sebastian U. Stich, Martin Jaggi

For these problems we provide (i) the first linear rates of convergence independent of $n$, and show that our greedy update rule provides speedups similar to those obtained in the smooth case.

Global linear convergence of Newton's method without strong-convexity or Lipschitz gradients

no code implementations 1 Jun 2018 Sai Praneeth Karimireddy, Sebastian U. Stich, Martin Jaggi

We show that Newton's method converges globally at a linear rate for objective functions whose Hessians are stable.

On Matching Pursuit and Coordinate Descent

no code implementations ICML 2018 Francesco Locatello, Anant Raj, Sai Praneeth Karimireddy, Gunnar Rätsch, Bernhard Schölkopf, Sebastian U. Stich, Martin Jaggi

Exploiting the connection between the two algorithms, we present a unified analysis of both, providing affine invariant sublinear $\mathcal{O}(1/t)$ rates on smooth objectives and linear convergence on strongly convex objectives.
