Search Results for author: Zachary Charles

Found 19 papers, 8 papers with code

Iterated Vector Fields and Conservatism, with Applications to Federated Learning

no code implementations • 8 Sep 2021 • Zachary Charles, Keith Rush

We analyze the conservatism of various iterated vector fields, including gradient vector fields associated to loss functions of generalized linear models.

Federated Learning

Local Adaptivity in Federated Learning: Convergence and Consistency

no code implementations • 4 Jun 2021 • Jianyu Wang, Zheng Xu, Zachary Garrett, Zachary Charles, Luyang Liu, Gauri Joshi

Popular optimization algorithms of FL use vanilla (stochastic) gradient descent for both local updates at clients and global updates at the aggregating server.

Federated Learning

Convergence and Accuracy Trade-Offs in Federated Learning and Meta-Learning

no code implementations • 8 Mar 2021 • Zachary Charles, Jakub Konečný

Using these insights, we are able to compare local update methods based on their convergence/accuracy trade-off, not just their convergence to critical points of the empirical loss.

Federated Learning, Meta-Learning

On the Outsized Importance of Learning Rates in Local Update Methods

1 code implementation • 2 Jul 2020 • Zachary Charles, Jakub Konečný

We study a family of algorithms, which we refer to as local update methods, that generalize many federated learning and meta-learning algorithms.

Federated Learning, Meta-Learning

Adaptive Federated Optimization

2 code implementations • ICLR 2021 • Sashank Reddi, Zachary Charles, Manzil Zaheer, Zachary Garrett, Keith Rush, Jakub Konečný, Sanjiv Kumar, H. Brendan McMahan

Federated learning is a distributed machine learning paradigm in which a large number of clients coordinate with a central server to learn a model without sharing their own training data.

Federated Learning

DETOX: A Redundancy-based Framework for Faster and More Robust Gradient Aggregation

1 code implementation • NeurIPS 2019 • Shashank Rajput, Hongyi Wang, Zachary Charles, Dimitris Papailiopoulos

In this work, we present DETOX, a Byzantine-resilient distributed training framework that combines algorithmic redundancy with robust aggregation.

Convergence and Margin of Adversarial Training on Separable Data

no code implementations • 22 May 2019 • Zachary Charles, Shashank Rajput, Stephen Wright, Dimitris Papailiopoulos

Our results are derived by showing that adversarial training with gradient updates minimizes a robust version of the empirical risk at a $\mathcal{O}(\ln(t)^2/t)$ rate, despite non-smoothness.

Does Data Augmentation Lead to Positive Margin?

no code implementations • 8 May 2019 • Shashank Rajput, Zhili Feng, Zachary Charles, Po-Ling Loh, Dimitris Papailiopoulos

Data augmentation (DA) is commonly used during model training, as it significantly improves test error and model robustness.

Data Augmentation

ErasureHead: Distributed Gradient Descent without Delays Using Approximate Gradient Coding

1 code implementation • 28 Jan 2019 • Hongyi Wang, Zachary Charles, Dimitris Papailiopoulos

We present ErasureHead, a new approach for distributed gradient descent (GD) that mitigates system delays by employing approximate gradient coding.

A Geometric Perspective on the Transferability of Adversarial Directions

no code implementations • 8 Nov 2018 • Zachary Charles, Harrison Rosenberg, Dimitris Papailiopoulos

We show that these "transferable adversarial directions" are guaranteed to exist for linear separators of a given set, and will exist with high probability for linear classifiers trained on independent sets drawn from the same distribution.

Gradient Coding via the Stochastic Block Model

no code implementations • 25 May 2018 • Zachary Charles, Dimitris Papailiopoulos

Gradient descent and its many variants, including mini-batch stochastic gradient descent, form the algorithmic foundation of modern large-scale machine learning.

Stochastic Block Model

DRACO: Byzantine-resilient Distributed Training via Redundant Gradients

1 code implementation • ICML 2018 • Lingjiao Chen, Hongyi Wang, Zachary Charles, Dimitris Papailiopoulos

Distributed model training is vulnerable to Byzantine system failures and adversarial compute nodes, i.e., nodes that use malicious updates to corrupt the global model stored at a parameter server (PS).

Approximate Gradient Coding via Sparse Random Graphs

no code implementations • 17 Nov 2017 • Zachary Charles, Dimitris Papailiopoulos, Jordan Ellenberg

Distributed algorithms are often beset by the straggler effect, where the slowest compute nodes in the system dictate the overall running time.

Stability and Generalization of Learning Algorithms that Converge to Global Optima

no code implementations • ICML 2018 • Zachary Charles, Dimitris Papailiopoulos

Finally, we show that although our results imply comparable stability for SGD and GD in the PL setting, there exist simple neural networks with multiple local minima where SGD is stable but GD is not.

Generalization Bounds

Subspace Clustering with Missing and Corrupted Data

no code implementations • 8 Jul 2017 • Zachary Charles, Amin Jalali, Rebecca Willett

Given full or partial information about a collection of points that lie close to a union of several subspaces, subspace clustering refers to the process of clustering the points according to their subspace and identifying the subspaces.
