no code implementations • 29 Nov 2023 • Kumar Kshitij Patel, Lingxiao Wang, Aadirupa Saha, Nati Srebro
Furthermore, we delve into the more challenging setting of federated online optimization with bandit (zeroth-order) feedback, where the machines can only access values of the cost functions at the queried points.
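As a rough illustration (a minimal sketch with hypothetical names, not the paper's method), a machine that only observes cost values at queried points can build a zeroth-order gradient surrogate from two function evaluations along a random direction:

```python
import numpy as np

def two_point_gradient_estimate(f, x, delta=1e-2, rng=None):
    """Zeroth-order gradient surrogate from two function evaluations.

    With bandit feedback a machine cannot query gradients, only cost values
    at chosen points, so it perturbs x along a random unit direction and
    uses the finite difference as a (noisy) directional gradient estimate.
    """
    rng = rng or np.random.default_rng()
    u = rng.standard_normal(x.shape)
    u /= np.linalg.norm(u)                        # random unit direction
    diff = (f(x + delta * u) - f(x - delta * u)) / (2 * delta)
    return x.size * diff * u                      # unbiased for the smoothed gradient

# Example: f(x) = 0.5 * ||x||^2, so the true gradient at x is x itself.
f = lambda x: 0.5 * np.sum(x ** 2)
print(two_point_gradient_estimate(f, np.ones(5)))
```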
no code implementations • 28 Nov 2023 • Minbiao Han, Kumar Kshitij Patel, Han Shao, Lingxiao Wang
Federated learning is a machine learning protocol that enables a large population of agents to collaborate over multiple rounds to produce a single consensus model.
no code implementations • NeurIPS 2021 • Brian Bullins, Kumar Kshitij Patel, Ohad Shamir, Nathan Srebro, Blake Woodworth
We propose and analyze a stochastic Newton algorithm for homogeneous distributed stochastic convex optimization. Each machine can calculate stochastic gradients of the same population objective, as well as stochastic Hessian-vector products (products of an independent, unbiased estimator of the Hessian of the population objective with arbitrary vectors), with many such stochastic computations performed between rounds of communication.
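A minimal sketch of the kind of local computation this relies on, assuming (for illustration only, with hypothetical names) a least-squares objective on a mini-batch: Hessian-vector products feed a conjugate-gradient solve for an approximate Newton direction without ever forming the Hessian.

```python
import numpy as np

def hessian_vector_product(A, v):
    """Hessian-vector product for f(w) = 0.5 * ||A w - b||^2 / n.

    The Hessian is (A^T A) / n, so its product with an arbitrary vector v
    can be formed from two matrix-vector products, using a mini-batch A as
    a stochastic estimate of the population Hessian.
    """
    return A.T @ (A @ v) / A.shape[0]

def newton_direction_cg(A, grad, iters=20):
    """Approximately solve H d = grad by conjugate gradient, using only
    Hessian-vector products (local computation between communication rounds)."""
    d = np.zeros_like(grad)
    r = grad - hessian_vector_product(A, d)
    p = r.copy()
    for _ in range(iters):
        Hp = hessian_vector_product(A, p)
        alpha = (r @ r) / (p @ Hp)
        d += alpha * p
        r_new = r - alpha * Hp
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p
        r = r_new
    return d

# Example: one approximate Newton step on a random least-squares problem.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((100, 10)), rng.standard_normal(100)
w = np.zeros(10)
grad = A.T @ (A @ w - b) / 100
w_new = w - newton_direction_cg(A, grad)
```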
no code implementations • NeurIPS 2020 • Blake Woodworth, Kumar Kshitij Patel, Nathan Srebro
We analyze local SGD and mini-batch SGD in the heterogeneous distributed setting, where each machine has access to stochastic gradient estimates of its own machine-specific objective; the goal is to optimize the average objective; and machines can only communicate intermittently.
no code implementations • ICML 2020 • Blake Woodworth, Kumar Kshitij Patel, Sebastian U. Stich, Zhen Dai, Brian Bullins, H. Brendan McMahan, Ohad Shamir, Nathan Srebro
We study local SGD (also known as parallel SGD and federated averaging), a natural and frequently used stochastic distributed optimization method.
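A minimal sketch of the idea, under simplifying assumptions (hypothetical names; each machine sees noisy gradients of a shared quadratic objective): every machine takes several local SGD steps, and the iterates are averaged only at communication rounds.

```python
import numpy as np

def local_sgd(workers, w0, rounds=10, local_steps=20, lr=0.1, rng=None):
    """Local SGD (parallel SGD / federated averaging), sketched.

    Each worker runs `local_steps` of SGD on its own stochastic gradients,
    then all workers average their iterates; only the averaging step
    requires communication.
    """
    rng = rng or np.random.default_rng()
    w = np.array(w0, dtype=float)
    for _ in range(rounds):
        local_iterates = []
        for stochastic_grad in workers:            # one gradient oracle per machine
            w_local = w.copy()
            for _ in range(local_steps):
                w_local -= lr * stochastic_grad(w_local, rng)
            local_iterates.append(w_local)
        w = np.mean(local_iterates, axis=0)        # intermittent communication
    return w

# Example: noisy gradients of f(w) = 0.5 * ||w||^2, whose optimum is zero.
workers = [lambda w, rng: w + 0.1 * rng.standard_normal(w.shape)] * 4
print(local_sgd(workers, w0=np.ones(3)))           # ends up close to zero
```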
1 code implementation • NeurIPS 2019 • Aymeric Dieuleveut, Kumar Kshitij Patel
Synchronous mini-batch SGD is state-of-the-art for large-scale distributed machine learning.
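For contrast with local SGD above, a minimal sketch (hypothetical names) of the synchronous variant: every machine computes its stochastic gradient at the same current point, and the averaged gradient is applied once per step, so each iteration is a communication round.

```python
import numpy as np

def synchronous_minibatch_sgd(workers, w0, steps=200, lr=0.1, rng=None):
    """Synchronous mini-batch SGD, sketched.

    Unlike local SGD, all machines evaluate their stochastic gradients at
    the same point and the average is applied once per step, so
    communication happens at every iteration.
    """
    rng = rng or np.random.default_rng()
    w = np.array(w0, dtype=float)
    for _ in range(steps):
        grads = [g(w, rng) for g in workers]       # one gradient per machine
        w -= lr * np.mean(grads, axis=0)           # communicate and average
    return w
```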
no code implementations • 25 Apr 2019 • Kumar Kshitij Patel, Aymeric Dieuleveut
Synchronous mini-batch SGD is state-of-the-art for large-scale distributed machine learning.
2 code implementations • ICLR 2020 • Tao Lin, Sebastian U. Stich, Kumar Kshitij Patel, Martin Jaggi
Mini-batch stochastic gradient methods (mini-batch SGD) are state-of-the-art for distributed training of deep neural networks.