Distributed Optimization
77 papers with code • 0 benchmarks • 0 datasets
The goal of Distributed Optimization is to optimize an objective defined over millions or billions of data points that are distributed across many machines, by exploiting the combined computational power of those machines.
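The simplest instance of this setting is data-parallel gradient descent: each machine holds a shard of the data, computes a gradient on its shard, and the gradients are averaged before every update. The toy sketch below simulates the workers in a single process (all names and sizes are illustrative); in a real deployment the averaging step would be an all-reduce over the network.

```python
import numpy as np

# Minimal sketch of data-parallel distributed gradient descent for least
# squares. The "workers" are simulated in one process; in a real system each
# shard would live on a different machine.

rng = np.random.default_rng(0)
n_workers, n_per_worker, dim = 4, 250, 10
X = rng.normal(size=(n_workers, n_per_worker, dim))  # one data shard per worker
w_true = rng.normal(size=dim)
y = X @ w_true + 0.01 * rng.normal(size=(n_workers, n_per_worker))

w = np.zeros(dim)
lr = 0.1
for step in range(200):
    # Each worker computes the gradient of its local loss on its own shard.
    local_grads = [Xk.T @ (Xk @ w - yk) / n_per_worker for Xk, yk in zip(X, y)]
    # Communication step: average the local gradients (an all-reduce).
    g = np.mean(local_grads, axis=0)
    w -= lr * g

print("error:", np.linalg.norm(w - w_true))
```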
Source: Analysis of Distributed Stochastic Dual Coordinate Ascent
Most implemented papers
Optimal algorithms for smooth and strongly convex distributed optimization in networks
For centralized (i.e., master/slave) algorithms, we show that distributing Nesterov's accelerated gradient descent is optimal and achieves a precision $\varepsilon > 0$ in time $O(\sqrt{\kappa_g}(1+\Delta\tau)\ln(1/\varepsilon))$, where $\kappa_g$ is the condition number of the (global) function to optimize, $\Delta$ is the diameter of the network, and $\tau$ (resp. $1$) is the time needed to communicate values between two neighbors (resp. to perform local computations).
An Accelerated Communication-Efficient Primal-Dual Optimization Framework for Structured Machine Learning
In this paper, an accelerated variant of CoCoA+ is proposed and shown to possess a convergence rate of $\mathcal{O}(1/t^2)$ in terms of reducing suboptimality.
A Distributed Quasi-Newton Algorithm for Empirical Risk Minimization with Nonsmooth Regularization
Initial computational results on convex problems demonstrate that our method significantly improves on communication cost and running time over the current state-of-the-art methods.
Sparsified SGD with Memory
Huge-scale machine learning problems are nowadays tackled by distributed optimization algorithms, i.e., algorithms that leverage the compute power of many devices for training.
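The mechanism behind this paper's compressor can be sketched in a few lines: each worker transmits only the k largest-magnitude gradient coordinates and keeps the dropped remainder in a local memory that is re-injected at the next step, so information is delayed rather than lost. A minimal illustrative sketch (not the authors' code):

```python
import numpy as np

class SparsifiedSGDWorker:
    """Top-k gradient sparsification with error-feedback memory: a minimal
    sketch of the mechanism in 'Sparsified SGD with Memory'. All names and
    defaults here are illustrative."""

    def __init__(self, dim, k, lr):
        self.memory = np.zeros(dim)  # mass dropped in previous steps
        self.k, self.lr = k, lr

    def compress(self, grad):
        corrected = self.memory + self.lr * grad      # re-inject dropped mass
        out = np.zeros_like(corrected)
        idx = np.argpartition(np.abs(corrected), -self.k)[-self.k:]
        out[idx] = corrected[idx]                     # keep k largest entries
        self.memory = corrected - out                 # remember the remainder
        return out                                    # only this is communicated
```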
OverSketched Newton: Fast Convex Optimization for Serverless Systems
Motivated by recent developments in serverless systems for large-scale computation as well as improvements in scalable randomized matrix algorithms, we develop OverSketched Newton, a randomized Hessian-based optimization algorithm to solve large-scale convex optimization problems in serverless systems.
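As a rough illustration of the sketched-Hessian idea, the toy loop below replaces the exact Hessian $X^\top X$ of a ridge-regression problem with $(SX)^\top(SX)$ for a random sketching matrix $S$. It is serial and uses a Gaussian sketch for simplicity, whereas OverSketched Newton uses coded sketches computed redundantly across serverless workers; all sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, m = 5000, 20, 200             # m = sketch size, m << n
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true
lam = 1e-3

w = np.zeros(d)
for _ in range(10):
    g = X.T @ (X @ w - y) / n + lam * w        # exact gradient (cheap)
    S = rng.normal(size=(m, n)) / np.sqrt(m)   # random Gaussian sketch
    SX = S @ X                                 # small sketched data matrix
    H = SX.T @ SX / n + lam * np.eye(d)        # sketched Hessian approximation
    w -= np.linalg.solve(H, g)                 # approximate Newton step
```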
Leader Stochastic Gradient Descent for Distributed Training of Deep Learning Models: Extension
Finally, we implement an asynchronous version of our algorithm and extend it to the multi-leader setting: workers are organized into groups, each represented by its own local leader (the best performer in the group), and every worker is updated with a corrective direction composed of two attractive forces, one toward the local leader and one toward the global leader (the best performer among all workers).
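The per-worker update described above can be sketched directly: a gradient step plus two pulls, one toward the group's local leader and one toward the global leader. The coupling coefficients lam_local and lam_global below are hypothetical hyperparameter names.

```python
import numpy as np

def leader_sgd_step(w, grad, w_local_leader, w_global_leader,
                    lr=0.01, lam_local=0.1, lam_global=0.01):
    """One illustrative multi-leader SGD update for a single worker."""
    pull_local = lam_local * (w_local_leader - w)     # force toward group best
    pull_global = lam_global * (w_global_leader - w)  # force toward overall best
    return w - lr * grad + pull_local + pull_global
```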
PowerSGD: Practical Low-Rank Gradient Compression for Distributed Optimization
We study gradient compression methods to alleviate the communication bottleneck in data-parallel distributed optimization.
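PowerSGD compresses a gradient matrix to rank r with a single step of power iteration, reusing the right factor from the previous step as a warm start. A minimal sketch, assuming the all-reduce of the factors P and Q and the error-feedback buffer are handled elsewhere:

```python
import numpy as np

def power_compress(M, Q):
    """One PowerSGD-style rank-r compression step for gradient matrix M.
    Q is the right factor reused from the previous step (warm start). In a
    distributed run, P and Q would each be averaged with an all-reduce."""
    P = M @ Q                       # (n, r): left factor
    P, _ = np.linalg.qr(P)          # orthonormalize its columns
    Q = M.T @ P                     # (m, r): new right factor
    return P, Q                     # decompressed gradient is P @ Q.T

# Usage: compress a 256x128 gradient to rank 4.
rng = np.random.default_rng(0)
M = rng.normal(size=(256, 128))
Q = rng.normal(size=(128, 4))       # random init on the first step
P, Q = power_compress(M, Q)
M_hat = P @ Q.T                     # low-rank approximation of M
```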
Trading Redundancy for Communication: Speeding up Distributed SGD for Non-convex Optimization
Communication overhead is one of the key challenges that hinder the scalability of distributed optimization algorithms to train large neural networks.
Federated Learning: Challenges, Methods, and Future Directions
Federated learning involves training statistical models over remote devices or siloed data centers, such as mobile phones or hospitals, while keeping data localized.
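The canonical procedure in this setting is federated averaging (FedAvg): each client runs a few steps of local SGD on its private shard, and the server averages the resulting models, weighted by local sample counts. A minimal single-process sketch, with least-squares clients for concreteness:

```python
import numpy as np

def fedavg_round(global_w, client_data, local_steps=5, lr=0.1):
    """One round of federated averaging; raw data never leaves a client."""
    updates, weights = [], []
    for X, y in client_data:                       # each client's private shard
        w = global_w.copy()
        for _ in range(local_steps):               # local SGD on the client
            w -= lr * X.T @ (X @ w - y) / len(y)
        updates.append(w)
        weights.append(len(y))                     # weight by local sample count
    return np.average(updates, axis=0, weights=weights)

# Usage with synthetic clients (sizes illustrative).
rng = np.random.default_rng(0)
dim = 10
clients = [(rng.normal(size=(50, dim)), rng.normal(size=50)) for _ in range(8)]
w = np.zeros(dim)
for _ in range(20):
    w = fedavg_round(w, clients)
```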
Communication-Efficient Distributed Optimization in Networks with Gradient Tracking and Variance Reduction
There is growing interest in large-scale machine learning and optimization over decentralized networks, e.g., in the context of multi-agent learning and federated learning.
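A standard template that this line of work builds on is decentralized gradient tracking: each node mixes its iterate and a gradient tracker with its neighbors via a doubly stochastic matrix W, and the tracker converges to the average gradient across the network. The sketch below omits the paper's variance reduction; the complete-graph mixing matrix and problem sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, dim = 4, 10
W = np.full((n_nodes, n_nodes), 1.0 / n_nodes)    # toy complete-graph mixing
A = rng.normal(size=(n_nodes, 100, dim))          # local least-squares data
b = np.einsum('nij,j->ni', A, rng.normal(size=dim))

def local_grad(i, x):
    """Gradient of node i's local least-squares objective."""
    return A[i].T @ (A[i] @ x - b[i]) / 100

x = np.zeros((n_nodes, dim))                      # one iterate per node
g = np.array([local_grad(i, x[i]) for i in range(n_nodes)])
y = g.copy()                                      # tracker init: local gradient
alpha = 0.1
for _ in range(100):
    x_new = W @ x - alpha * y                     # consensus + descent step
    g_new = np.array([local_grad(i, x_new[i]) for i in range(n_nodes)])
    y = W @ y + g_new - g                         # track the average gradient
    x, g = x_new, g_new
```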