Distributed Optimization

77 papers with code • 0 benchmarks • 0 datasets

The goal of Distributed Optimization is to optimize a certain objective defined over millions or billions of data points that are distributed over many machines, by utilizing the computational power of these machines.

Source: Analysis of Distributed Stochastic Dual Coordinate Ascent
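
As an illustration of this setting, the following toy sketch (a single-process NumPy simulation with made-up data shards and an illustrative step size) shows data-parallel gradient descent: each "machine" computes a gradient on its own shard of a least-squares problem, and a server averages the gradients and updates the model.

# Minimal single-process sketch of data-parallel distributed optimization.
# The objective is a sum over data shards held by different "machines";
# each machine computes a gradient on its shard, and a central server
# averages the gradients and takes a step. Names and shapes are illustrative.
import numpy as np

rng = np.random.default_rng(0)
num_machines, n_per_machine, dim = 4, 250, 10

# Each machine holds a private shard of a least-squares problem.
shards = [(rng.normal(size=(n_per_machine, dim)),
           rng.normal(size=n_per_machine)) for _ in range(num_machines)]

def local_gradient(w, shard):
    A, b = shard
    return A.T @ (A @ w - b) / len(b)   # gradient of 0.5*||Aw - b||^2 / n

w, lr = np.zeros(dim), 0.05
for step in range(200):
    grads = [local_gradient(w, s) for s in shards]  # computed in parallel in practice
    w -= lr * np.mean(grads, axis=0)                # server averages and updates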

Libraries

Use these libraries to find Distributed Optimization models and implementations

Most implemented papers

Optimal algorithms for smooth and strongly convex distributed optimization in networks

adelnabli/dadao ICML 2017

For centralized (i.e. master/slave) algorithms, we show that distributing Nesterov's accelerated gradient descent is optimal and achieves a precision $\varepsilon > 0$ in time $O(\sqrt{\kappa_g}(1+\Delta\tau)\ln(1/\varepsilon))$, where $\kappa_g$ is the condition number of the (global) function to optimize, $\Delta$ is the diameter of the network, and $\tau$ (resp. $1$) is the time needed to communicate values between two neighbors (resp. perform local computations).
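
The centralized setting described here can be illustrated with a small, hedged sketch: a master broadcasts the query point, workers return gradients on their local shards, and the master applies a Nesterov accelerated step to the averaged gradient. The toy least-squares shards and the step-size/momentum values below are illustrative, not the paper's tuning.

# Hedged sketch of centralized (master/slave) distributed Nesterov acceleration.
# In a real deployment each iteration is one broadcast plus one reduction.
import numpy as np

rng = np.random.default_rng(1)
num_workers, dim = 4, 20
shards = [(rng.normal(size=(100, dim)), rng.normal(size=100)) for _ in range(num_workers)]

def avg_gradient(y):
    # Workers compute local gradients; the master averages them.
    return np.mean([A.T @ (A @ y - b) / len(b) for A, b in shards], axis=0)

alpha, beta = 0.01, 0.9           # step size and momentum (illustrative)
x, x_prev = np.zeros(dim), np.zeros(dim)
for k in range(300):
    y = x + beta * (x - x_prev)   # Nesterov extrapolation at the master
    x_prev, x = x, y - alpha * avg_gradient(y)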

An Accelerated Communication-Efficient Primal-Dual Optimization Framework for Structured Machine Learning

schemmy/CoCoA-Experiments 14 Nov 2017

In this paper, an accelerated variant of CoCoA+ is proposed and shown to possess a convergence rate of $\mathcal{O}(1/t^2)$ in terms of reducing suboptimality.

A Distributed Quasi-Newton Algorithm for Empirical Risk Minimization with Nonsmooth Regularization

leepei/dplbfgs 4 Mar 2018

Initial computational results on convex problems demonstrate that our method significantly improves on communication cost and running time over the current state-of-the-art methods.

Sparsified SGD with Memory

epfml/sparsifiedSGD NeurIPS 2018

Huge-scale machine learning problems are nowadays tackled by distributed optimization algorithms, i.e. algorithms that leverage the compute power of many devices for training.
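
The method named in the title combines top-k gradient sparsification with an error-feedback memory: coordinates that are not communicated are kept in a local residual and added back at the next step. A minimal single-worker sketch, with a toy least-squares objective and illustrative values of k and the learning rate:

# Top-k sparsification with error feedback (hedged, illustrative sketch).
import numpy as np

def top_k(v, k):
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

rng = np.random.default_rng(2)
dim, k, lr = 100, 10, 0.1
A, b = rng.normal(size=(500, dim)), rng.normal(size=500)
w, memory = np.zeros(dim), np.zeros(dim)

for step in range(200):
    g = A.T @ (A @ w - b) / len(b)   # full local gradient
    corrected = lr * g + memory      # add back what was not sent before
    sparse = top_k(corrected, k)     # communicate only k coordinates
    memory = corrected - sparse      # remember the rest for later
    w -= sparse                      # apply the sparsified update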

OverSketched Newton: Fast Convex Optimization for Serverless Systems

vvipgupta/OverSketchedNewton 21 Mar 2019

Motivated by recent developments in serverless systems for large-scale computation as well as improvements in scalable randomized matrix algorithms, we develop OverSketched Newton, a randomized Hessian-based optimization algorithm to solve large-scale convex optimization problems in serverless systems.
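
A generic sketched-Newton step illustrates the idea of replacing the exact Hessian with a randomized approximation. This sketch uses a plain Gaussian sketch on a regularized least-squares problem and is not the paper's serverless OverSketch implementation; problem sizes, sketch size, and the regularizer are illustrative.

# Hedged illustration of a sketched Newton step for least squares:
# the exact Hessian A^T A is replaced by (S A)^T (S A) for a random sketch S,
# so the expensive matrix product can be distributed or offloaded.
import numpy as np

rng = np.random.default_rng(3)
n, d, sketch_size, lam = 2000, 50, 400, 1e-3
A, b = rng.normal(size=(n, d)), rng.normal(size=n)

x = np.zeros(d)
for it in range(10):
    grad = A.T @ (A @ x - b) / n + lam * x
    S = rng.normal(size=(sketch_size, n)) / np.sqrt(sketch_size)  # Gaussian sketch
    SA = S @ A
    H_hat = SA.T @ SA / n + lam * np.eye(d)   # sketched Hessian approximation
    x -= np.linalg.solve(H_hat, grad)         # approximate Newton step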

Leader Stochastic Gradient Descent for Distributed Training of Deep Learning Models: Extension

yunfei-teng/LSGD NeurIPS 2019

Finally, we implement an asynchronous version of our algorithm and extend it to the multi-leader setting, where we form groups of workers, each represented by its own local leader (the best performer in a group), and update each worker with a corrective direction comprised of two attractive forces: one to the local, and one to the global leader (the best performer among all workers).
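
The two attractive forces described above can be sketched as follows; the toy quadratic objective, the group layout, and the coupling strengths lambda_l / lambda_g are illustrative stand-ins for the paper's deep-learning setup.

# Hedged sketch of leader-based updates: besides its own gradient step, each
# worker is pulled toward the best performer in its group (local leader) and
# toward the best performer overall (global leader).
import numpy as np

rng = np.random.default_rng(4)
dim, lr, lambda_l, lambda_g = 10, 0.05, 0.1, 0.05
groups = [[rng.normal(size=dim) for _ in range(3)] for _ in range(2)]  # 2 groups x 3 workers

def loss(w):   # toy objective standing in for the training loss
    return 0.5 * np.sum(w ** 2)

def grad(w):
    return w

for step in range(100):
    local_leaders = [min(g, key=loss) for g in groups]
    global_leader = min(local_leaders, key=loss)
    for g, leader in zip(groups, local_leaders):
        for i, w in enumerate(g):
            g[i] = (w - lr * grad(w)
                    - lambda_l * (w - leader)         # pull toward local leader
                    - lambda_g * (w - global_leader)) # pull toward global leader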

PowerSGD: Practical Low-Rank Gradient Compression for Distributed Optimization

epfml/powersgd NeurIPS 2019

We study gradient compression methods to alleviate the communication bottleneck in data-parallel distributed optimization.
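
A hedged sketch of rank-r gradient compression in the spirit of PowerSGD: one power-iteration step produces small factors P and Q that are cheap to communicate, and Q is reused (warm-started) across steps. Error feedback and the all-reduce of the factors are omitted here; sizes are illustrative.

# Low-rank gradient compression via a single power-iteration step.
import numpy as np

rng = np.random.default_rng(5)
rows, cols, r = 256, 128, 4
Q = rng.normal(size=(cols, r))             # reused (warm-started) across steps

def compress_decompress(M, Q):
    P = M @ Q                              # (rows x r): communicate P
    P, _ = np.linalg.qr(P)                 # orthogonalize
    Q_new = M.T @ P                        # (cols x r): communicate Q_new
    return P @ Q_new.T, Q_new              # rank-r approximation of M

M = rng.normal(size=(rows, cols))          # stand-in for a layer's gradient
M_hat, Q = compress_decompress(M, Q)
compression = (rows * r + cols * r) / (rows * cols)
print(f"communicated fraction: {compression:.3f}")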

Trading Redundancy for Communication: Speeding up Distributed SGD for Non-convex Optimization

mmkamani7/RI-SGD ICML 2019

Communication overhead is one of the key challenges that hinder the scalability of distributed optimization algorithms to train large neural networks.

Federated Learning: Challenges, Methods, and Future Directions

AshwinRJ/Federated-Learning-PyTorch 21 Aug 2019

Federated learning involves training statistical models over remote devices or siloed data centers, such as mobile phones or hospitals, while keeping data localized.
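
A canonical algorithm in this setting is federated averaging: in each round a sampled subset of clients trains locally on its private data, and only model weights, never raw data, are sent back and averaged. The sketch below is a minimal single-process simulation with a toy linear model; the client count, local steps, and learning rate are illustrative.

# Hedged sketch of federated averaging with simulated clients.
import numpy as np

rng = np.random.default_rng(6)
num_clients, dim, local_steps, lr = 10, 10, 5, 0.05
clients = [(rng.normal(size=(100, dim)), rng.normal(size=100)) for _ in range(num_clients)]

def local_update(w, data):
    A, b = data
    w = w.copy()
    for _ in range(local_steps):                 # local training on private data
        w -= lr * A.T @ (A @ w - b) / len(b)
    return w

w_global = np.zeros(dim)
for rnd in range(20):
    sampled = rng.choice(num_clients, size=5, replace=False)
    updates = [local_update(w_global, clients[i]) for i in sampled]
    w_global = np.mean(updates, axis=0)          # server averages client models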

Communication-Efficient Distributed Optimization in Networks with Gradient Tracking and Variance Reduction

liboyue/Network-Distributed-Algorithm 12 Sep 2019

There is growing interest in large-scale machine learning and optimization over decentralized networks, e.g. in the context of multi-agent learning and federated learning.
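
Gradient tracking, one ingredient named in the title, can be sketched as follows (the paper's variance reduction is omitted): each agent mixes iterates with its neighbors through a doubly stochastic matrix W and maintains a tracker of the average gradient. The ring topology, toy least-squares data, and step size are illustrative.

# Hedged sketch of decentralized gradient tracking over a ring network.
import numpy as np

rng = np.random.default_rng(7)
n_agents, dim, alpha = 5, 10, 0.02
data = [(rng.normal(size=(80, dim)), rng.normal(size=80)) for _ in range(n_agents)]

def grad(i, x):
    A, b = data[i]
    return A.T @ (A @ x - b) / len(b)

# Doubly stochastic mixing matrix for a ring: self-weight 1/2, neighbors 1/4.
W = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    W[i, i] = 0.5
    W[i, (i - 1) % n_agents] = 0.25
    W[i, (i + 1) % n_agents] = 0.25

X = np.zeros((n_agents, dim))
G = np.stack([grad(i, X[i]) for i in range(n_agents)])
Y = G.copy()                                   # tracker initialized to local gradients

for k in range(300):
    X_new = W @ X - alpha * Y                  # consensus step + tracked-gradient step
    G_new = np.stack([grad(i, X_new[i]) for i in range(n_agents)])
    Y = W @ Y + G_new - G                      # track the average gradient
    X, G = X_new, G_new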