Search Results for author: Giorgi Nadiradze

Found 8 papers, 2 papers with code

Hybrid Decentralized Optimization: First- and Zeroth-Order Optimizers Can Be Jointly Leveraged For Faster Convergence

no code implementations • 14 Oct 2022 • Shayan Talaei, Giorgi Nadiradze, Dan Alistarh

Distributed optimization has become one of the standard ways of speeding up machine learning training, and most of the research in the area focuses on distributed first-order, gradient-based methods.

Distributed Optimization
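
The entry above contrasts first-order (gradient-based) optimizers with zeroth-order ones that use only function evaluations. As a rough illustration only, and not code from the paper, the sketch below compares a plain gradient step with a two-point zeroth-order estimate on a toy quadratic; all names and constants are made up for the example.

```python
# Hypothetical toy example (not from the paper): a first-order gradient step
# versus a two-point zeroth-order estimate on f(x) = ||x||^2 / 2.
import numpy as np

rng = np.random.default_rng(0)

def loss(x):
    return 0.5 * np.sum(x ** 2)

def first_order_grad(x):
    return x  # exact gradient of the toy objective

def zeroth_order_grad(x, mu=1e-3):
    # Two-point finite-difference estimator: uses only function evaluations.
    u = rng.standard_normal(x.shape)
    return (loss(x + mu * u) - loss(x - mu * u)) / (2 * mu) * u

x_fo = rng.standard_normal(10)
x_zo = x_fo.copy()
lr = 0.1
for _ in range(200):
    x_fo = x_fo - lr * first_order_grad(x_fo)   # gradient-based update
    x_zo = x_zo - lr * zeroth_order_grad(x_zo)  # gradient-free update

print(f"first-order loss:  {loss(x_fo):.2e}")
print(f"zeroth-order loss: {loss(x_zo):.2e}")
```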

Communication-Efficient Federated Learning With Data and Client Heterogeneity

no code implementations • 20 Jun 2022 • Hossein Zakerinia, Shayan Talaei, Giorgi Nadiradze, Dan Alistarh

Federated Learning (FL) enables large-scale distributed training of machine learning models, while still allowing individual nodes to maintain data locally.

Federated Learning
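
For readers unfamiliar with the setting described above, here is a minimal, generic FedAvg-style toy of federated training with heterogeneous client data. It is an assumption-laden illustration of federated learning in general, not the communication-efficient protocol proposed in the paper; all constants and helper names are invented.

```python
# Generic FedAvg-style toy (an assumption, not the paper's protocol): clients
# keep their data local and only exchange model parameters with the server.
import numpy as np

rng = np.random.default_rng(0)
d, n_clients = 5, 4

# Heterogeneous local data: each client holds its own least-squares problem.
client_data = []
for _ in range(n_clients):
    A = rng.standard_normal((20, d))
    client_data.append((A, A @ rng.standard_normal(d)))

def local_sgd(w, A, b, steps=10, lr=0.05):
    # A few local gradient steps on one client's private data.
    for _ in range(steps):
        w = w - lr * A.T @ (A @ w - b) / len(b)
    return w

w_global = np.zeros(d)
for _ in range(30):                              # communication rounds
    updates = [local_sgd(w_global, A, b) for A, b in client_data]
    w_global = np.mean(updates, axis=0)          # server averages client models

avg_loss = np.mean([np.mean((A @ w_global - b) ** 2) for A, b in client_data])
print(f"average client loss after 30 rounds: {avg_loss:.4f}")
```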

Efficiency Guarantees for Parallel Incremental Algorithms under Relaxed Schedulers

1 code implementation • 20 Mar 2020 • Dan Alistarh, Nikita Koval, Giorgi Nadiradze

We show that, for algorithms such as Delaunay mesh triangulation and sorting by insertion, schedulers with a maximum relaxation factor of $k$ in terms of the maximum priority inversion allowed will introduce a maximum amount of wasted work of $O(\log(n) \cdot \mathrm{poly}(k))$, where $n$ is the number of tasks to be executed.

Data Structures and Algorithms • Distributed, Parallel, and Cluster Computing
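
As a hedged illustration of the relaxed-scheduler model above (not the authors' implementation), the sketch below models a scheduler with relaxation factor k that may return any of the k highest-priority pending tasks, which is what bounds the priority inversion per removal.

```python
# Hypothetical k-relaxed scheduler (an illustration, not the paper's code):
# remove() may return any of the k highest-priority pending tasks, so the
# priority inversion per removal is bounded by k.
import heapq
import random

class RelaxedScheduler:
    def __init__(self, k, seed=0):
        self.k = k
        self.heap = []
        self.rng = random.Random(seed)

    def insert(self, priority, task):
        heapq.heappush(self.heap, (priority, task))

    def remove(self):
        # Pop up to k best tasks, hand back a random one, restore the rest.
        window = [heapq.heappop(self.heap)
                  for _ in range(min(self.k, len(self.heap)))]
        chosen = window.pop(self.rng.randrange(len(window)))
        for item in window:
            heapq.heappush(self.heap, item)
        return chosen

sched = RelaxedScheduler(k=4)
for p in random.Random(1).sample(range(20), 20):
    sched.insert(p, f"task-{p}")
print([sched.remove() for _ in range(8)])  # roughly, not exactly, priority order
```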

Elastic Consistency: A General Consistency Model for Distributed Stochastic Gradient Descent

no code implementations • 16 Jan 2020 • Giorgi Nadiradze, Ilia Markov, Bapi Chatterjee, Vyacheslav Kungurtsev, Dan Alistarh

Our framework, called elastic consistency, enables us to derive convergence bounds for a variety of distributed SGD methods used in practice to train large-scale machine learning models.

BIG-bench Machine Learning
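
The exact elastic consistency condition is defined in the paper; the toy sketch below only conveys the flavor: SGD where each gradient is evaluated at an inconsistent "view" whose distance from the true parameter is bounded by a constant B. The symbols and constants here are illustrative assumptions, not the paper's definitions.

```python
# Toy illustration only (the paper's elastic consistency condition is more
# general): SGD still converges when each gradient is evaluated at a perturbed
# "view" whose distance to the true parameter is bounded by a constant B.
import numpy as np

rng = np.random.default_rng(0)
d, B, lr = 10, 0.05, 0.1
x = rng.standard_normal(d)

def grad(v):
    return v  # gradient of the toy objective f(x) = ||x||^2 / 2

for _ in range(300):
    noise = rng.standard_normal(d)
    view = x + B * noise / np.linalg.norm(noise)  # inconsistent view, ||view - x|| <= B
    x = x - lr * grad(view)                       # the update uses the view, not x

print(f"final loss: {0.5 * np.dot(x, x):.4f}")    # settles within O(B) of the optimum
```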

Asynchronous Decentralized SGD with Quantized and Local Updates

no code implementations • NeurIPS 2021 • Giorgi Nadiradze, Amirmojtaba Sabour, Peter Davies, Shigang Li, Dan Alistarh

Perhaps surprisingly, we show that a variant of SGD called SwarmSGD still converges in this setting, even if non-blocking communication, quantization, and local steps are all applied in conjunction, and even if the node data distributions and underlying graph topology are both heterogeneous.

Blocking • Distributed Optimization +2
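
The following is a simplified, sequential sketch in the spirit of that description, not the authors' SwarmSGD implementation: random pairs of nodes take a few local SGD steps on heterogeneous data and then average quantized copies of their models. The quantizer and all constants are placeholders.

```python
# Simplified sequential sketch (not the authors' SwarmSGD): random pairs of
# nodes take local SGD steps on heterogeneous data, then average quantized
# copies of their models.
import numpy as np

rng = np.random.default_rng(0)
n_nodes, d, lr, local_steps = 8, 5, 0.05, 4

centers = rng.standard_normal((n_nodes, d))   # node i minimizes ||x - c_i||^2 / 2
models = rng.standard_normal((n_nodes, d))

def quantize(x, levels=16):
    # Crude uniform quantizer standing in for real model/gradient compression.
    scale = np.max(np.abs(x)) + 1e-12
    return np.round(x / scale * levels) / levels * scale

for _ in range(2000):
    i, j = rng.choice(n_nodes, size=2, replace=False)    # random interacting pair
    for node in (i, j):
        for _ in range(local_steps):                     # local steps, no communication
            models[node] -= lr * (models[node] - centers[node])
    avg = 0.5 * (quantize(models[i]) + quantize(models[j]))
    models[i] = models[j] = avg                          # quantized pairwise averaging

print("distance of average model to global optimum:",
      np.linalg.norm(models.mean(axis=0) - centers.mean(axis=0)))
```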

PopSGD: Decentralized Stochastic Gradient Descent in the Population Model

no code implementations • 25 Sep 2019 • Giorgi Nadiradze, Amirmojtaba Sabour, Aditya Sharma, Ilia Markov, Vitaly Aksenov, Dan Alistarh

We prove that, under standard assumptions, SGD can converge even in this extremely loose, decentralized setting, for both convex and non-convex objectives.

Distributed Optimization • Scheduling

The Power of Choice in Priority Scheduling

1 code implementation • 13 Jun 2017 • Dan Alistarh, Justin Kopinsky, Jerry Li, Giorgi Nadiradze

We answer this question, showing that this strategy provides surprisingly strong guarantees: although the cost of the single-choice process, where we always insert into and remove from a single randomly chosen queue, degrades and goes to infinity as the number of steps increases, in the two-choice process the expected rank of a removed element is $O(n)$, while the expected worst-case cost is $O(n \log n)$.

Data Structures and Algorithms • Distributed, Parallel, and Cluster Computing
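
One plausible reading of the two-choice process, sketched below purely for illustration and not as the paper's concurrent data structure: elements are inserted into a uniformly random queue, and each removal inspects two random queues and pops from the one with the higher-priority head. All names here are hypothetical.

```python
# Hypothetical sketch of the two-choice process (not the paper's concurrent
# implementation): insert into one random queue, remove from the better of two.
import heapq
import random

rng = random.Random(0)
n_queues = 8
queues = [[] for _ in range(n_queues)]

def insert(priority):
    heapq.heappush(queues[rng.randrange(n_queues)], priority)

def remove_two_choice():
    a, b = rng.sample(range(n_queues), 2)          # two random queues
    candidates = [q for q in (queues[a], queues[b]) if q]
    if not candidates:
        return None
    best = min(candidates, key=lambda q: q[0])     # compare the two heads
    return heapq.heappop(best)

for p in range(1000):
    insert(p)
print([remove_two_choice() for _ in range(10)])    # near, not exactly, the 10 smallest
```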
