Search Results for author: Peter Richtarik

Found 20 papers, 2 papers with code

From Local SGD to Local Fixed Point Methods for Federated Learning

no code implementations ICML 2020 Grigory Malinovsky, Dmitry Kovalev, Elnur Gasanov, Laurent Condat, Peter Richtarik

Most algorithms for solving optimization problems or finding saddle points of convex-concave functions are fixed point algorithms.
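The fixed-point view the abstract refers to can be illustrated with a minimal sketch (not the paper's method): gradient descent on a smooth convex function f is the fixed-point iteration x_{k+1} = T(x_k) with T(x) = x - γ∇f(x), and minimizers of f are exactly the fixed points of T.

```python
import numpy as np

def fixed_point_iterate(T, x0, iters=100):
    """Repeatedly apply the operator T starting from x0."""
    x = x0
    for _ in range(iters):
        x = T(x)
    return x

# Example: f(x) = 0.5 * ||x - b||^2, whose unique minimizer (and the
# unique fixed point of T) is x = b.
b = np.array([1.0, -2.0, 3.0])
grad_f = lambda x: x - b
T = lambda x: x - 0.5 * grad_f(x)  # step size gamma = 0.5

x_star = fixed_point_iterate(T, np.zeros(3))
```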

Federated Learning

Acceleration for Compressed Gradient Descent in Distributed Optimization

no code implementations ICML 2020 Zhize Li, Dmitry Kovalev, Xun Qian, Peter Richtarik

Due to the high communication cost in distributed and federated learning problems, methods relying on sparsification or quantization of communicated messages are becoming increasingly popular.
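One standard sparsification operator of the kind the abstract mentions (illustrative; not necessarily the compressor analyzed in this paper) is top-k: each worker keeps only the k largest-magnitude entries of its gradient message and zeros out the rest, communicating k values plus k indices instead of the full vector.

```python
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude entries of v, zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]  # indices of k largest |v_i|
    out[idx] = v[idx]
    return out

g = np.array([0.1, -3.0, 0.5, 2.0, -0.2])
compressed = top_k(g, 2)   # keeps -3.0 and 2.0, zeros the rest
```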

Distributed Optimization Federated Learning +1

Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex Decentralized Optimization Over Time-Varying Networks

no code implementations NeurIPS 2021 Dmitry Kovalev, Elnur Gasanov, Alexander Gasnikov, Peter Richtarik

We consider the task of minimizing the sum of smooth and strongly convex functions stored in a decentralized manner across the nodes of a communication network whose links are allowed to change in time.

Optimal Client Sampling for Federated Learning

1 code implementation NeurIPS 2021 Wenlin Chen, Samuel Horvath, Peter Richtarik

We show that importance can be measured using only the norm of the update and give a formula for optimal client participation.
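A hedged sketch of the sampling idea in the sentence above (an assumed form, not the paper's exact formula): select clients with probability proportional to the norm of their updates, so clients whose updates carry more signal are more likely to be sampled.

```python
import numpy as np

def sampling_probabilities(updates):
    """Client sampling probabilities proportional to update norms."""
    norms = np.array([np.linalg.norm(u) for u in updates])
    return norms / norms.sum()

updates = [np.array([3.0, 4.0]),   # norm 5
           np.array([0.0, 5.0]),   # norm 5
           np.array([6.0, 8.0])]   # norm 10
p = sampling_probabilities(updates)
```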

Federated Learning

Variance-Reduced Methods for Machine Learning

no code implementations 2 Oct 2020 Robert M. Gower, Mark Schmidt, Francis Bach, Peter Richtarik

Stochastic optimization lies at the heart of machine learning, and its cornerstone is stochastic gradient descent (SGD), a method introduced over 60 years ago.

Stochastic Optimization

Adaptive Learning of the Optimal Batch Size of SGD

no code implementations 3 May 2020 Motasem Alfarra, Slavomir Hanzely, Alyazeed Albasyoni, Bernard Ghanem, Peter Richtarik

Recent advances in the theoretical understanding of SGD led to a formula for the optimal batch size minimizing the number of effective data passes, i.e., the number of iterations times the batch size.

RSN: Randomized Subspace Newton

no code implementations NeurIPS 2019 Robert Gower, Dmitry Kovalev, Felix Lieder, Peter Richtarik

We develop a randomized Newton method capable of solving learning problems with huge dimensional feature spaces, which is a common setting in applications such as medical imaging, genomics and seismology.

Natural Compression for Distributed Deep Learning

no code implementations27 May 2019 Samuel Horvath, Chen-Yu Ho, Ludovit Horvath, Atal Narayan Sahu, Marco Canini, Peter Richtarik

Our technique is applied individually to all entries of the to-be-compressed update vector and works by randomized rounding to the nearest (negative or positive) power of two, which can be computed in a "natural" way by ignoring the mantissa.
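The rounding rule described above can be sketched as follows (a simplified reading of the sentence, not the paper's reference implementation): each entry is rounded, at random, to one of the two nearest powers of two, with probabilities chosen so the compression is unbiased in expectation, and the sign is preserved.

```python
import numpy as np

def natural_round(x, rng):
    """Randomized rounding of x to the nearest power of two (unbiased)."""
    if x == 0.0:
        return 0.0
    s, a = np.sign(x), np.abs(x)
    lo = 2.0 ** np.floor(np.log2(a))   # nearest power of two below |x|
    hi = 2.0 * lo                      # nearest power of two above |x|
    p = (a - lo) / (hi - lo)           # round up with this prob => unbiased
    return s * (hi if rng.random() < p else lo)

rng = np.random.default_rng(0)
vec = np.array([0.75, -5.0, 2.0])
compressed = np.array([natural_round(v, rng) for v in vec])
```

Entries that are already powers of two (like 2.0) are left unchanged, since the round-up probability is zero there.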

Quantization

SGD: General Analysis and Improved Rates

no code implementations27 Jan 2019 Robert Mansel Gower, Nicolas Loizou, Xun Qian, Alibek Sailanbayev, Egor Shulgin, Peter Richtarik

By specializing our theorem to different mini-batching strategies, such as sampling with replacement and independent sampling, we derive exact expressions for the stepsize as a function of the mini-batch size.

Don't Jump Through Hoops and Remove Those Loops: SVRG and Katyusha are Better Without the Outer Loop

no code implementations24 Jan 2019 Dmitry Kovalev, Samuel Horvath, Peter Richtarik

A key structural element in both of these methods is the inclusion of an outer loop at the beginning of which a full pass over the training data is made in order to compute the exact gradient, which is then used to construct a variance-reduced estimator of the gradient.
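The variance-reduced estimator the passage refers to has the standard SVRG form: at a reference point w, the full gradient ∇f(w) is computed once; each inner step then samples a data point i and uses g = ∇f_i(x) − ∇f_i(w) + ∇f(w), which is unbiased and whose variance shrinks as x approaches w. A minimal sketch on a toy least-squares problem:

```python
import numpy as np

def svrg_estimator(grad_i, x, w, full_grad_w, i):
    """Unbiased variance-reduced gradient estimate at x."""
    return grad_i(x, i) - grad_i(w, i) + full_grad_w

# Toy objective: f(x) = (1/n) * sum_i 0.5 * (a_i @ x - b_i)^2
A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])
grad_i = lambda x, i: (A[i] @ x - b[i]) * A[i]
full_grad = lambda x: A.T @ (A @ x - b) / len(b)

w = np.zeros(2)                 # reference point (full gradient known here)
x = np.array([0.5, 0.5])        # current iterate
g = svrg_estimator(grad_i, x, w, full_grad(w), i=1)
```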

SEGA: Variance Reduction via Gradient Sketching

no code implementations NeurIPS 2018 Filip Hanzely, Konstantin Mishchenko, Peter Richtarik

In each iteration, SEGA updates the current estimate of the gradient through a sketch-and-project operation using the information provided by the latest sketch, and this is subsequently used to compute an unbiased estimate of the true gradient through a random relaxation procedure.

Randomized Block Cubic Newton Method

no code implementations ICML 2018 Nikita Doikov, Peter Richtarik

To this effect we propose and analyze a randomized block cubic Newton (RBCN) method, which in each iteration builds a model of the objective function formed as the sum of the natural models of its three components: a linear model with a quadratic regularizer for the differentiable term, a quadratic model with a cubic regularizer for the twice differentiable term, and perfect (proximal) model for the nonsmooth term.

Weighted Low-Rank Approximation of Matrices and Background Modeling

no code implementations15 Apr 2018 Aritra Dutta, Xin Li, Peter Richtarik

We primarily study a special weighted low-rank approximation of matrices and then apply it to solve the background modeling problem.

Frame

Online and Batch Supervised Background Estimation via L1 Regression

no code implementations23 Nov 2017 Aritra Dutta, Peter Richtarik

We propose a surprisingly simple model for supervised video background estimation.

Quartz: Randomized Dual Coordinate Ascent with Arbitrary Sampling

no code implementations NeurIPS 2015 Zheng Qu, Peter Richtarik, Tong Zhang

We study the problem of minimizing the average of a large number of smooth convex functions penalized with a strongly convex regularizer.

Matrix Completion under Interval Uncertainty

no code implementations11 Aug 2014 Jakub Marecek, Peter Richtarik, Martin Takac

Matrix completion under interval uncertainty can be cast as matrix completion with element-wise box constraints.
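The element-wise box constraint mentioned above can be handled by projection, sketched here under assumed notation (this illustrates only the constraint-handling, not the paper's algorithm): each projected-gradient step fits the observed entries, then clips every entry of the iterate into the interval [L, U].

```python
import numpy as np

def project_box(X, L, U):
    """Element-wise projection onto the box {X : L <= X <= U}."""
    return np.clip(X, L, U)

def pg_step(X, M, mask, L, U, step=1.0):
    """One projected-gradient step on 0.5 * ||mask * (X - M)||_F^2."""
    grad = mask * (X - M)
    return project_box(X - step * grad, L, U)

M = np.array([[1.0, 2.0], [3.0, 4.0]])       # partially observed matrix
mask = np.array([[1.0, 0.0], [1.0, 1.0]])    # 1 = observed entry
L, U = 0.0, 3.5                               # interval bounds
X = pg_step(np.zeros((2, 2)), M, mask, L, U)
```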

Collaborative Filtering Matrix Completion

Separable Approximations and Decomposition Methods for the Augmented Lagrangian

no code implementations30 Aug 2013 Rachael Tappenden, Peter Richtarik, Burak Buke

In this paper we study decomposition methods based on separable approximations for minimizing the augmented Lagrangian.
