no code implementations • 4 Dec 2012 • Peter Richtárik, Martin Takáč
In this work we show that randomized (block) coordinate descent methods can be accelerated by parallelization when applied to the problem of minimizing the sum of a partially separable smooth convex function and a simple separable convex function.
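For intuition, here is a minimal sketch (not the paper's exact PCDM algorithm) of the setting: randomized coordinate descent on $f(x) + \lambda\|x\|_1$ with $f(x) = \tfrac{1}{2}\|Ax-b\|^2$, where a random block of $\tau$ coordinates is updated per iteration. The safeguard $\beta = \tau$ is a deliberately conservative choice; the paper's ESO analysis derives sharper constants from the partial separability of $f$. All names (A, b, tau, n) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, lam, tau = 200, 50, 0.1, 8
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
L = (A ** 2).sum(axis=0)   # per-coordinate Lipschitz constants
beta = tau                 # conservative safeguard; PCDM derives sharper
                           # beta from the partial separability of f

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

x = np.zeros(n)
for _ in range(3000):
    S = rng.choice(n, size=tau, replace=False)   # random coordinate block
    g = A[:, S].T @ (A @ x - b)                  # block partial gradient
    # prox-steps on the sampled coordinates; in a real parallel solver
    # these updates would run on separate processors
    x[S] = soft_threshold(x[S] - g / (beta * L[S]), lam / (beta * L[S]))

print("objective:", 0.5 * np.linalg.norm(A @ x - b) ** 2 + lam * np.abs(x).sum())
```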
1 code implementation • 17 Dec 2012 • Peter Richtárik, Majid Jahani, Selin Damla Ahipaşaoğlu, Martin Takáč
Given a multivariate data set, sparse principal component analysis (SPCA) aims to extract several linear combinations of the variables that together explain the variance in the data as much as possible, while controlling the number of nonzero loadings in these combinations.
no code implementations • 8 Oct 2013 • Peter Richtárik, Martin Takáč
In this paper we develop and analyze Hydra: HYbriD cooRdinAte descent method for solving loss minimization problems with big data.
no code implementations • 13 Oct 2013 • Peter Richtárik, Martin Takáč
We propose and analyze a new parallel coordinate descent method, `NSync, in which at each iteration a random subset of coordinates is updated, in parallel, allowing the subsets to be chosen non-uniformly.
no code implementations • 4 Nov 2013 • Martin Takáč, Selin Damla Ahipaşaoğlu, Ngai-Man Cheung, Peter Richtárik
Our approach attacks the maximization problem in sparse PCA directly and is scalable to high-dimensional data.
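For flavor, a minimal sketch of one standard way to attack the sparse PCA maximization $\max_x x^\top \Sigma x$ subject to $\|x\|_0 \le k$, $\|x\|_2 = 1$ directly: truncated power iteration that keeps only the $k$ largest loadings. This is an illustrative stand-in, not the authors' exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 40))
Sigma = np.cov(X, rowvar=False)

def sparse_pc(Sigma, k, iters=100):
    x = rng.standard_normal(Sigma.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(iters):
        y = Sigma @ x                       # power step
        idx = np.argsort(np.abs(y))[:-k]    # indices of the smallest loadings
        y[idx] = 0.0                        # hard-threshold to k nonzeros
        x = y / np.linalg.norm(y)
    return x

x = sparse_pc(Sigma, k=5)
print("nonzeros:", np.count_nonzero(x), "explained variance:", x @ Sigma @ x)
```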
no code implementations • 21 May 2014 • Olivier Fercoq, Zheng Qu, Peter Richtárik, Martin Takáč
We propose an efficient distributed randomized coordinate descent method for minimizing regularized non-strongly convex loss functions.
no code implementations • NeurIPS 2014 • Martin Jaggi, Virginia Smith, Martin Takáč, Jonathan Terhorst, Sanjay Krishnan, Thomas Hofmann, Michael I. Jordan
Communication remains the most significant bottleneck in the performance of distributed optimization algorithms for large-scale machine learning.
no code implementations • 17 Oct 2014 • Jakub Konečný, Jie Liu, Peter Richtárik, Martin Takáč
Our method first performs a deterministic step (computation of the gradient of the objective function at the starting point), followed by a large number of stochastic steps.
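A minimal sketch of this deterministic-plus-stochastic pattern (in the spirit of S2GD/SVRG) on least squares; the inner-loop length and step size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 20
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def grad_i(w, i):                      # gradient of the i-th loss term
    return A[i] * (A[i] @ w - b[i])

w = np.zeros(d)
eta, inner = 0.01, 2 * n
for epoch in range(30):
    mu = A.T @ (A @ w - b) / n         # deterministic step: full gradient
    w_ref = w.copy()
    for _ in range(inner):             # many cheap stochastic steps
        i = rng.integers(n)
        v = grad_i(w, i) - grad_i(w_ref, i) + mu   # variance-reduced estimate
        w -= eta * v
print("grad norm:", np.linalg.norm(A.T @ (A @ w - b)) / n)
```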
no code implementations • 8 Feb 2015 • Zheng Qu, Peter Richtárik, Martin Takáč, Olivier Fercoq
We propose a new algorithm for minimizing regularized empirical loss: Stochastic Dual Newton Ascent (SDNA).
1 code implementation • 12 Feb 2015 • Chenxin Ma, Virginia Smith, Martin Jaggi, Michael I. Jordan, Peter Richtárik, Martin Takáč
Distributed optimization methods for large-scale machine learning suffer from a communication bottleneck.
no code implementations • 16 Apr 2015 • Jakub Konečný, Jie Liu, Peter Richtárik, Martin Takáč
Our method first performs a deterministic step (computation of the gradient of the objective function at the starting point), followed by a large number of stochastic steps.
no code implementations • 8 Jun 2015 • Chenxin Ma, Rachael Tappenden, Martin Takáč
We show that both the famous SDCA algorithm for optimizing the SVM dual problem and the stochastic coordinate descent method for the LASSO problem fit into the framework of RC-FDM.
no code implementations • 29 Jul 2015 • Martin Takáč, Peter Richtárik, Nathan Srebro
We present an improved analysis of mini-batched stochastic dual coordinate ascent for regularized empirical loss minimization (i.e., SVM and SVM-type objectives).
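For context, a minimal sketch of mini-batched SDCA on ridge regression (squared loss), where each dual coordinate update has a closed form. The conservative $1/b$ scaling of the batch step below is an illustrative safeguard; the point of the improved analysis is precisely that sharper, data-dependent mini-batch step sizes are safe.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam, batch = 400, 30, 0.1, 16
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

alpha = np.zeros(n)
w = X.T @ alpha / (lam * n)            # primal iterate maintained from duals
norms = (X ** 2).sum(axis=1)

for _ in range(3000):
    S = rng.choice(n, size=batch, replace=False)
    # closed-form serial step for each dual coordinate, scaled by 1/batch
    delta = (y[S] - X[S] @ w - alpha[S]) / (1.0 + norms[S] / (lam * n)) / batch
    alpha[S] += delta
    w += X[S].T @ delta / (lam * n)

primal = 0.5 * ((X @ w - y) ** 2).mean() + 0.5 * lam * w @ w
print("primal objective:", primal)
```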
no code implementations • 22 Oct 2015 • Chenxin Ma, Martin Takáč
In this paper we study the effect of the way that the data is partitioned in distributed optimization.
no code implementations • 22 Oct 2015 • Xi He, Martin Takáč
This work is motivated by recent work of Shai Shalev-Shwartz on the dual-free SDCA method; however, we allow a non-uniform selection of "dual" coordinates in SDCA.
1 code implementation • 13 Dec 2015 • Chenxin Ma, Jakub Konečný, Martin Jaggi, Virginia Smith, Michael I. Jordan, Peter Richtárik, Martin Takáč
To this end, we present a framework for distributed optimization that both allows the flexibility of arbitrary solvers to be used on each (single) machine locally, and yet maintains competitive performance against other state-of-the-art special-purpose distributed methods.
no code implementations • 16 Feb 2016 • Celestine Dünner, Simone Forte, Martin Takáč, Martin Jaggi
We propose an algorithm-independent framework to equip existing optimization methods with primal-dual certificates.
no code implementations • 16 Mar 2016 • Chenxin Ma, Martin Takáč
In this paper we study an inexact damped Newton method implemented in a distributed environment.
no code implementations • NeurIPS 2016 • Albert S. Berahas, Jorge Nocedal, Martin Takáč
The question of how to parallelize the stochastic gradient descent (SGD) method has received much attention in the literature.
no code implementations • 2 Jun 2016 • Xi He, Dheevatsa Mudigere, Mikhail Smelyanskiy, Martin Takáč
Training a deep neural network is a high-dimensional and highly non-convex optimization problem.
2 code implementations • 7 Jul 2016 • Afshin Oroojlooyjadid, Lawrence Snyder, Martin Takáč
However, approximating the probability distribution is not easy and is prone to error; therefore, the resulting solution to the newsvendor problem may not be optimal.
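To make the contrast concrete, a small sketch (with assumed unit costs cu and co) comparing the plug-in approach, fitting a distribution and using its quantile, against minimizing the empirical newsvendor cost directly, which reduces to an empirical quantile at cu/(cu+co).

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
demand = rng.lognormal(mean=3.0, sigma=0.5, size=1000)   # historical demand
cu, co = 4.0, 1.0                                        # unit shortage/holding cost

def empirical_cost(q, d):
    return np.mean(cu * np.maximum(d - q, 0) + co * np.maximum(q - d, 0))

# direct minimization reduces to an empirical quantile at cu / (cu + co)
q_direct = np.quantile(demand, cu / (cu + co))

# plug-in approach: fit a (possibly misspecified) normal and use its quantile
q_plugin = norm(demand.mean(), demand.std()).ppf(cu / (cu + co))

print("direct :", q_direct, empirical_cost(q_direct, demand))
print("plug-in:", q_plugin, empirical_cost(q_plugin, demand))
```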
no code implementations • ICML 2017 • Lam M. Nguyen, Jie Liu, Katya Scheinberg, Martin Takáč
In this paper, we propose a StochAstic Recursive grAdient algoritHm (SARAH), as well as its practical variant SARAH+, as a novel approach to finite-sum minimization problems.
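A minimal sketch of SARAH's recursive gradient estimate on least squares: unlike SVRG, the estimate $v_t$ is updated recursively from $v_{t-1}$ rather than recentered at a fixed snapshot. The step size and loop lengths are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 20
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def grad_i(w, i):
    return A[i] * (A[i] @ w - b[i])

w, eta = np.zeros(d), 0.01
for outer in range(20):
    v = A.T @ (A @ w - b) / n          # full gradient at the outer point
    w_prev, w = w, w - eta * v
    for _ in range(n):
        i = rng.integers(n)
        v = grad_i(w, i) - grad_i(w_prev, i) + v    # recursive update
        w_prev, w = w, w - eta * v
print("grad norm:", np.linalg.norm(A.T @ (A @ w - b) / n))
```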
no code implementations • 20 May 2017 • Lam M. Nguyen, Jie Liu, Katya Scheinberg, Martin Takáč
In this paper, we study and analyze the mini-batch version of StochAstic Recursive grAdient algoritHm (SARAH), a method employing the stochastic recursive gradient, for solving empirical loss minimization for the case of nonconvex losses.
no code implementations • 4 Jun 2017 • Peter Richtárik, Martin Takáč
We develop a family of reformulations of an arbitrary consistent linear system into a stochastic problem.
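As a concrete instance of this family, a minimal sketch of randomized Kaczmarz, which solves a consistent system $Ax = b$ by projecting onto one sampled equation per step; the norm-proportional row sampling below is one natural choice, not the paper's general distribution.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 100, 20
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)
b = A @ x_true                          # consistent linear system

x = np.zeros(n)
row_norms = (A ** 2).sum(axis=1)
probs = row_norms / row_norms.sum()     # sample rows proportionally to norm
for _ in range(5000):
    i = rng.choice(m, p=probs)
    x -= (A[i] @ x - b[i]) / row_norms[i] * A[i]   # project onto {x : a_i^T x = b_i}
print("error:", np.linalg.norm(x - x_true))
```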
no code implementations • 26 Jul 2017 • Albert S. Berahas, Martin Takáč
This paper describes an implementation of the L-BFGS method designed to deal with two adversarial situations.
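For reference, a minimal sketch of the two-loop recursion at the core of any L-BFGS implementation; the paper's mechanisms for the two adversarial situations it targets are beyond this illustration, and the fixed step below stands in for a proper line search.

```python
import numpy as np

def lbfgs_direction(grad, s_list, y_list):
    """Return -H_k @ grad using stored curvature pairs (s_i, y_i)."""
    q, alphas = grad.copy(), []
    for s, y in zip(reversed(s_list), reversed(y_list)):
        a = (s @ q) / (y @ s)
        q -= a * y
        alphas.append(a)
    if s_list:                                    # initial Hessian scaling
        s, y = s_list[-1], y_list[-1]
        q *= (s @ y) / (y @ y)
    for (s, y), a in zip(zip(s_list, y_list), reversed(alphas)):
        q += (a - (y @ q) / (y @ s)) * s
    return -q

# usage sketch on a quadratic f(x) = 0.5 ||A x - b||^2
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 10)) / np.sqrt(40)
b = rng.standard_normal(40)
x = np.zeros(10); g = A.T @ (A @ x - b)
s_list, y_list = [], []
for _ in range(30):
    d = lbfgs_direction(g, s_list[-10:], y_list[-10:])
    x_new = x + 0.5 * d                           # fixed step for the sketch
    g_new = A.T @ (A @ x_new - b)
    s_list.append(x_new - x); y_list.append(g_new - g)
    x, g = x_new, g_new
print("grad norm:", np.linalg.norm(g))
```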
no code implementations • 20 Aug 2017 • Afshin Oroojlooyjadid, MohammadReza Nazari, Lawrence Snyder, Martin Takáč
The game is a decentralized, multi-agent, cooperative problem that can be modeled as a serial supply chain network in which agents cooperatively attempt to minimize the total cost of the network even though each agent can only observe its own local information.
no code implementations • 20 Sep 2017 • Afshin Oroojlooyjadid, Lawrence Snyder, Martin Takáč
In multi-echelon inventory systems the performance of a given node is affected by events that occur at many other nodes and in many other time periods.
no code implementations • 10 Oct 2017 • Majid Jahani, Naga Venkata C. Gudapati, Chenxin Ma, Rachael Tappenden, Martin Takáč
In this work we introduce the concept of an Underestimate Sequence (UES), which is motivated by Nesterov's estimate sequence.
1 code implementation • 14 Nov 2017 • Chenxin Ma, Martin Jaggi, Frank E. Curtis, Nathan Srebro, Martin Takáč
In this paper, an accelerated variant of CoCoA+ is proposed and shown to possess a convergence rate of $\mathcal{O}(1/t^2)$ in terms of reducing suboptimality.
no code implementations • ICML 2018 • Lam M. Nguyen, Phuong Ha Nguyen, Marten van Dijk, Peter Richtárik, Katya Scheinberg, Martin Takáč
In (Bottou et al., 2016), a new analysis of convergence of SGD is performed under the assumption that stochastic gradients are bounded with respect to the true gradient norm.
4 code implementations • NeurIPS 2018 • Mohammadreza Nazari, Afshin Oroojlooy, Lawrence V. Snyder, Martin Takáč
Our model represents a parameterized stochastic policy, and by applying a policy gradient algorithm to optimize its parameters, the trained model produces the solution as a sequence of consecutive actions in real time, without the need to re-train for every new problem instance.
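A toy numpy REINFORCE sketch in the same spirit: a stochastic policy emits a solution as a sequence of actions (an open tour over a few fixed points) and a policy-gradient update with a moving-average baseline improves its parameters. The paper's attention encoder/decoder is replaced here by plain per-city logits, so this is only a structural illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
pts = rng.random((6, 2))                       # fixed city coordinates
theta = np.zeros((6, 6))                       # logits: current city -> next city

def rollout():
    tour, grads = [0], []
    while len(tour) < len(pts):
        mask = np.array([c not in tour for c in range(len(pts))])
        logits = np.where(mask, theta[tour[-1]], -np.inf)
        p = np.exp(logits - logits.max()); p /= p.sum()
        a = rng.choice(len(pts), p=p)
        g = -p; g[a] += 1.0                    # d log pi / d logits
        grads.append((tour[-1], g)); tour.append(a)
    length = sum(np.linalg.norm(pts[tour[i]] - pts[tour[i + 1]])
                 for i in range(len(tour) - 1))
    return tour, grads, -length                # reward = negative tour length

baseline, lr = 0.0, 0.5
for step in range(2000):
    tour, grads, R = rollout()
    baseline = 0.95 * baseline + 0.05 * R      # moving-average baseline
    for s, g in grads:
        theta[s] += lr * (R - baseline) * g    # REINFORCE update
print("final tour:", tour, "length:", -R)
```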
no code implementations • 28 Mar 2018 • Krishnan Kumaran, Dimitri Papageorgiou, Yutong Chang, Minhan Li, Martin Takáč
We present mixed-integer optimization approaches to find optimal distance metrics that generalize the Mahalanobis metric extensively studied in the literature.
no code implementations • 26 Oct 2018 • Majid Jahani, Xi He, Chenxin Ma, Aryan Mokhtari, Dheevatsa Mudigere, Alejandro Ribeiro, Martin Takáč
In this paper, we propose a Distributed Accumulated Newton Conjugate gradiEnt (DANCE) method in which the sample size is gradually increased to quickly obtain a solution whose empirical loss is under satisfactory statistical accuracy.
no code implementations • 10 Nov 2018 • Lam M. Nguyen, Phuong Ha Nguyen, Peter Richtárik, Katya Scheinberg, Martin Takáč, Marten van Dijk
We show the convergence of SGD for strongly convex objective function without using bounded gradient assumption when $\{\eta_t\}$ is a diminishing sequence and $\sum_{t=0}^\infty \eta_t \rightarrow \infty$.
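A minimal numerical illustration of the step-size regime in this result: $\eta_t$ diminishing with a divergent sum (here $\eta_t = 2/(t+10)$), applied to SGD on a strongly convex quadratic with additive gradient noise; no bounded-gradient assumption is used.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10
x_star = rng.standard_normal(d)
x = np.zeros(d)
for t in range(100000):
    eta = 2.0 / (t + 10)                       # diminishing, divergent sum
    g = (x - x_star) + rng.standard_normal(d)  # unbiased stochastic gradient
    x -= eta * g
print("error:", np.linalg.norm(x - x_star))
```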
no code implementations • 25 Nov 2018 • Lam M. Nguyen, Katya Scheinberg, Martin Takáč
We develop and analyze a variant of the SARAH algorithm, which does not require computation of the exact gradient.
no code implementations • 26 Jan 2019 • Konstantin Mishchenko, Eduard Gorbunov, Martin Takáč, Peter Richtárik
Our analysis of block-quantization and differences between $\ell_2$ and $\ell_{\infty}$ quantization closes the gaps in theory and practice.
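For background, a minimal sketch of the standard unbiased random-dithering quantizer with an $\ell_p$ norm, the operator whose $\ell_2$ and $\ell_{\infty}$ variants are compared; block quantization applies the same operator to sub-vectors independently. The number of levels s is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize(x, s=4, p=2):
    norm = np.linalg.norm(x, ord=(np.inf if p == "inf" else p))
    if norm == 0:
        return x.copy()
    level = np.abs(x) / norm * s
    lower = np.floor(level)
    level = lower + (rng.random(x.shape) < level - lower)  # randomized rounding
    return norm * np.sign(x) * level / s                   # unbiased: E[Q(x)] = x

x = rng.standard_normal(1000)
for p in (2, "inf"):
    err = np.mean([np.linalg.norm(quantize(x, p=p) - x) for _ in range(200)])
    print(f"ell_{p}: mean ell_2 error {err:.3f}")
```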
1 code implementation • 28 Jan 2019 • Albert S. Berahas, Majid Jahani, Peter Richtárik, Martin Takáč
We present two sampled quasi-Newton methods (sampled LBFGS and sampled LSR1) for solving empirical risk minimization problems that arise in machine learning.
1 code implementation • 13 May 2019 • Hossein K. Mousavi, MohammadReza Nazari, Martin Takáč, Nader Motee
We investigate a classification problem using multiple mobile agents capable of collecting (partial) pose-dependent observations of an unknown environment.
no code implementations • 30 May 2019 • Majid Jahani, MohammadReza Nazari, Sergey Rusakov, Albert S. Berahas, Martin Takáč
In this paper, we present a scalable distributed implementation of the Sampled Limited-memory Symmetric Rank-1 (S-LSR1) algorithm.
no code implementations • 30 May 2019 • Mohammadreza Nazari, Majid Jahani, Lawrence V. Snyder, Martin Takáč
Therefore, we propose a student-teacher RL mechanism in which the RL (the "student") learns to maximize its reward, subject to a constraint that bounds the difference between the RL policy and the "teacher" policy.
no code implementations • 20 Sep 2019 • Hossein K. Mousavi, Guangyi Liu, Weihang Yuan, Martin Takáč, Héctor Muñoz-Avila, Nader Motee
We propose a planning and perception mechanism for a robot (agent) that can only partially observe the underlying environment, in order to solve an image classification problem.
no code implementations • 28 Oct 2019 • Nur Sila Gulgec, Zheng Shi, Neil Deshmukh, Shamim Pakzad, Martin Takáč
Discovering the underlying physical behavior of complex systems is a crucial, but less well-understood topic in many engineering disciplines.
no code implementations • 20 Dec 2019 • Sélim Chraibi, Ahmed Khaled, Dmitry Kovalev, Peter Richtárik, Adil Salim, Martin Takáč
We propose basic and natural assumptions under which iterative optimization methods with compressed iterates can be analyzed.
no code implementations • 2 Jun 2020 • Zheng Shi, Nur Sila Gulgec, Albert S. Berahas, Shamim N. Pakzad, Martin Takáč
Discovering the underlying behavior of complex systems is an important topic in many science and engineering disciplines.
no code implementations • 6 Jun 2020 • Majid Jahani, MohammadReza Nazari, Rachael Tappenden, Albert S. Berahas, Martin Takáč
This work presents a new algorithm for empirical risk minimization.
no code implementations • 22 Jun 2020 • Ruben Solozabal, Josu Ceberio, Martin Takáč
This paper presents a framework to tackle constrained combinatorial optimization problems using deep Reinforcement Learning (RL).
no code implementations • 3 Jul 2020 • Soheil Sadeghi Eshkevari, Martin Takáč, Shamim N. Pakzad, Majid Jahani
Data-driven models for predicting dynamic responses of linear and nonlinear systems are of great importance due to their wide application from probabilistic analysis to inverse problems such as system identification and damage diagnosis.
no code implementations • 18 Dec 2020 • Guangyi Liu, Arash Amini, Martin Takáč, Héctor Muñoz-Avila, Nader Motee
We consider the problem of classifying a map using a team of communicating robots.
no code implementations • 19 Feb 2021 • Zheng Shi, Abdurakhmon Sadiev, Nicolas Loizou, Peter Richtárik, Martin Takáč
We present AI-SARAH, a practical variant of SARAH.
no code implementations • 14 Jun 2021 • Ekaterina Borodich, Aleksandr Beznosikov, Abdurakhmon Sadiev, Vadim Sushko, Nikolay Savelyev, Martin Takáč, Alexander Gasnikov
Personalized Federated Learning (PFL) has witnessed remarkable advancements, enabling the development of innovative machine learning applications that preserve the privacy of training data.
no code implementations • ICLR 2022 • Majid Jahani, Sergey Rusakov, Zheng Shi, Peter Richtárik, Michael W. Mahoney, Martin Takáč
We present a novel adaptive optimization algorithm for large-scale machine learning problems.
no code implementations • 26 Nov 2021 • Aleksandr Beznosikov, Martin Takáč
The StochAstic Recursive grAdient algoritHm (SARAH) is a variance-reduced variant of the Stochastic Gradient Descent (SGD) algorithm that needs a full gradient of the objective function only from time to time.
no code implementations • 1 Jun 2022 • Abdurakhmon Sadiev, Aleksandr Beznosikov, Abdulla Jasem Almansoori, Dmitry Kamzolov, Rachael Tappenden, Martin Takáč
There are several algorithms for such problems, but existing methods often work poorly when the problem is badly scaled and/or ill-conditioned, and a primary goal of this work is to introduce methods that alleviate this issue.
no code implementations • 3 Jun 2022 • Egor Gladin, Maksim Lavrik-Karmazin, Karina Zainullina, Varvara Rudenko, Alexander Gasnikov, Martin Takáč
The problem of constrained Markov decision processes is considered.
no code implementations • 16 Jun 2022 • Aleksandr Beznosikov, Aibek Alanov, Dmitry Kovalev, Martin Takáč, Alexander Gasnikov
Methods with adaptive scaling of different features play a key role in solving saddle point problems, primarily due to Adam's popularity for solving adversarial machine learning problems, including GAN training.
1 code implementation • 15 Jul 2022 • Naif Alkhunaizi, Dmitry Kamzolov, Martin Takáč, Karthik Nandakumar
Federated Learning (FL) is a promising solution that enables collaborative training through exchange of model parameters instead of raw data.
no code implementations • 17 Jul 2022 • Shuang Li, William J. Swartworth, Martin Takáč, Deanna Needell, Robert M. Gower
We take a step further and develop a method for solving the interpolation equations that uses the local second-order approximation of the model.
no code implementations • 18 Oct 2022 • Artem Agafonov, Brahim Erraji, Martin Takáč
In the recent paper FLECS (Agafonov et al., FLECS: A Federated Learning Second-Order Framework via Compression and Sketching), the second-order framework FLECS was proposed for the Federated Learning problem.
no code implementations • 2 Nov 2022 • Rachael Tappenden, Martin Takáč
This work shows that applying Gradient Descent (GD) with a fixed step size to minimize a (possibly nonconvex) quadratic function is equivalent to running the Power Method (PM) on the gradients.
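A quick numerical check of the stated equivalence: for $f(x) = \frac{1}{2}x^\top A x - b^\top x$, the GD gradients satisfy $g_{k+1} = (I - \alpha A)g_k$, i.e. they evolve under power iteration, so their Rayleigh quotient recovers an extreme eigenvalue of $A$.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
M = rng.standard_normal((d, d))
A = M @ M.T + np.eye(d)                 # symmetric positive definite
b = rng.standard_normal(d)
alpha = 0.9 / np.linalg.eigvalsh(A).max()

x = rng.standard_normal(d)
g = A @ x - b
for _ in range(200):
    x = x - alpha * g
    g_new = A @ x - b
    # power-iteration identity: g_new == (I - alpha*A) @ g
    assert np.allclose(g_new, (np.eye(d) - alpha * A) @ g)
    g = g_new

# the Rayleigh quotient of the gradients recovers the smallest eigenvalue
# of A (the dominant eigenvalue of I - alpha*A maps to it)
print("estimate:", (g @ (A @ g)) / (g @ g), "true:", np.linalg.eigvalsh(A).min())
```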
1 code implementation • 7 Dec 2022 • Abdulla Jasem Almansoori, Samuel Horváth, Martin Takáč
Federated learning has become a popular machine learning paradigm with many potential real-life applications, including recommendation systems, the Internet of Things (IoT), healthcare, and self-driving cars.
no code implementations • 2 Jan 2023 • Asma Ahmed Hashmi, Aigerim Zhumabayeva, Nikita Kotelevskii, Artem Agafonov, Mohammad Yaqub, Maxim Panov, Martin Takáč
We evaluate the proposed method on a series of classification tasks such as noisy versions of the MNIST, CIFAR-10, and Fashion-MNIST datasets, as well as CIFAR-10N, a real-world dataset with noisy human annotations.
1 code implementation • 8 Apr 2023 • Dávid Šuba, Marek Šuppa, Jozef Kubík, Endre Hamerlik, Martin Takáč
Named Entity Recognition (NER) is a fundamental NLP task with a wide range of practical applications.
1 code implementation • 3 Oct 2023 • Farshed Abdukhakimov, Chulu Xiang, Dmitry Kamzolov, Martin Takáč
Stochastic Gradient Descent (SGD) is one of the many iterative optimization methods that are widely used in solving machine learning problems.
no code implementations • 18 Dec 2023 • Nikita Kotelevskii, Samuel Horváth, Karthik Nandakumar, Martin Takáč, Maxim Panov
This paper presents a new approach to federated learning that allows selecting a model from global and personalized ones that would perform better for a particular input point.
1 code implementation • 28 Dec 2023 • Farshed Abdukhakimov, Chulu Xiang, Dmitry Kamzolov, Robert Gower, Martin Takáč
Adaptive optimization methods are widely recognized as among the most popular approaches for training Deep Neural Networks (DNNs).
no code implementations • 7 Feb 2024 • Nazarii Tupitsa, Samuel Horváth, Martin Takáč, Eduard Gorbunov
In Federated Learning (FL), the distributed nature and heterogeneity of client data present both opportunities and challenges.
no code implementations • 7 Feb 2024 • Petr Ostroukhov, Aigerim Zhumabayeva, Chulu Xiang, Alexander Gasnikov, Martin Takáč, Dmitry Kamzolov
To substantiate the efficacy of our method, we experimentally show how introducing an adaptive step size and an adaptive batch size gradually improves the performance of regular SGD.
no code implementations • 27 Mar 2024 • Nicolas Mauricio Cuadrado, Roberto Alejandro Gutierrez, Martin Takáč
The rise of renewable energy is creating new dynamics in the energy grid that promise a cleaner and more participative grid, where technology plays a crucial part in providing the flexibility required to achieve the vision of the next-generation grid.
no code implementations • 27 Mar 2024 • Yunxiang Li, Nicolas Mauricio Cuadrado, Samuel Horváth, Martin Takáč
The smart grid domain requires bolstering the capabilities of existing energy management systems. Federated Learning (FL) aligns with this goal: it demonstrates a remarkable ability to train models on heterogeneous datasets while maintaining data privacy, making it suitable for smart grid applications, which often involve disparate data distributions and interdependencies among features that hinder the suitability of linear models.