Search Results for author: Mert Gurbuzbalaban

Found 18 papers, 3 papers with code

Accelerated gradient methods for nonconvex optimization: Escape trajectories from strict saddle points and convergence to local minima

no code implementations13 Jul 2023 Rishabh Dixit, Mert Gurbuzbalaban, Waheed U. Bajwa

This work also develops two metrics of asymptotic rate of convergence and divergence, and evaluates these two metrics near strict saddle points for several popular accelerated methods such as Nesterov's accelerated gradient (NAG) and Nesterov's accelerated gradient with constant momentum (NCM).
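For intuition, the sketch below runs a constant-momentum Nesterov-style iteration on the toy strict saddle $f(z) = (z_1^2 - z_2^2)/2$; the objective, step size, and momentum value are illustrative assumptions, not the paper's setting, and the code only traces an escape trajectory rather than computing the paper's rate metrics.

```python
import numpy as np

# Strict saddle: f(z) = 0.5 * (z[0]**2 - z[1]**2), saddle point at the origin.
grad = lambda z: np.array([z[0], -z[1]])

def ncm(z0, step=0.1, momentum=0.9, iters=100):
    """Nesterov-style accelerated gradient with constant momentum (illustrative)."""
    z, z_prev = z0.copy(), z0.copy()
    for _ in range(iters):
        y = z + momentum * (z - z_prev)        # extrapolation (look-ahead) point
        z_prev, z = z, y - step * grad(y)      # gradient step at the look-ahead point
    return z

# Start slightly off the stable manifold; iterates escape along the negative-curvature direction.
print(ncm(np.array([1.0, 1e-6])))
```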

Heavy-Tail Phenomenon in Decentralized SGD

no code implementations13 May 2022 Mert Gurbuzbalaban, Yuanhan Hu, Umut Simsekli, Kun Yuan, Lingjiong Zhu

To have a more explicit control on the tail exponent, we then consider the case where the loss at each node is a quadratic, and show that the tail-index can be estimated as a function of the step-size, batch-size, and the topological properties of the network of the computational nodes.

Stochastic Optimization
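The sketch below is a minimal decentralized SGD loop with per-node quadratic losses and a ring-topology doubly stochastic mixing matrix; the topology, step-size, batch-size, and noise model are assumptions made for illustration, not the paper's estimator of the tail exponent.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, d = 4, 3

# Ring-topology doubly stochastic mixing matrix (assumed for illustration).
W = np.zeros((n_nodes, n_nodes))
for i in range(n_nodes):
    W[i, i] = 0.5
    W[i, (i - 1) % n_nodes] = 0.25
    W[i, (i + 1) % n_nodes] = 0.25

# Each node holds a quadratic loss f_i(x) = 0.5 * x^T A_i x with a noisy gradient oracle.
A = [np.diag(rng.uniform(0.5, 2.0, d)) for _ in range(n_nodes)]

def decentralized_sgd(step=0.1, batch=1, iters=1000):
    X = rng.normal(size=(n_nodes, d))                     # one iterate per node
    for _ in range(iters):
        noise = rng.normal(size=(n_nodes, d)) / np.sqrt(batch)
        G = np.stack([A[i] @ X[i] for i in range(n_nodes)]) + noise
        X = W @ X - step * G                              # gossip averaging, then local SGD step
    return X

print(decentralized_sgd().mean(axis=0))                   # network-averaged iterate near the minimum
```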

A Variance-Reduced Stochastic Accelerated Primal Dual Algorithm

no code implementations19 Feb 2022 Bugra Can, Mert Gurbuzbalaban, Necdet Serhat Aybat

In this work, we consider strongly convex strongly concave (SCSC) saddle point (SP) problems $\min_{x\in\mathbb{R}^{d_x}}\max_{y\in\mathbb{R}^{d_y}}f(x, y)$ where $f$ is $L$-smooth, $f(\cdot, y)$ is $\mu$-strongly convex for every $y$, and $f(x,\cdot)$ is $\mu$-strongly concave for every $x$.
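As a toy illustration of the SCSC setting (not the paper's variance-reduced accelerated primal-dual algorithm), the sketch below runs plain stochastic gradient descent-ascent on an assumed quadratic-bilinear saddle function.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, d = 1.0, 2

# Toy SCSC objective f(x, y) = (mu/2)||x||^2 + x^T B y - (mu/2)||y||^2 (assumed example).
B = rng.normal(size=(d, d))
gx = lambda x, y: mu * x + B @ y          # gradient of f in x
gy = lambda x, y: B.T @ x - mu * y        # gradient of f in y

def sgda(step=0.05, iters=2000, noise=0.1):
    """Stochastic gradient descent in x, ascent in y (illustrative sketch)."""
    x, y = rng.normal(size=d), rng.normal(size=d)
    for _ in range(iters):
        ex, ey = noise * rng.normal(size=d), noise * rng.normal(size=d)
        x, y = x - step * (gx(x, y) + ex), y + step * (gy(x, y) + ey)
    return x, y                            # should hover near the saddle point (0, 0)

print(sgda())
```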

Boundary Conditions for Linear Exit Time Gradient Trajectories Around Saddle Points: Analysis and Algorithm

no code implementations7 Jan 2021 Rishabh Dixit, Mert Gurbuzbalaban, Waheed U. Bajwa

This paper concerns convergence of first-order discrete methods to a local minimum of nonconvex optimization problems that comprise strict-saddle points within the geometrical landscape.

Breaking Reversibility Accelerates Langevin Dynamics for Non-Convex Optimization

no code implementations NeurIPS 2020 Xuefeng Gao, Mert Gurbuzbalaban, Lingjiong Zhu

We study two variants that are based on non-reversible Langevin diffusions: the underdamped Langevin dynamics (ULD) and the Langevin dynamics with a non-symmetric drift (NLD).
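A minimal sketch of an Euler discretization of the underdamped Langevin dynamics (ULD) on a one-dimensional double-well potential is given below; the potential, friction, inverse temperature, and step size are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
grad = lambda x: x**3 - x          # non-convex double-well potential f(x) = x^4/4 - x^2/2

def uld(x0=2.0, v0=0.0, gamma=1.0, beta=5.0, step=0.01, iters=20000):
    """Euler discretization of underdamped Langevin dynamics (illustrative sketch)."""
    x, v = x0, v0
    for _ in range(iters):
        v += step * (-gamma * v - grad(x)) + np.sqrt(2 * gamma * step / beta) * rng.normal()
        x += step * v
    return x

print(uld())                       # samples concentrate near the wells x = +/- 1
```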

Differentially Private Accelerated Optimization Algorithms

1 code implementation5 Aug 2020 Nurdan Kuru, Ş. İlker Birbil, Mert Gurbuzbalaban, Sinan Yildirim

The first algorithm is inspired by Polyak's heavy ball method and employs a smoothing approach to decrease the accumulated noise on the gradient steps required for differential privacy.
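The sketch below illustrates the general idea of a heavy-ball iteration with clipped, Gaussian-perturbed gradients in the style of differentially private optimization; the clipping threshold and noise scale are hypothetical and not calibrated to any privacy budget, and the paper's smoothing approach is not implemented here.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.diag([1.0, 10.0])                  # simple quadratic objective f(x) = 0.5 x^T A x
grad = lambda x: A @ x

def noisy_heavy_ball(x0, step=0.05, momentum=0.7, clip=1.0, sigma=0.5, iters=500):
    """Heavy ball with per-step gradient clipping and Gaussian noise (DP-style sketch)."""
    x, x_prev = x0.copy(), x0.copy()
    for _ in range(iters):
        g = grad(x)
        g = g / max(1.0, np.linalg.norm(g) / clip)        # clip the gradient norm
        g = g + sigma * rng.normal(size=g.shape)          # add Gaussian privacy-style noise
        x, x_prev = x - step * g + momentum * (x - x_prev), x
    return x

print(noisy_heavy_ball(np.array([5.0, 5.0])))
```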

The Heavy-Tail Phenomenon in SGD

1 code implementation8 Jun 2020 Mert Gurbuzbalaban, Umut Şimşekli, Lingjiong Zhu

We claim that depending on the structure of the Hessian of the loss at the minimum, and the choices of the algorithm parameters $\eta$ and $b$, the SGD iterates will converge to a \emph{heavy-tailed} stationary distribution.
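The sketch below simulates SGD on a one-dimensional least-squares problem with Gaussian data and applies a Hill-type estimator to the stationary iterates; the problem instance, the values of $\eta$ and $b$, and the estimator settings are assumptions chosen only to make the heavy-tail behavior visible.

```python
import numpy as np

rng = np.random.default_rng(0)

def sgd_iterates(eta=0.9, b=1, iters=200000):
    """1-D least squares SGD: x <- x - eta * mean_batch(a_i * (a_i * x - y_i))."""
    x = 0.0
    xs = np.empty(iters)
    for k in range(iters):
        a = rng.normal(size=b)
        y = rng.normal(size=b)            # pure-noise targets; minimum at x = 0
        x = x - eta * np.mean(a * (a * x - y))
        xs[k] = x
    return xs

def hill_tail_index(samples, k=500):
    """Hill estimator of the tail index from the k largest absolute values (rough, ignores dependence)."""
    order = np.sort(np.abs(samples))[::-1][: k + 1]
    return 1.0 / np.mean(np.log(order[:k] / order[k]))

xs = sgd_iterates()
print(hill_tail_index(xs[100000:]))       # smaller estimates indicate heavier tails
```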

Exit Time Analysis for Approximations of Gradient Descent Trajectories Around Saddle Points

no code implementations1 Jun 2020 Rishabh Dixit, Mert Gurbuzbalaban, Waheed U. Bajwa

This paper considers the problem of understanding the exit time for trajectories of gradient-related first-order methods from saddle neighborhoods under some initial boundary conditions.

Fractional moment-preserving initialization schemes for training deep neural networks

no code implementations25 May 2020 Mert Gurbuzbalaban, Yuanhan Hu

We prove that the logarithm of the norm of the network outputs, if properly scaled, will converge to a Gaussian distribution with an explicit mean and variance that can be computed in terms of the activation used, the value of $s$ chosen, and the network width.
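As a rough empirical companion (not the paper's moment-preserving initialization scheme), the sketch below draws the log output norm of a deep Gaussian-initialized ReLU network across many random initializations; the depth, width, and He-style scaling are assumed values.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_output_norm(depth=100, width=64, scale=None):
    """Log of the output norm of a random ReLU network at initialization."""
    scale = scale if scale is not None else np.sqrt(2.0 / width)   # He-style scaling (assumed)
    h = rng.normal(size=width)
    for _ in range(depth):
        W = scale * rng.normal(size=(width, width))
        h = np.maximum(W @ h, 0.0)                                  # ReLU layer
    return np.log(np.linalg.norm(h))

samples = np.array([log_output_norm() for _ in range(300)])
print(samples.mean(), samples.std())   # roughly Gaussian across random draws as depth grows
```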

Non-Convex Optimization via Non-Reversible Stochastic Gradient Langevin Dynamics

no code implementations6 Apr 2020 Yuanhan Hu, Xiaoyu Wang, Xuefeng Gao, Mert Gurbuzbalaban, Lingjiong Zhu

In this paper, we study non-reversible Stochastic Gradient Langevin Dynamics (NSGLD), which is based on the discretization of a non-reversible Langevin diffusion.

Stochastic Optimization
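The sketch below discretizes a Langevin diffusion with an added antisymmetric (non-reversible) drift term, using stochastic gradients; the objective, the antisymmetric matrix $J$, and all hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
grad = lambda x: np.array([x[0] ** 3 - x[0], x[1]])   # non-convex in the first coordinate

# Antisymmetric matrix J defining the non-reversible perturbation (J = -J^T).
J = np.array([[0.0, 1.0], [-1.0, 0.0]])

def nsgld(x0, step=0.01, beta=5.0, noise=0.1, iters=20000):
    """Non-reversible stochastic gradient Langevin dynamics (illustrative sketch)."""
    x = x0.copy()
    I = np.eye(len(x))
    for _ in range(iters):
        g = grad(x) + noise * rng.normal(size=x.shape)          # stochastic gradient
        x = x - step * (I + J) @ g + np.sqrt(2 * step / beta) * rng.normal(size=x.shape)
    return x

print(nsgld(np.array([2.0, 2.0])))
```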

Robust Distributed Accelerated Stochastic Gradient Methods for Multi-Agent Networks

no code implementations19 Oct 2019 Alireza Fallah, Mert Gurbuzbalaban, Asuman Ozdaglar, Umut Simsekli, Lingjiong Zhu

When gradients do not contain noise, we also prove that distributed accelerated methods can \emph{achieve acceleration}, requiring $\mathcal{O}(\kappa \log(1/\varepsilon))$ gradient evaluations and $\mathcal{O}(\kappa \log(1/\varepsilon))$ communications to converge to the same fixed point with the non-accelerated variant where $\kappa$ is the condition number and $\varepsilon$ is the target accuracy.

Stochastic Optimization

A Universally Optimal Multistage Accelerated Stochastic Gradient Method

no code implementations NeurIPS 2019 Necdet Serhat Aybat, Alireza Fallah, Mert Gurbuzbalaban, Asuman Ozdaglar

We study the problem of minimizing a strongly convex, smooth function when we have noisy estimates of its gradient.

Accelerated Linear Convergence of Stochastic Momentum Methods in Wasserstein Distances

no code implementations22 Jan 2019 Bugra Can, Mert Gurbuzbalaban, Lingjiong Zhu

In the special case of strongly convex quadratic objectives, we can show accelerated linear rates in the $p$-Wasserstein metric for any $p\geq 1$ with improved sensitivity to noise for both AG and HB through a non-asymptotic analysis under some additional assumptions on the noise structure.

A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks

1 code implementation18 Jan 2019 Umut Simsekli, Levent Sagun, Mert Gurbuzbalaban

The assumption that the stochastic gradient noise is Gaussian is often made for mathematical convenience, since it enables SGD to be analyzed as a stochastic differential equation (SDE) driven by a Brownian motion.

Breaking Reversibility Accelerates Langevin Dynamics for Global Non-Convex Optimization

no code implementations19 Dec 2018 Xuefeng Gao, Mert Gurbuzbalaban, Lingjiong Zhu

We study two variants that are based on non-reversible Langevin diffusions: the underdamped Langevin dynamics (ULD) and the Langevin dynamics with a non-symmetric drift (NLD).

Robust Accelerated Gradient Methods for Smooth Strongly Convex Functions

no code implementations27 May 2018 Necdet Serhat Aybat, Alireza Fallah, Mert Gurbuzbalaban, Asuman Ozdaglar

We study the trade-offs between convergence rate and robustness to gradient errors in designing a first-order algorithm.

When Cyclic Coordinate Descent Outperforms Randomized Coordinate Descent

no code implementations NeurIPS 2017 Mert Gurbuzbalaban, Asuman Ozdaglar, Pablo A. Parrilo, Nuri Vanli

The coordinate descent (CD) method is a classical optimization algorithm that has seen a revival of interest because of its competitive performance in machine learning applications.
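For a concrete (hypothetical) comparison, the sketch below runs exact coordinate minimization on a random positive-definite quadratic with both cyclic and uniformly random coordinate orders; the problem instance and epoch count are assumptions, and the outcome on a single instance does not reflect the paper's worst-case analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 20

# Quadratic objective f(x) = 0.5 * x^T A x - b^T x with A symmetric positive definite.
M = rng.normal(size=(d, d))
A = M @ M.T + np.eye(d)
b = rng.normal(size=d)

def coordinate_descent(order="cyclic", epochs=50):
    x = np.zeros(d)
    for _ in range(epochs):
        coords = range(d) if order == "cyclic" else rng.integers(0, d, size=d)
        for i in coords:
            # Exact minimization of f along coordinate i.
            x[i] = (b[i] - A[i] @ x + A[i, i] * x[i]) / A[i, i]
    return 0.5 * x @ A @ x - b @ x

print(coordinate_descent("cyclic"), coordinate_descent("random"))
```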

Avoiding Communication in Proximal Methods for Convex Optimization Problems

no code implementations24 Oct 2017 Saeed Soori, Aditya Devarakonda, James Demmel, Mert Gurbuzbalaban, Maryam Mehri Dehnavi

We formulate the algorithm for two different optimization methods on the Lasso problem and show that the latency cost is reduced by a factor of $k$ while bandwidth and floating-point operation costs remain the same.
