Search Results for author: Mert Gürbüzbalaban

Found 12 papers, 4 papers with code

Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms

no code implementations · NeurIPS 2021 · Alexander Camuto, George Deligiannidis, Murat A. Erdogdu, Mert Gürbüzbalaban, Umut Şimşekli, Lingjiong Zhu

As our main contribution, we prove that the generalization error of a stochastic optimization algorithm can be bounded based on the 'complexity' of the fractal structure that underlies its invariant measure.

Generalization Bounds · Learning Theory · +1

TENGraD: Time-Efficient Natural Gradient Descent with Exact Fisher-Block Inversion

1 code implementation · 7 Jun 2021 · Saeed Soori, Bugra Can, Baourun Mu, Mert Gürbüzbalaban, Maryam Mehri Dehnavi

This work proposes a time-efficient Natural Gradient Descent method, called TENGraD, with linear convergence guarantees.

Image Classification
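A minimal numpy sketch of the generic natural-gradient update that TENGraD builds on: precondition the gradient with the inverse of a (damped) Fisher block. The learning rate, damping value, and the toy diagonal Fisher matrix below are illustrative assumptions, not the paper's exact Fisher-block inversion scheme:

```python
import numpy as np

def natural_gradient_step(theta, grad, fisher_block, lr=0.1, damping=1e-3):
    # Natural-gradient update: precondition the gradient by the (damped)
    # inverse Fisher block, theta <- theta - lr * (F + damping*I)^{-1} grad.
    F = fisher_block + damping * np.eye(fisher_block.shape[0])
    return theta - lr * np.linalg.solve(F, grad)

# Toy example with a diagonal Fisher block, so the solve is easy to check.
theta = np.zeros(3)
grad = np.array([1.0, 2.0, 3.0])
F = np.diag([1.0, 2.0, 4.0])
theta_new = natural_gradient_step(theta, grad, F, lr=1.0, damping=0.0)
```

Solving against the Fisher block rather than inverting it explicitly is the standard numerically stable choice; TENGraD's contribution is making the block inversion exact and time-efficient.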

Convergence Rates of Stochastic Gradient Descent under Infinite Noise Variance

no code implementations · NeurIPS 2021 · Hongjian Wang, Mert Gürbüzbalaban, Lingjiong Zhu, Umut Şimşekli, Murat A. Erdogdu

In this paper, we provide convergence guarantees for SGD under a state-dependent and heavy-tailed noise with a potentially infinite variance, for a class of strongly convex objectives.
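The setting above can be simulated in a few lines: SGD on a strongly convex quadratic where the gradient noise is Student-t with fewer than 2 degrees of freedom, so its variance is infinite. The objective, step size, and noise law here are illustrative choices, not the paper's analysis:

```python
import numpy as np

rng = np.random.default_rng(0)

def sgd_heavy_tailed(x0, lr=0.01, steps=2000, df=1.5):
    # SGD on the strongly convex f(x) = x^2 / 2, with additive gradient
    # noise drawn from a Student-t distribution with df < 2, which has
    # infinite variance (the heavy-tailed regime studied in the paper).
    x = x0
    for _ in range(steps):
        x = x - lr * (x + rng.standard_t(df))
    return x

x_final = sgd_heavy_tailed(5.0)
```

Despite the infinite-variance noise, the strongly convex drift keeps the iterates from diverging, which is the phenomenon the convergence guarantees make precise.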

Asymmetric Heavy Tails and Implicit Bias in Gaussian Noise Injections

1 code implementation · 13 Feb 2021 · Alexander Camuto, Xiaoyu Wang, Lingjiong Zhu, Chris Holmes, Mert Gürbüzbalaban, Umut Şimşekli

In this paper we focus on the so-called 'implicit effect' of GNIs, which is the effect of the injected noise on the dynamics of SGD.
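For context, a Gaussian noise injection is simply a zero-mean Gaussian perturbation added to a layer's output during training. The layer shape and noise scale below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)

def forward_with_gni(x, W, sigma=0.1):
    # Gaussian noise injection (GNI): perturb the layer's pre-activation
    # with zero-mean Gaussian noise at training time.
    h = x @ W
    return h + rng.normal(scale=sigma, size=h.shape)

x = np.ones((2, 3))
W = np.zeros((3, 4))  # zero weights isolate the injected noise
y = forward_with_gni(x, W, sigma=0.1)
```

The paper's point is that, beyond this explicit perturbation, the injected noise implicitly reshapes the SGD dynamics, inducing asymmetric heavy tails.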

Decentralized Stochastic Gradient Langevin Dynamics and Hamiltonian Monte Carlo

no code implementations · 1 Jul 2020 · Mert Gürbüzbalaban, Xuefeng Gao, Yuanhan Hu, Lingjiong Zhu

Stochastic gradient Langevin dynamics (SGLD) and stochastic gradient Hamiltonian Monte Carlo (SGHMC) are two popular Markov chain Monte Carlo (MCMC) algorithms for Bayesian inference that scale to large datasets: they allow one to sample from the posterior distribution of a statistical model's parameters given the input data and the prior over the model parameters.

Bayesian Inference
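The basic (single-machine) SGLD update is a gradient step on the log-posterior plus properly scaled Gaussian noise. The sketch below targets a toy standard-Gaussian posterior; the step size and chain length are illustrative, and this is not the decentralized variant the paper studies:

```python
import numpy as np

rng = np.random.default_rng(1)

def sgld_step(theta, grad_log_post, step=1e-2):
    # SGLD update: theta <- theta + (step/2) * grad log p(theta) + N(0, step*I).
    noise = rng.normal(scale=np.sqrt(step), size=theta.shape)
    return theta + 0.5 * step * grad_log_post(theta) + noise

# Toy target: standard Gaussian posterior, so grad log p(theta) = -theta.
theta = np.zeros(2)
samples = []
for _ in range(20000):
    theta = sgld_step(theta, lambda t: -t)
    samples.append(theta.copy())
post = np.array(samples[2000:])  # drop burn-in
```

For small step sizes the chain's empirical mean and spread match the target posterior, which is what makes SGLD usable as a scalable sampler.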

IDEAL: Inexact DEcentralized Accelerated Augmented Lagrangian Method

no code implementations · NeurIPS 2020 · Yossi Arjevani, Joan Bruna, Bugra Can, Mert Gürbüzbalaban, Stefanie Jegelka, Hongzhou Lin

We introduce a framework for designing primal methods under the decentralized optimization setting where local functions are smooth and strongly convex.

A Stochastic Subgradient Method for Distributionally Robust Non-Convex Learning

no code implementations · 8 Jun 2020 · Mert Gürbüzbalaban, Andrzej Ruszczyński, Landi Zhu

We consider a distributionally robust formulation of stochastic optimization problems arising in statistical learning, where robustness is with respect to uncertainty in the underlying data distribution.

Stochastic Optimization
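A stochastic subgradient method handles exactly the kind of nonsmooth objectives that arise in such formulations. The sketch below minimizes a simple nonsmooth expectation, E|x - Z| with Z standard Gaussian (minimized at the median, 0); the objective and diminishing step schedule are illustrative assumptions, not the paper's distributionally robust formulation:

```python
import numpy as np

rng = np.random.default_rng(3)

def stochastic_subgradient(x0, lr0=0.5, steps=2000):
    # Minimize f(x) = E|x - Z|, Z ~ N(0,1), using one sampled subgradient
    # per step: a subgradient of |x - z| is sign(x - z).
    x = x0
    for k in range(1, steps + 1):
        z = rng.normal()
        x = x - (lr0 / np.sqrt(k)) * np.sign(x - z)
    return x

x_hat = stochastic_subgradient(4.0)
```

The 1/sqrt(k) step schedule is the standard choice for subgradient methods: steps shrink slowly enough to reach the minimizer but fast enough to damp the noise.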

On the Heavy-Tailed Theory of Stochastic Gradient Descent for Deep Neural Networks

no code implementations · 29 Nov 2019 · Umut Şimşekli, Mert Gürbüzbalaban, Thanh Huy Nguyen, Gaël Richard, Levent Sagun

The Gaussian gradient-noise assumption is often made for mathematical convenience, since it enables SGD to be analyzed as a stochastic differential equation (SDE) driven by a Brownian motion.
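The SDE view can be sketched with a Euler–Maruyama discretization under the Gaussian (Brownian) noise assumption, which is precisely the assumption this paper challenges. The objective, step size, and diffusion coefficient below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

def euler_maruyama(grad_f, x0, eta=0.01, sigma=0.5, steps=1000):
    # Discretize dX_t = -grad f(X_t) dt + sigma dB_t: this is the SDE that
    # models SGD when the gradient noise is assumed Gaussian. Each step adds
    # a Brownian increment of standard deviation sigma * sqrt(eta).
    x = x0
    for _ in range(steps):
        x = x - eta * grad_f(x) + sigma * np.sqrt(eta) * rng.normal()
    return x

x_end = euler_maruyama(lambda x: x, 3.0)  # grad of f(x) = x^2 / 2
```

Replacing the Brownian increments with heavy-tailed (alpha-stable) ones yields the Lévy-driven SDE at the heart of the heavy-tailed theory.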

First Exit Time Analysis of Stochastic Gradient Descent Under Heavy-Tailed Gradient Noise

1 code implementation · NeurIPS 2019 · Thanh Huy Nguyen, Umut Şimşekli, Mert Gürbüzbalaban, Gaël Richard

We show that the behaviors of the two systems are indeed similar for small step-sizes and we identify how the error depends on the algorithm and problem parameters.

Global Convergence of Stochastic Gradient Hamiltonian Monte Carlo for Non-Convex Stochastic Optimization: Non-Asymptotic Performance Bounds and Momentum-Based Acceleration

no code implementations · 12 Sep 2018 · Xuefeng Gao, Mert Gürbüzbalaban, Lingjiong Zhu

We provide finite-time performance bounds for the global convergence of both SGHMC variants for solving stochastic non-convex optimization problems with explicit constants.

Stochastic Optimization
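The basic SGHMC iteration augments SGLD with a momentum variable: the momentum is updated with the gradient force, a friction term, and injected Gaussian noise, then the position follows the momentum. The toy standard-Gaussian target, step size, and friction value below are illustrative assumptions, not the paper's momentum-based accelerated variant:

```python
import numpy as np

rng = np.random.default_rng(4)

def sghmc_step(theta, v, grad_log_post, step=1e-2, friction=1.0):
    # SGHMC momentum update (an underdamped Langevin discretization):
    # gradient force + friction + injected noise, then a position update.
    noise = rng.normal(scale=np.sqrt(2.0 * friction * step), size=v.shape)
    v = v + step * grad_log_post(theta) - step * friction * v + noise
    theta = theta + step * v
    return theta, v

# Toy target: standard Gaussian, so grad log p(theta) = -theta.
theta, v = np.zeros(1), np.zeros(1)
draws = []
for _ in range(20000):
    theta, v = sghmc_step(theta, v, lambda t: -t)
    draws.append(theta[0])
post = np.array(draws[5000:])  # drop burn-in
```

The friction and noise scales are coupled (noise variance 2 * friction * step) so that the chain's stationary distribution approximates the target, which is the property the paper's non-asymptotic bounds quantify.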

Surpassing Gradient Descent Provably: A Cyclic Incremental Method with Linear Convergence Rate

no code implementations · 1 Nov 2016 · Aryan Mokhtari, Mert Gürbüzbalaban, Alejandro Ribeiro

We prove not only that the proposed DIAG method converges linearly to the optimal solution, but also that its linear convergence factor justifies the advantage of incremental methods over GD.
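The contrast with full-batch GD can be illustrated with a generic cyclic incremental gradient loop, which updates the iterate after each component gradient rather than after a full pass. The least-squares objective and step size below are illustrative assumptions, not the DIAG method itself:

```python
import numpy as np

def cyclic_incremental_gradient(A, b, lr=0.1, epochs=50):
    # Minimize (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2 by cycling through the
    # component functions one at a time, updating x after each component
    # gradient (incremental), instead of once per full pass (batch GD).
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(epochs):
        for i in range(n):
            grad_i = (A[i] @ x - b[i]) * A[i]
            x = x - lr * grad_i
    return x

# Consistent system with exact solution x* = [1, 2].
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])
x_star = cyclic_incremental_gradient(A, b)
```

Because the iterate moves n times per pass over the data, incremental methods can make faster per-epoch progress than GD; DIAG's contribution is a convergence factor that provably beats GD's.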
