no code implementations • 4 Mar 2024 • Umut Şimşekli, Mert Gürbüzbalaban, Sinan Yildirim, Lingjiong Zhu
Injecting heavy-tailed noise into the iterates of stochastic gradient descent (SGD) has received increasing attention over the past few years.
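For intuition, here is a minimal NumPy sketch of the idea (an illustration, not the paper's specific scheme): plain SGD iterates perturbed by symmetric alpha-stable noise generated with the Chambers-Mallows-Stuck method. The gradient oracle `grad`, stepsize `eta`, stability index `alpha`, and noise scale are placeholders chosen for the example.

```python
import numpy as np

def alpha_stable(alpha, shape, rng):
    """Symmetric alpha-stable samples via the Chambers-Mallows-Stuck method."""
    u = rng.uniform(-np.pi / 2, np.pi / 2, shape)
    w = rng.exponential(1.0, shape)
    return (np.sin(alpha * u) / np.cos(u) ** (1 / alpha)
            * (np.cos((1 - alpha) * u) / w) ** ((1 - alpha) / alpha))

def sgd_heavy_tailed(grad, x0, eta=1e-2, alpha=1.8, scale=1e-3, n_iters=1000, seed=0):
    """Plain SGD with heavy-tailed (alpha-stable) noise injected into each iterate."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(n_iters):
        x = x - eta * grad(x) + scale * alpha_stable(alpha, x.shape, rng)
    return x

# Example: quadratic objective f(x) = 0.5 * ||x||^2, so grad(x) = x.
x_final = sgd_heavy_tailed(lambda x: x, x0=np.ones(5))
```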
no code implementations • 10 Feb 2023 • Mert Gürbüzbalaban, Yuanhan Hu, Umut Şimşekli, Lingjiong Zhu
Our results bring a new understanding of the benefits of cyclic and randomized stepsizes compared to constant stepsize in terms of the tail behavior.
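As a point of reference (generic illustrations, not the paper's exact constructions), cyclic and randomized stepsize schedules can be sketched as follows; `eta_min`, `eta_max`, and `cycle_len` are hypothetical parameters.

```python
import numpy as np

def cyclic_stepsize(t, eta_min=1e-3, eta_max=1e-1, cycle_len=100):
    """Triangular cyclic schedule: sweeps from eta_min up to eta_max and back."""
    phase = (t % cycle_len) / cycle_len          # position within the cycle, in [0, 1)
    return eta_min + (eta_max - eta_min) * (1.0 - abs(2.0 * phase - 1.0))

def randomized_stepsize(rng, eta_min=1e-3, eta_max=1e-1):
    """Stepsize drawn uniformly at random from [eta_min, eta_max] at every iteration."""
    return rng.uniform(eta_min, eta_max)
```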
no code implementations • 27 Jan 2023 • Anant Raj, Lingjiong Zhu, Mert Gürbüzbalaban, Umut Şimşekli
Very recently, new generalization bounds have been proven, indicating a non-monotonic relationship between the generalization error and heavy tails, a relationship that is more consistent with the reported empirical observations.
no code implementations • 29 Nov 2022 • Mert Gürbüzbalaban, Yuanhan Hu, Lingjiong Zhu
When $f$ is smooth and gradients are available, we get $\tilde{\mathcal{O}}(d/\varepsilon^{10})$ iteration complexity for PLD to sample the target up to an $\varepsilon$-error where the error is measured in the TV distance and $\tilde{\mathcal{O}}(\cdot)$ hides logarithmic factors.
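For context, guarantees of this kind build on Langevin-type updates; below is a minimal sketch of the vanilla unadjusted Langevin algorithm (ULA), not the paper's PLD variant, with `grad_f`, `eta`, and the iteration count as assumed inputs.

```python
import numpy as np

def ula(grad_f, x0, eta=1e-3, n_iters=10_000, seed=0):
    """Unadjusted Langevin algorithm:
    x_{k+1} = x_k - eta * grad f(x_k) + sqrt(2 * eta) * N(0, I)."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(n_iters):
        x = x - eta * grad_f(x) + np.sqrt(2.0 * eta) * rng.standard_normal(x.shape)
    return x

# Example: f(x) = 0.5 * ||x||^2, so the target exp(-f) is a standard Gaussian.
sample = ula(lambda x: x, x0=np.zeros(5))
```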
no code implementations • 2 Jun 2022 • Anant Raj, Melih Barsbey, Mert Gürbüzbalaban, Lingjiong Zhu, Umut Şimşekli
Recent studies have shown that heavy tails can emerge in stochastic optimization and that the heaviness of the tails is linked to the generalization error.
no code implementations • NeurIPS 2021 • Alexander Camuto, George Deligiannidis, Murat A. Erdogdu, Mert Gürbüzbalaban, Umut Şimşekli, Lingjiong Zhu
As our main contribution, we prove that the generalization error of a stochastic optimization algorithm can be bounded based on the 'complexity' of the fractal structure that underlies its invariant measure.
1 code implementation • 7 Jun 2021 • Saeed Soori, Bugra Can, Baourun Mu, Mert Gürbüzbalaban, Maryam Mehri Dehnavi
This work proposes a time-efficient Natural Gradient Descent method, called TENGraD, with linear convergence guarantees.
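The preconditioning idea behind natural gradient methods can be sketched in a few lines (this is the generic damped NGD step, not TENGraD's time-efficient update); `grad`, `fisher`, and `damping` are placeholders for the example.

```python
import numpy as np

def natural_gradient_step(grad, fisher, theta, eta=1e-1, damping=1e-3):
    """One natural gradient descent step: precondition the gradient with the
    (damped) inverse Fisher information matrix."""
    F = fisher(theta) + damping * np.eye(theta.size)    # damped Fisher matrix
    return theta - eta * np.linalg.solve(F, grad(theta))
```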
no code implementations • NeurIPS 2021 • Hongjian Wang, Mert Gürbüzbalaban, Lingjiong Zhu, Umut Şimşekli, Murat A. Erdogdu
In this paper, we provide convergence guarantees for SGD under state-dependent and heavy-tailed noise with potentially infinite variance, for a class of strongly convex objectives.
1 code implementation • 13 Feb 2021 • Alexander Camuto, Xiaoyu Wang, Lingjiong Zhu, Chris Holmes, Mert Gürbüzbalaban, Umut Şimşekli
In this paper, we focus on the so-called 'implicit effect' of Gaussian noise injections (GNIs), which is the effect of the injected noise on the dynamics of SGD.
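As a concrete illustration of a GNI, noise can be added to the hidden activations of a network during the forward pass; the two-layer network, noise level `sigma`, and weight matrices below are assumptions made for the sketch, not the paper's setup.

```python
import numpy as np

def noisy_forward(x, W1, W2, sigma=0.1, rng=None):
    """Forward pass of a tiny two-layer network with Gaussian noise injected
    into the hidden activations (one common form of GNI)."""
    if rng is None:
        rng = np.random.default_rng()
    h = np.tanh(x @ W1)
    h = h + sigma * rng.standard_normal(h.shape)   # Gaussian noise injection
    return h @ W2
```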
no code implementations • 1 Jul 2020 • Mert Gürbüzbalaban, Xuefeng Gao, Yuanhan Hu, Lingjiong Zhu
Stochastic gradient Langevin dynamics (SGLD) and stochastic gradient Hamiltonian Monte Carlo (SGHMC) are two popular Markov chain Monte Carlo (MCMC) algorithms for Bayesian inference that can scale to large datasets, allowing one to sample from the posterior distribution of the parameters of a statistical model given the input data and the prior distribution over the model parameters.
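A minimal sketch of the SGLD update, assuming `grad_est(theta, batch)` returns an unbiased minibatch estimate of the gradient of the negative log-posterior and `data` is an array of observations (both placeholders for the example):

```python
import numpy as np

def sgld(grad_est, theta0, data, eta=1e-4, n_epochs=10, batch_size=32, seed=0):
    """Stochastic gradient Langevin dynamics: a Langevin step driven by a
    minibatch gradient estimate instead of the full-data gradient."""
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    n = len(data)
    for _ in range(n_epochs):
        for idx in np.array_split(rng.permutation(n), max(n // batch_size, 1)):
            g = grad_est(theta, data[idx])                      # unbiased gradient estimate
            noise = np.sqrt(eta) * rng.standard_normal(theta.shape)
            theta = theta - 0.5 * eta * g + noise
    return theta
```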
no code implementations • NeurIPS 2020 • Yossi Arjevani, Joan Bruna, Bugra Can, Mert Gürbüzbalaban, Stefanie Jegelka, Hongzhou Lin
We introduce a framework for designing primal methods in the decentralized optimization setting, where local functions are smooth and strongly convex.
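For orientation, the classical baseline in this setting is decentralized gradient descent, sketched below (the standard baseline, not the framework introduced in the paper); the mixing matrix `W` is assumed to be doubly stochastic and `grads` holds each node's local gradient oracle.

```python
import numpy as np

def decentralized_gd(grads, W, X0, eta=1e-2, n_iters=500):
    """Decentralized gradient descent: each node averages with its neighbors
    through the mixing matrix W, then takes a local gradient step."""
    X = np.array(X0, dtype=float)                       # row i is node i's local iterate
    for _ in range(n_iters):
        G = np.array([g(x) for g, x in zip(grads, X)])  # local gradients
        X = W @ X - eta * G
    return X
```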
no code implementations • 8 Jun 2020 • Mert Gürbüzbalaban, Andrzej Ruszczyński, Landi Zhu
We consider a distributionally robust formulation of stochastic optimization problems arising in statistical learning, where robustness is with respect to uncertainty in the underlying data distribution.
1 code implementation • ICML 2020 • Umut Şimşekli, Lingjiong Zhu, Yee Whye Teh, Mert Gürbüzbalaban
Stochastic gradient descent with momentum (SGDm) is one of the most popular optimization algorithms in deep learning.
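One standard heavy-ball form of SGDm can be written in a few lines (an illustrative sketch, not necessarily the exact parametrization analyzed in the paper); `grad` stands for a possibly stochastic gradient oracle and the momentum parameter `beta` is arbitrary.

```python
import numpy as np

def sgdm(grad, x0, eta=1e-2, beta=0.9, n_iters=1000):
    """SGD with (heavy-ball) momentum:
    v_{k+1} = beta * v_k - eta * grad(x_k);  x_{k+1} = x_k + v_{k+1}."""
    x = np.array(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(n_iters):
        v = beta * v - eta * grad(x)
        x = x + v
    return x
```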
no code implementations • 29 Nov 2019 • Umut Şimşekli, Mert Gürbüzbalaban, Thanh Huy Nguyen, Gaël Richard, Levent Sagun
This assumption is often made for mathematical convenience, since it enables SGD to be analyzed as a stochastic differential equation (SDE) driven by a Brownian motion.
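The Brownian-driven SDE proxy for SGD is typically simulated with an Euler-Maruyama discretization; a minimal sketch, with the drift `grad_f`, diffusion coefficient `sigma`, and time step `dt` chosen only for illustration:

```python
import numpy as np

def euler_maruyama(grad_f, x0, sigma=0.1, dt=1e-3, n_steps=10_000, seed=0):
    """Euler-Maruyama discretization of dX_t = -grad f(X_t) dt + sigma dB_t,
    the Brownian-motion-driven SDE often used as a continuous-time model of SGD."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = x - dt * grad_f(x) + sigma * np.sqrt(dt) * rng.standard_normal(x.shape)
    return x
```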
1 code implementation • NeurIPS 2019 • Thanh Huy Nguyen, Umut Şimşekli, Mert Gürbüzbalaban, Gaël Richard
We show that the behaviors of the two systems are indeed similar for small step-sizes, and we identify how the error depends on the algorithm and problem parameters.
no code implementations • 12 Sep 2018 • Xuefeng Gao, Mert Gürbüzbalaban, Lingjiong Zhu
We provide finite-time performance bounds for the global convergence of both SGHMC variants for solving stochastic non-convex optimization problems with explicit constants.
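For reference, a simplified SGHMC update (omitting the gradient-noise correction term of the full algorithm) looks as follows; `grad_est`, the friction `alpha`, and the stepsize `eta` are placeholders for the example.

```python
import numpy as np

def sghmc(grad_est, theta0, data, eta=1e-4, alpha=0.1, n_epochs=10, batch_size=32, seed=0):
    """Simplified SGHMC: theta <- theta + v;
    v <- (1 - alpha) * v - eta * grad + N(0, 2 * alpha * eta)."""
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    v = np.zeros_like(theta)
    n = len(data)
    for _ in range(n_epochs):
        for idx in np.array_split(rng.permutation(n), max(n // batch_size, 1)):
            theta = theta + v
            g = grad_est(theta, data[idx])              # minibatch gradient estimate
            v = ((1.0 - alpha) * v - eta * g
                 + np.sqrt(2.0 * alpha * eta) * rng.standard_normal(theta.shape))
    return theta
```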
no code implementations • 1 Nov 2016 • Aryan Mokhtari, Mert Gürbüzbalaban, Alejandro Ribeiro
We prove not only that the proposed DIAG method converges linearly to the optimal solution, but also that its linear convergence factor justifies the advantage of incremental methods over GD.
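The generic incremental aggregated gradient template behind such methods (a sketch of the classical IAG idea, not DIAG itself) keeps the most recent gradient of every component and steps along their average; the component gradients `grads` and stepsize `eta` are assumed for the example.

```python
import numpy as np

def iag(grads, x0, eta=1e-2, n_iters=1000, seed=0):
    """Incremental aggregated gradient: store the latest gradient of each
    component f_i and move along the average of the stored gradients."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    table = np.array([g(x) for g in grads])   # memory of component gradients
    for _ in range(n_iters):
        i = rng.integers(len(grads))          # refresh one component's gradient
        table[i] = grads[i](x)
        x = x - eta * table.mean(axis=0)
    return x

# Example: f_i(x) = 0.5 * ||x - a_i||^2, so grad_i(x) = x - a_i and the
# minimizer of the sum is the mean of the anchors a_i.
anchors = [np.full(3, c) for c in (0.0, 1.0, 2.0)]
x_opt = iag([lambda x, a=a: x - a for a in anchors], x0=np.zeros(3))
```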