Search Results for author: Yuanhan Hu

Found 6 papers, 0 papers with code

Cyclic and Randomized Stepsizes Invoke Heavier Tails in SGD than Constant Stepsize

no code implementations • 10 Feb 2023 • Mert Gürbüzbalaban, Yuanhan Hu, Umut Şimşekli, Lingjiong Zhu

Our results bring a new understanding of the benefits of cyclic and randomized stepsizes over a constant stepsize in terms of the tail behavior of the SGD iterates.

Scheduling
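To make the contrast concrete, here is a minimal sketch (not the authors' code; the schedule values, noise level, and the quadratic objective are illustrative assumptions) comparing constant, cyclic, and i.i.d.-random stepsize schedules for SGD on a one-dimensional quadratic, using extreme quantiles of the iterates as a crude proxy for tail heaviness:

```python
# Minimal sketch (illustrative parameters, not the authors' code) of the three
# stepsize policies compared in the paper, run on SGD for f(x) = x^2 / 2.
import numpy as np

rng = np.random.default_rng(0)

def stepsize(k, policy, base_lr=0.1):
    if policy == "constant":
        return base_lr
    if policy == "cyclic":                    # cycle deterministically through a fixed list
        cycle = [0.5 * base_lr, base_lr, 1.5 * base_lr]
        return cycle[k % len(cycle)]
    if policy == "random":                    # i.i.d. uniform draws around base_lr
        return rng.uniform(0.5 * base_lr, 1.5 * base_lr)
    raise ValueError(policy)

def run_sgd(policy, n_iter=500, noise=1.0):
    """SGD on f(x) = x^2 / 2 with additive Gaussian gradient noise."""
    x = 0.0
    for k in range(n_iter):
        grad = x + noise * rng.standard_normal()   # stochastic gradient
        x -= stepsize(k, policy) * grad
    return x

# Extreme quantiles of the final iterates as a crude proxy for tail heaviness.
for policy in ("constant", "cyclic", "random"):
    finals = np.array([run_sgd(policy) for _ in range(2000)])
    print(policy, np.quantile(np.abs(finals), 0.999))
```

Under settings like these, the cyclic and random schedules would typically produce larger extreme quantiles than the constant one, consistent with the heavier-tail phenomenon the paper analyzes.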

Penalized Overdamped and Underdamped Langevin Monte Carlo Algorithms for Constrained Sampling

no code implementations • 29 Nov 2022 • Mert Gürbüzbalaban, Yuanhan Hu, Lingjiong Zhu

When $f$ is smooth and gradients are available, we obtain an iteration complexity of $\tilde{\mathcal{O}}(d/\varepsilon^{10})$ for PLD to sample the target up to an $\varepsilon$-error, where the error is measured in the total variation (TV) distance and $\tilde{\mathcal{O}}(\cdot)$ hides logarithmic factors.
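As a rough illustration of the penalized approach, the sketch below (a Gaussian-like target, a unit-ball constraint, and the quadratic distance penalty are all assumptions of this example) runs the overdamped recursion with the hard constraint replaced by a penalty term $\frac{\delta}{2}\,\mathrm{dist}(x, C)^2$ added to $f$:

```python
# Minimal sketch of a penalized overdamped Langevin (PLD-style) recursion.
# Assumptions: target ~ exp(-f) with f(x) = ||x||^2 / 2, unit-ball constraint,
# quadratic distance penalty with strength delta.
import numpy as np

rng = np.random.default_rng(0)

def grad_f(x):                    # gradient of f(x) = ||x||^2 / 2
    return x

def grad_penalty(x):              # gradient of dist(x, B)^2 / 2 for the unit ball B
    norm = np.linalg.norm(x)
    return x - x / norm if norm > 1.0 else np.zeros_like(x)

def pld(x0, eta=1e-3, delta=100.0, n_iter=50_000):
    x = x0.copy()
    for _ in range(n_iter):
        drift = grad_f(x) + delta * grad_penalty(x)
        x = x - eta * drift + np.sqrt(2 * eta) * rng.standard_normal(x.shape)
    return x

print(np.linalg.norm(pld(np.zeros(5))))   # final iterate stays on or near the unit ball
```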

Heavy-Tail Phenomenon in Decentralized SGD

no code implementations • 13 May 2022 • Mert Gurbuzbalaban, Yuanhan Hu, Umut Simsekli, Kun Yuan, Lingjiong Zhu

To obtain more explicit control over the tail exponent, we then consider the case where the loss at each node is quadratic, and show that the tail index can be estimated as a function of the step size, the batch size, and the topological properties of the network of computational nodes.

Stochastic Optimization
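For intuition, here is a minimal sketch of the decentralized SGD recursion the paper studies (the ring topology, mixing weights, and quadratic per-node loss are illustrative assumptions): each node averages with its neighbors through a doubly stochastic matrix $W$, then takes a local stochastic gradient step.

```python
# Minimal sketch of decentralized SGD with a ring topology (illustrative).
import numpy as np

rng = np.random.default_rng(0)
n_nodes, dim, eta = 5, 3, 0.1

# Doubly stochastic mixing matrix: equal weight on self and ring neighbors.
W = np.zeros((n_nodes, n_nodes))
for i in range(n_nodes):
    W[i, [i, (i - 1) % n_nodes, (i + 1) % n_nodes]] = 1.0 / 3.0

X = rng.standard_normal((n_nodes, dim))          # row i holds node i's iterate

for _ in range(1000):
    X = W @ X                                    # gossip / consensus step
    grads = X + rng.standard_normal(X.shape)     # noisy gradient of ||x||^2 / 2 per node
    X -= eta * grads                             # local SGD step

print(X)                                         # node iterates fluctuate around 0
```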

Decentralized Stochastic Gradient Langevin Dynamics and Hamiltonian Monte Carlo

no code implementations • 1 Jul 2020 • Mert Gürbüzbalaban, Xuefeng Gao, Yuanhan Hu, Lingjiong Zhu

Stochastic gradient Langevin dynamics (SGLD) and stochastic gradient Hamiltonian Monte Carlo (SGHMC) are two popular Markov chain Monte Carlo (MCMC) algorithms for Bayesian inference that scale to large datasets, allowing one to sample from the posterior distribution of the parameters of a statistical model given the input data and the prior distribution over the model parameters.

Bayesian Inference, Regression
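The decentralized SGLD recursion has the same gossip-then-gradient structure as decentralized SGD, plus injected Gaussian noise so that nodes sample rather than optimize. A minimal sketch (ring topology, standard-normal target, and the noise scaling are assumptions of this example; see the paper for the exact scaling and guarantees):

```python
# Minimal sketch of a decentralized SGLD iteration (illustrative parameters).
import numpy as np

rng = np.random.default_rng(1)
n_nodes, dim, eta = 5, 2, 0.05

W = np.zeros((n_nodes, n_nodes))                 # ring-topology mixing matrix
for i in range(n_nodes):
    W[i, [i, (i - 1) % n_nodes, (i + 1) % n_nodes]] = 1.0 / 3.0

def grad_neg_log_post(x):                        # -grad log posterior; standard normal here
    return x

X = rng.standard_normal((n_nodes, dim))
for _ in range(20_000):
    noise = np.sqrt(2 * eta) * rng.standard_normal(X.shape)
    X = W @ X - eta * grad_neg_log_post(X) + noise   # gossip + gradient + injected noise

print(X.mean(axis=0), X.std())                   # consensus statistics across nodes
```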

Fractional moment-preserving initialization schemes for training deep neural networks

no code implementations • 25 May 2020 • Mert Gurbuzbalaban, Yuanhan Hu

We prove that the logarithm of the norm of the network outputs, if properly scaled, converges to a Gaussian distribution with an explicit mean and variance that we can compute, depending on the activation used, the chosen value of $s$, and the network width.
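A rough sketch of the idea (the numerical calibration below is a hypothetical stand-in for the paper's closed-form, activation-dependent scale): choose the Gaussian weight scale $\sigma$ so that the $s$-th moment of the layer output norm matches that of the input, exploiting the positive homogeneity of ReLU; at $s = 2$ this recovers the He-initialization scale $\sqrt{2/n}$ for fan-in $n$.

```python
# Minimal sketch: calibrate sigma so E||relu(W x)||^s = E||x||^s for Gaussian
# inputs (a hypothetical stand-in for the paper's explicit formula).
import numpy as np

rng = np.random.default_rng(0)

def calibrate_sigma(fan_in, fan_out, s, n_mc=20_000):
    x = rng.standard_normal((n_mc, fan_in))
    W0 = rng.standard_normal((fan_in, fan_out))          # unit-scale weights
    target = np.mean(np.linalg.norm(x, axis=1) ** s)
    base = np.mean(np.linalg.norm(np.maximum(x @ W0, 0.0), axis=1) ** s)
    # ReLU is positively homogeneous, so E||out||^s = sigma^s * base; solve for sigma.
    return (target / base) ** (1.0 / s)

sigma = calibrate_sigma(fan_in=256, fan_out=256, s=1.0)
print(sigma, np.sqrt(2 / 256))    # close to the He-init scale at equal widths
```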

Non-Convex Optimization via Non-Reversible Stochastic Gradient Langevin Dynamics

no code implementations • 6 Apr 2020 • Yuanhan Hu, Xiaoyu Wang, Xuefeng Gao, Mert Gurbuzbalaban, Lingjiong Zhu

In this paper, we study non-reversible Stochastic Gradient Langevin Dynamics (NSGLD), which is based on the discretization of the non-reversible Langevin diffusion.

Stochastic Optimization
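One NSGLD recursion can be sketched as follows (the 2-D quadratic objective, noise level, and the particular antisymmetric matrix $J$ are illustrative assumptions): the usual SGLD drift is skewed by $(I + J)$ with $J^\top = -J$, which breaks the reversibility of the underlying diffusion without changing its stationary distribution.

```python
# Minimal sketch of non-reversible SGLD (illustrative parameters).
import numpy as np

rng = np.random.default_rng(0)
eta, beta = 1e-2, 1.0
J = np.array([[0.0, 1.0],
              [-1.0, 0.0]])                      # antisymmetric: J^T = -J
A = np.eye(2) + J

def stoch_grad(x):                               # noisy gradient of f(x) = ||x||^2 / 2
    return x + 0.1 * rng.standard_normal(2)

x = np.zeros(2)
for _ in range(50_000):
    x = x - eta * (A @ stoch_grad(x)) + np.sqrt(2 * eta / beta) * rng.standard_normal(2)

print(x)   # approximately a draw from exp(-beta * f), as with reversible SGLD
```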
