Sampling from the Mean-Field Stationary Distribution

no code implementations 12 Feb 2024 Yunbum Kook, Matthew S. Zhang, Sinho Chewi, Murat A. Erdogdu, Mufan Bill Li

We study the complexity of sampling from the stationary distribution of a mean-field SDE, or equivalently, the complexity of minimizing a functional over the space of probability measures which includes an interaction term.

Differential Equation Scaling Limits of Shaped and Unshaped Neural Networks

no code implementations 18 Oct 2023 Mufan Bill Li, Mihai Nica

Secondly, for an unshaped MLP at initialization, we derive the first-order asymptotic correction to the layerwise correlation.

Depthwise Hyperparameter Transfer in Residual Networks: Dynamics and Scaling Limit

no code implementations 28 Sep 2023 Blake Bordelon, Lorenzo Noci, Mufan Bill Li, Boris Hanin, Cengiz Pehlevan

We provide experiments demonstrating that residual architectures, including convolutional ResNets and Vision Transformers, trained with this parameterization exhibit transfer of optimal hyperparameters across width and depth on CIFAR-10 and ImageNet.

The Shaped Transformer: Attention Models in the Infinite Depth-and-Width Limit

no code implementations NeurIPS 2023 Lorenzo Noci, Chuning Li, Mufan Bill Li, Bobby He, Thomas Hofmann, Chris Maddison, Daniel M. Roy

Motivated by the success of Transformers, we study the covariance matrix of a modified Softmax-based attention model with skip connections in the proportional limit of infinite-depth-and-width.

Deep Attention Learning Theory

Improved Discretization Analysis for Underdamped Langevin Monte Carlo

no code implementations 16 Feb 2023 Matthew Zhang, Sinho Chewi, Mufan Bill Li, Krishnakumar Balasubramanian, Murat A. Erdogdu

As a byproduct, we also obtain the first KL divergence guarantees for ULMC without Hessian smoothness under strong log-concavity, which is based on a new result on the log-Sobolev constant along the underdamped Langevin diffusion.

Analysis of Langevin Monte Carlo from Poincaré to Log-Sobolev

no code implementations 23 Dec 2021 Sinho Chewi, Murat A. Erdogdu, Mufan Bill Li, Ruoqi Shen, Matthew Zhang

Classically, the continuous-time Langevin diffusion converges exponentially fast to its stationary distribution $\pi$ under the sole assumption that $\pi$ satisfies a Poincaré inequality.

The Future is Log-Gaussian: ResNets and Their Infinite-Depth-and-Width Limit at Initialization

no code implementations NeurIPS 2021 Mufan Bill Li, Mihai Nica, Daniel M. Roy

To provide a better approximation, we study ReLU ResNets in the infinite-depth-and-width limit, where both depth and width tend to infinity as their ratio, $d/n$, remains constant.

Gaussian Processes

Higher Order Generalization Error for First Order Discretization of Langevin Diffusion

no code implementations 11 Feb 2021 Mufan Bill Li, Maxime Gazeau

We propose a novel approach to analyze generalization error for discretizations of Langevin diffusion, such as stochastic gradient Langevin dynamics (SGLD).

Riemannian Langevin Algorithm for Solving Semidefinite Programs

no code implementations 21 Oct 2020 Mufan Bill Li, Murat A. Erdogdu

We propose a Langevin diffusion-based algorithm for non-convex optimization and sampling on a product manifold of spheres.

