12 Feb 2024 • Yunbum Kook, Matthew S. Zhang, Sinho Chewi, Murat A. Erdogdu, Mufan Bill Li
We study the complexity of sampling from the stationary distribution of a mean-field SDE, or equivalently, the complexity of minimizing a functional over the space of probability measures which includes an interaction term.
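To fix ideas, a minimal sketch of how such a stationary distribution is typically approximated in practice: a finite interacting-particle Euler–Maruyama discretization of the mean-field SDE. The potentials here (quadratic confinement `V` and quadratic pairwise interaction `W`) are illustrative choices, not the paper's setting.

```python
import numpy as np

def mean_field_langevin(n_particles=500, n_steps=5000, step=1e-2, eps=0.5, seed=0):
    """Euler-Maruyama discretization of the interacting-particle system
        dX^i = -( grad V(X^i) + (1/N) sum_j grad W(X^i - X^j) ) dt + sqrt(2) dB^i
    with the illustrative choices V(x) = x^2/2 and W(x) = eps * x^2/2, for
    which the interaction drift on particle i reduces to eps * (X^i - mean(X)).
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_particles)
    for _ in range(n_steps):
        drift = -(x + eps * (x - x.mean()))
        x = x + step * drift + np.sqrt(2 * step) * rng.standard_normal(n_particles)
    return x
```

As the number of particles grows, the empirical measure of the particles approximates the mean-field stationary distribution; the complexity question the paper studies is how many particles and steps this kind of scheme actually needs.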
18 Oct 2023 • Mufan Bill Li, Mihai Nica
Second, for an unshaped MLP at initialization, we derive the first-order asymptotic correction to the layerwise correlation.
28 Sep 2023 • Blake Bordelon, Lorenzo Noci, Mufan Bill Li, Boris Hanin, Cengiz Pehlevan
We provide experiments demonstrating that residual architectures including convolutional ResNets and Vision Transformers trained with this parameterization exhibit transfer of optimal hyperparameters across width and depth on CIFAR-10 and ImageNet.
NeurIPS 2023 • Lorenzo Noci, Chuning Li, Mufan Bill Li, Bobby He, Thomas Hofmann, Chris Maddison, Daniel M. Roy
Motivated by the success of Transformers, we study the covariance matrix of a modified Softmax-based attention model with skip connections in the proportional limit of infinite-depth-and-width.
16 Feb 2023 • Matthew Zhang, Sinho Chewi, Mufan Bill Li, Krishnakumar Balasubramanian, Murat A. Erdogdu
As a byproduct, we also obtain the first KL divergence guarantees for ULMC without Hessian smoothness under strong log-concavity, based on a new result on the log-Sobolev constant along the underdamped Langevin diffusion.
6 Jun 2022 • Mufan Bill Li, Mihai Nica, Daniel M. Roy
In this work, we study the distribution of this random matrix.
23 Dec 2021 • Sinho Chewi, Murat A. Erdogdu, Mufan Bill Li, Ruoqi Shen, Matthew Zhang
Classically, the continuous-time Langevin diffusion converges exponentially fast to its stationary distribution $\pi$ under the sole assumption that $\pi$ satisfies a Poincaré inequality.
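The discrete-time algorithm whose complexity this line of work analyzes is the simple Euler–Maruyama discretization of that diffusion, the unadjusted Langevin algorithm (ULA). A minimal sketch, with a standard Gaussian as an illustrative target rather than anything from the paper:

```python
import numpy as np

def ula(grad_log_pi, x0, step=1e-2, n_steps=20000, seed=0):
    """Unadjusted Langevin algorithm: Euler-Maruyama discretization of the
    Langevin diffusion dX_t = grad log pi(X_t) dt + sqrt(2) dB_t, whose
    continuous-time stationary distribution is pi."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    samples = np.empty((n_steps,) + x.shape)
    for t in range(n_steps):
        x = x + step * grad_log_pi(x) + np.sqrt(2 * step) * rng.standard_normal(x.shape)
        samples[t] = x
    return samples

# Target pi = N(0, 1), so grad log pi(x) = -x; start far from the mode.
chain = ula(lambda x: -x, x0=np.array([5.0]))
```

The discretization introduces an $O(\text{step})$ bias in the stationary distribution, which is one reason the step size enters the complexity bounds.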
NeurIPS 2021 • Mufan Bill Li, Mihai Nica, Daniel M. Roy
To provide a better approximation, we study ReLU ResNets in the infinite-depth-and-width limit, where both depth and width tend to infinity as their ratio, $d/n$, remains constant.
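The phenomenon is easy to observe numerically: with depth and width comparable, the output norm of a randomly initialized ReLU ResNet stays genuinely random rather than concentrating, as it would at fixed depth and infinite width. The following simulation is an illustrative sketch with He-style initialization, not the paper's exact setup.

```python
import numpy as np

def log_norm_ratio(depth, width, seed):
    """One forward pass of a random ReLU ResNet x_{l+1} = x_l + W_l relu(x_l),
    with W_l entries drawn i.i.d. N(0, 2/width); returns the log of the
    squared output norm relative to the squared input norm."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(width)
    log_ratio = -np.log(x @ x)
    for _ in range(depth):
        w = rng.standard_normal((width, width)) * np.sqrt(2.0 / width)
        x = x + w @ np.maximum(x, 0.0)
    return log_ratio + np.log(x @ x)

# Hold depth/width fixed at 1; the log-norm ratio fluctuates across draws.
samples = [log_norm_ratio(depth=50, width=50, seed=s) for s in range(100)]
```

At fixed depth the spread of these samples would shrink as the width grows; keeping the ratio $d/n$ constant keeps the fluctuations alive, which is the regime the paper characterizes.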
11 Feb 2021 • Mufan Bill Li, Maxime Gazeau
We propose a novel approach to analyze generalization error for discretizations of Langevin diffusion, such as the stochastic gradient Langevin dynamics (SGLD).
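For reference, the SGLD iteration being analyzed is a stochastic gradient step plus injected Gaussian noise matched to the Langevin discretization. A minimal sketch on an illustrative quadratic loss (with the minibatch gradient noise simulated by an added Gaussian), not the paper's analysis setting:

```python
import numpy as np

def sgld(grad_loss, w0, step=1e-2, n_steps=20000, noise_scale=0.1, seed=0):
    """Stochastic gradient Langevin dynamics: a step along a noisy
    (minibatch-style) loss gradient plus injected Gaussian noise, so the
    iterates approximately sample from exp(-loss) instead of collapsing to
    a single minimizer."""
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float)
    trace = np.empty((n_steps,) + w.shape)
    for t in range(n_steps):
        g = grad_loss(w) + noise_scale * rng.standard_normal(w.shape)  # simulated minibatch noise
        w = w - step * g + np.sqrt(2 * step) * rng.standard_normal(w.shape)
        trace[t] = w
    return trace

# Illustrative loss L(w) = w^2 / 2, so the target is approximately N(0, 1).
trace = sgld(lambda w: w, w0=np.array([3.0]))
```

Because the iterates sample a distribution over parameters rather than a point, generalization error can be analyzed through properties of that distribution, which is the angle the paper takes.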
21 Oct 2020 • Mufan Bill Li, Murat A. Erdogdu
We propose a Langevin diffusion-based algorithm for non-convex optimization and sampling on a product manifold of spheres.
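A common way to realize Langevin dynamics on a sphere, sketched below for a single sphere: project both the gradient and the Gaussian noise onto the tangent space, take a Langevin step, then retract back to the manifold by normalization. This projection-retraction scheme and the target density are illustrative, not the paper's algorithm.

```python
import numpy as np

def sphere_langevin(grad_f, x0, step=1e-2, n_steps=5000, seed=0):
    """Riemannian Langevin sketch on the unit sphere, targeting a density
    proportional to exp(-f) with respect to the surface measure."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    x = x / np.linalg.norm(x)
    trace = np.empty((n_steps,) + x.shape)
    for t in range(n_steps):
        g = grad_f(x)
        g_tan = g - (g @ x) * x              # tangent component of grad f
        xi = rng.standard_normal(x.shape)
        xi_tan = xi - (xi @ x) * x           # tangent component of the noise
        x = x - step * g_tan + np.sqrt(2 * step) * xi_tan
        x = x / np.linalg.norm(x)            # retraction back to the sphere
        trace[t] = x
    return trace

# Illustrative target on S^2: density proportional to exp(5 * x[0]),
# i.e. f(x) = -5 * x[0], concentrated around the first coordinate axis.
trace = sphere_langevin(lambda x: np.array([-5.0, 0.0, 0.0]),
                        np.array([0.0, 0.0, 1.0]))
```

On a product of spheres the same step is applied blockwise, one sphere per block; tempering the noise recovers an optimization method from the sampler.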