no code implementations • 3 Jun 2023 • Chirag Pabbaraju, Dhruv Rohatgi, Anish Sevekari, Holden Lee, Ankur Moitra, Andrej Risteski
In this work, we give the first example of a natural exponential family of distributions for which the score matching loss is computationally efficient to optimize and has statistical efficiency comparable to maximum likelihood (ML), while the ML loss is intractable to optimize using gradient-based methods.
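For intuition on why score matching sidesteps the intractable normalizing constant (an illustration, not the paper's construction): for an exponential family $p_\theta(x) \propto \exp(\theta^\top T(x))$, the Hyvärinen score matching loss is quadratic in $\theta$, so the estimator solves a linear system. A minimal numpy sketch, assuming the user supplies `grad_T` and `lap_T`, the Jacobian and coordinatewise Laplacian of the sufficient statistics $T$:

```python
import numpy as np

def score_matching_estimate(X, grad_T, lap_T):
    """Hyvarinen score matching for p_theta(x) ~ exp(theta . T(x)).

    The loss E[0.5 * ||J_T(x)^T theta||^2 + theta . lap_T(x)] is quadratic
    in theta, so the minimizer solves a linear system -- no normalizing
    constant (and no MCMC) is ever needed.

    X: (n, d) samples; grad_T(x): (k, d) Jacobian; lap_T(x): (k,) Laplacian.
    """
    k = lap_T(X[0]).shape[0]
    A = np.zeros((k, k))   # accumulates E[J_T J_T^T]
    b = np.zeros(k)        # accumulates E[lap T]
    for x in X:
        J = grad_T(x)
        A += J @ J.T
        b += lap_T(x)
    A /= len(X); b /= len(X)
    return np.linalg.solve(A + 1e-8 * np.eye(k), -b)

# Example: Gaussian family, T(x) = (x, -x^2/2) in 1-D, theta = (mu/s^2, 1/s^2).
rng = np.random.default_rng(0)
X = rng.normal(1.0, 2.0, size=(5000, 1))
theta = score_matching_estimate(
    X,
    grad_T=lambda x: np.array([[1.0], [-x[0]]]),
    lap_T=lambda x: np.array([0.0, -1.0]),
)
print(theta)  # approx (0.25, 0.25) for mu = 1, sigma = 2
```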
no code implementations • 19 May 2023 • Sitan Chen, Sinho Chewi, Holden Lee, Yuanzhi Li, Jianfeng Lu, Adil Salim
We provide the first polynomial-time convergence guarantees for the probability flow ODE implementation (together with a corrector step) of score-based generative modeling.
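A minimal sketch of a predictor-corrector sampler of this kind, for the variance-preserving (Ornstein-Uhlenbeck) forward process and a user-supplied score estimate; a toy illustration under those assumptions, not the discretization analyzed in the paper:

```python
import numpy as np

def sample_pf_ode(score, d, T=1.0, n_steps=500, n_corrector=2, eta=1e-3, rng=None):
    """Predictor-corrector sampler: Euler steps on the probability flow ODE
    of the diffusion dx = -x/2 dt + dW (so the ODE is
    dx/dt = -x/2 - score(x, t)/2), each followed by a few Langevin
    'corrector' steps at the current noise level.

    score(x, t) estimates grad_x log p_t(x); in practice a trained network,
    here any callable."""
    rng = rng or np.random.default_rng()
    x = rng.standard_normal(d)      # start from the stationary prior N(0, I)
    dt = T / n_steps
    for i in range(n_steps):
        t = T - i * dt
        # predictor: one reverse Euler step of the probability flow ODE
        x = x + dt * (0.5 * x + 0.5 * score(x, t))
        # corrector: Langevin MCMC that leaves p_t invariant
        for _ in range(n_corrector):
            x = x + eta * score(x, t) + np.sqrt(2 * eta) * rng.standard_normal(d)
    return x

# Toy check: for p_0 = N(m, I), p_t = N(e^{-t/2} m, I), score = e^{-t/2} m - x.
m = np.full(2, 3.0)
x = sample_pf_ode(lambda x, t: np.exp(-t / 2) * m - x, d=2, T=5.0)
print(x)  # a rough sample near m = (3, 3)
```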
no code implementations • 3 Apr 2023 • Holden Lee, Zeyu Shen
In this paper, we present a new lower bound on the spectral gap of parallel tempering that has polynomial dependence on all parameters except $\log L$, where $(L + 1)$ is the number of levels.
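For reference, the chain in question alternates within-level moves with swaps between adjacent temperature levels; a minimal sketch, with a Gaussian random-walk Metropolis move as a simplifying assumption:

```python
import numpy as np

def parallel_tempering(log_p, betas, x0, n_iter=10_000, step=0.5, rng=None):
    """Minimal parallel tempering sketch: one chain per inverse temperature
    beta_k, each doing a random-walk Metropolis move on beta_k * log_p,
    followed by a Metropolis-accepted swap of a random adjacent pair."""
    rng = rng or np.random.default_rng()
    xs = [np.array(x0, float) for _ in betas]
    for _ in range(n_iter):
        for k, beta in enumerate(betas):            # within-level moves
            prop = xs[k] + step * rng.standard_normal(xs[k].shape)
            if np.log(rng.random()) < beta * (log_p(prop) - log_p(xs[k])):
                xs[k] = prop
        k = rng.integers(len(betas) - 1)            # swap levels k and k+1
        log_acc = (betas[k] - betas[k + 1]) * (log_p(xs[k + 1]) - log_p(xs[k]))
        if np.log(rng.random()) < log_acc:
            xs[k], xs[k + 1] = xs[k + 1], xs[k]
    return xs[-1]   # the sample at beta = 1 (assuming betas[-1] == 1)

# Bimodal target: mixture of N(-4, 1) and N(4, 1).
log_p = lambda x: np.logaddexp(-0.5 * (x + 4) ** 2, -0.5 * (x - 4) ** 2).sum()
print(parallel_tempering(log_p, betas=[0.05, 0.2, 0.5, 1.0], x0=[0.0]))
```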
no code implementations • 3 Nov 2022 • Hongrui Chen, Holden Lee, Jianfeng Lu
We give an improved theoretical analysis of score-based generative modeling.
no code implementations • 5 Oct 2022 • Sinho Chewi, Patrik Gerber, Holden Lee, Chen Lu
We prove two lower bounds for the complexity of non-log-concave sampling within the framework of Balasubramanian et al. (2022), who introduced the use of Fisher information (FI) bounds as a notion of approximate first-order stationarity in sampling.
no code implementations • 1 Oct 2022 • Holden Lee, Chirag Pabbaraju, Anish Sevekari, Andrej Risteski
Noise Contrastive Estimation (NCE) is a popular approach for learning probability density functions parameterized up to a constant of proportionality.
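As a concrete instance of the setup (a toy illustration, not the paper's analysis): NCE fits an unnormalized model by logistic regression that discriminates data from samples of a known noise distribution $q$, with the log-partition function absorbed into the parameters as a free offset. A numpy sketch for a 1-D Gaussian model:

```python
import numpy as np

def nce_fit(X, Y, n_iter=300, lr=0.5):
    """NCE sketch for an unnormalized 1-D Gaussian model
    log p_theta(x) = -(x - mu)**2 / 2 + c with theta = (mu, c), where the
    free offset c stands in for the unknown log normalizing constant.
    Classify data X against noise Y drawn from a known q (here N(0, 1))."""
    log_q = lambda z: -0.5 * z**2 - 0.5 * np.log(2 * np.pi)
    Z = np.concatenate([X, Y])
    y = np.concatenate([np.ones(len(X)), np.zeros(len(Y))])
    mu, c = 0.0, 0.0
    for _ in range(n_iter):
        logit = (-0.5 * (Z - mu) ** 2 + c) - log_q(Z)   # log p_theta - log q
        p = 1.0 / (1.0 + np.exp(-logit))                # P(point is data)
        mu -= lr * np.mean((p - y) * (Z - mu))          # d logit / d mu = Z - mu
        c -= lr * np.mean(p - y)                        # d logit / d c = 1
    return mu, c

rng = np.random.default_rng(0)
X = rng.normal(2.0, 1.0, 4000)   # data samples
Y = rng.normal(0.0, 1.0, 4000)   # noise samples from q
print(nce_fit(X, Y))  # mu near 2.0, c near -0.5*log(2*pi) ~ -0.92
```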
no code implementations • 26 Sep 2022 • Holden Lee, Jianfeng Lu, Yixin Tan
Score-based generative modeling (SGM) has grown to be a hugely successful method for learning to generate samples from complex data distributions such as those of images and audio.
no code implementations • 13 Jun 2022 • Holden Lee, Jianfeng Lu, Yixin Tan
Using our guarantee, we give a theoretical analysis of score-based generative modeling, which transforms white-noise input into samples from a learned data distribution given score estimates at different noise scales.
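A minimal sketch of such a sampler in the annealed Langevin style: start from scaled white noise and run Langevin dynamics at a decreasing sequence of noise scales, driven by a user-supplied score estimate `score(x, sigma)` (here an exact toy score, not a trained network):

```python
import numpy as np

def annealed_langevin(score, d, sigmas, steps_per_level=100, eps=0.1, rng=None):
    """Annealed Langevin sketch: start from white noise and, at each noise
    scale sigma (largest to smallest), run Langevin dynamics driven by the
    estimated score of the sigma-smoothed data distribution.
    score(x, sigma) estimates grad_x log p_sigma(x)."""
    rng = rng or np.random.default_rng()
    x = sigmas[0] * rng.standard_normal(d)      # white-noise initialization
    for sigma in sigmas:                         # anneal sigma downwards
        step = eps * sigma**2                    # scale step to noise level
        for _ in range(steps_per_level):
            x = x + step * score(x, sigma) + np.sqrt(2 * step) * rng.standard_normal(d)
    return x

# Toy data distribution N(m, I): p_sigma = N(m, (1 + sigma^2) I), so the
# smoothed score is (m - x) / (1 + sigma^2).
m = np.array([5.0, -5.0])
score = lambda x, s: (m - x) / (1 + s**2)
print(annealed_langevin(score, d=2, sigmas=np.geomspace(10, 0.01, 20)))
```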
no code implementations • 17 Feb 2022 • Frederic Koehler, Holden Lee, Andrej Risteski
We consider Ising models on the hypercube with a general interaction matrix $J$, and give a polynomial-time sampling algorithm when all but $O(1)$ eigenvalues of $J$ lie in an interval of length one, a situation which occurs in many models of interest.
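For orientation, the basic single-site chain for such models is Glauber dynamics; the sketch below is this plain baseline, not the paper's algorithm (which handles the $O(1)$ outlying eigenvalues of $J$ via a mixture decomposition):

```python
import numpy as np

def glauber(J, n_iter=20_000, rng=None):
    """Plain Glauber dynamics for the Ising model p(x) ~ exp(x^T J x) on
    {-1, +1}^n: repeatedly resample one uniformly random coordinate from
    its conditional distribution given the rest."""
    rng = rng or np.random.default_rng()
    n = J.shape[0]
    x = rng.choice([-1.0, 1.0], size=n)
    for _ in range(n_iter):
        i = rng.integers(n)
        # conditional field on x_i: h = 2 * sum_{j != i} J_ij x_j
        h = 2.0 * (J[i] @ x - J[i, i] * x[i])
        x[i] = 1.0 if rng.random() < 1.0 / (1.0 + np.exp(-2 * h)) else -1.0
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((10, 10)) * 0.05
print(glauber((A + A.T) / 2))   # J must be symmetric
```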
no code implementations • NeurIPS 2021 • Holden Lee, Chirag Pabbaraju, Anish Prasad Sevekari, Andrej Risteski
As ill-conditioned Jacobians are an obstacle for likelihood-based training, the fundamental question remains: which distributions can be approximated using well-conditioned affine coupling flows?
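To fix notation (a standard RealNVP-style block, assumed rather than taken from the paper): an affine coupling layer leaves half the coordinates unchanged and applies an invertible elementwise affine map to the other half, so the Jacobian is triangular and well-conditioning amounts to keeping the log-scales $s$ bounded:

```python
import numpy as np

def affine_coupling(x, s_net, t_net, forward=True):
    """One affine coupling block: the first half of the coordinates passes
    through unchanged and parameterizes an invertible elementwise affine map
    of the second half.  The Jacobian is triangular, so log|det| = sum(s);
    bounding s is what 'well-conditioned' asks for.  s_net and t_net are
    arbitrary callables (neural networks in practice)."""
    d = len(x) // 2
    x1, x2 = x[:d], x[d:]
    s, t = s_net(x1), t_net(x1)
    y2 = x2 * np.exp(s) + t if forward else (x2 - t) * np.exp(-s)
    return np.concatenate([x1, y2]), np.sum(s) * (1 if forward else -1)

# Round-trip check with toy "networks".
s_net = lambda z: np.tanh(z)     # bounded s => well-conditioned Jacobian
t_net = lambda z: 2 * z
x = np.array([0.3, -1.2, 0.7, 2.0])
y, logdet = affine_coupling(x, s_net, t_net)
x_back, _ = affine_coupling(y, s_net, t_net, forward=False)
print(np.allclose(x, x_back), logdet)
```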
1 code implementation • NeurIPS 2021 • Holden Lee
Identification of a linear time-invariant dynamical system from partial observations is a fundamental problem in control theory.
no code implementations • 30 Sep 2020 • Rong Ge, Holden Lee, Jianfeng Lu, Andrej Risteski
We give an algorithm for exact sampling from the Bingham distribution $p(x)\propto \exp(x^\top A x)$ on the sphere $\mathcal S^{d-1}$ with expected runtime of $\operatorname{poly}(d, \lambda_{\max}(A)-\lambda_{\min}(A))$.
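For contrast with the paper's guarantee, here is the naive exact sampler: rejection from the uniform distribution on the sphere. It is correct but may need $e^{\lambda_{\max}-\lambda_{\min}}$ trials, which is precisely the dependence the paper improves to polynomial:

```python
import numpy as np

def bingham_rejection(A, rng=None):
    """Exact-but-naive sampler for p(x) ~ exp(x^T A x) on the unit sphere:
    propose x uniformly on S^{d-1}, accept with prob exp(x^T A x - lam_max).
    The expected number of trials can be exp(lam_max - lam_min)."""
    rng = rng or np.random.default_rng()
    lam_max = np.linalg.eigvalsh(A)[-1]
    while True:
        x = rng.standard_normal(A.shape[0])
        x /= np.linalg.norm(x)                 # uniform on the sphere
        if np.log(rng.random()) < x @ A @ x - lam_max:
            return x

A = np.diag([2.0, 0.0, -2.0])
print(bingham_rejection(A))   # samples favor +/- the top eigenvector e_1
```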
no code implementations • 6 Feb 2020 • Udaya Ghai, Holden Lee, Karan Singh, Cyril Zhang, Yi Zhang
This requires a refined regret analysis, including a structural lemma showing that the current state of the system is a small linear combination of past states, even if the state grows polynomially.
no code implementations • 8 Nov 2019 • Rong Ge, Holden Lee, Jianfeng Lu
Estimating the normalizing constant of an unnormalized probability distribution has important applications in computer science, statistical physics, machine learning, and statistics.
1 code implementation • NeurIPS 2019 • Rohith Kuditipudi, Xiang Wang, Holden Lee, Yi Zhang, Zhiyuan Li, Wei Hu, Sanjeev Arora, Rong Ge
Mode connectivity is a surprising phenomenon in the loss landscape of deep nets.
no code implementations • 23 May 2019 • Holden Lee, Cyril Zhang
The optimal predictor for a linear dynamical system (with hidden state and Gaussian noise) takes the form of an autoregressive linear filter, namely the Kalman filter.
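This autoregressive form suggests learning the filter coefficients directly by least squares on past observations, rather than identifying the system first; a minimal sketch (an illustration of the predictor class, not the paper's algorithm):

```python
import numpy as np

def fit_ar_predictor(y, p):
    """Fit an order-p autoregressive predictor y_t ~ sum_i c_i y_{t-1-i} by
    least squares.  For a linear dynamical system with hidden state and
    Gaussian noise, the optimal steady-state Kalman predictor has exactly
    this autoregressive form, with coefficients decaying in the lag."""
    rows = np.array([y[t - p:t][::-1] for t in range(p, len(y))])
    c, *_ = np.linalg.lstsq(rows, y[p:], rcond=None)
    return c    # c[i] multiplies y_{t-1-i}

# Simulate a scalar LDS: h_t = 0.9 h_{t-1} + w_t, y_t = h_t + v_t.
rng = np.random.default_rng(0)
h, ys = 0.0, []
for _ in range(5000):
    h = 0.9 * h + rng.normal(scale=0.5)
    ys.append(h + rng.normal(scale=1.0))
c = fit_ar_predictor(np.array(ys), p=10)
print(np.round(c, 3))   # geometrically decaying filter coefficients
```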
1 code implementation • NeurIPS 2019 • Holden Lee, Oren Mangoubi, Nisheeth K. Vishnoi
Given a sequence of convex functions $f_0, f_1, \ldots, f_T$, we study the problem of sampling from the Gibbs distribution $\pi_t \propto e^{-\sum_{k=0}^tf_k}$ for each epoch $t$ in an online manner.
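A minimal sketch of the online setting, with warm-started Langevin dynamics as a simplifying stand-in for the algorithm analyzed in the paper: at each epoch, add the new $f_t$ to the running potential and take a few Langevin steps from the previous sample:

```python
import numpy as np

def online_gibbs_langevin(grad_fs, d, steps_per_epoch=50, eta=1e-2, rng=None):
    """Online sampling sketch: at epoch t the target is
    pi_t ~ exp(-sum_{k<=t} f_k).  Warm-start from the epoch-(t-1) sample and
    take a few Langevin steps using the running gradient sum_{k<=t} grad f_k.
    grad_fs is the stream of gradient callables, one per epoch."""
    rng = rng or np.random.default_rng()
    x = rng.standard_normal(d)
    grads_so_far = []
    for grad_f in grad_fs:                    # epochs arrive one by one
        grads_so_far.append(grad_f)
        for _ in range(steps_per_epoch):
            g = sum(gf(x) for gf in grads_so_far)
            x = x - eta * g + np.sqrt(2 * eta) * rng.standard_normal(d)
        yield x                               # one sample per epoch

# f_k(x) = ||x - a_k||^2 / 2: pi_t is Gaussian with mean avg(a_0, ..., a_t).
rng = np.random.default_rng(0)
targets = [rng.standard_normal(3) for _ in range(20)]
grad_fs = [(lambda x, a=a: x - a) for a in targets]
samples = list(online_gibbs_langevin(grad_fs, d=3))
print(samples[-1], np.mean(targets, axis=0))  # final sample near running mean
```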
no code implementations • 29 Nov 2018 • Rong Ge, Holden Lee, Andrej Risteski
Previous approaches rely on decomposing the state space into a partition of sets, while our approach can be thought of as decomposing the stationary measure into a mixture of distributions (a "soft partition").
no code implementations • NeurIPS 2018 • Elad Hazan, Holden Lee, Karan Singh, Cyril Zhang, Yi Zhang
We give a polynomial-time algorithm for learning latent-state linear dynamical systems without system identification, and without assumptions on the spectral radius of the system's transition matrix.
no code implementations • ICLR 2018 • Sanjeev Arora, Elad Hazan, Holden Lee, Karan Singh, Cyril Zhang, Yi Zhang
We study the control of symmetric linear dynamical systems with unknown dynamics and a hidden state.
no code implementations • NeurIPS 2018 • Rong Ge, Holden Lee, Andrej Risteski
We analyze this Markov chain for the canonical multi-modal distribution: a mixture of Gaussians (of equal variance).
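The chain being analyzed combines Langevin dynamics with simulated tempering; a minimal sketch with uniform level weights as a simplifying assumption (the analyzed algorithm estimates partition-function weights for each level):

```python
import numpy as np

def simulated_tempering_langevin(log_p, grad_log_p, d, betas,
                                 n_iter=20_000, eta=1e-2, rng=None):
    """Simulated tempering Langevin sketch: the state is (x, level k); do a
    Langevin step for p^{beta_k}, then propose moving k up or down one
    level, accepted by Metropolis.  Uniform level weights are assumed for
    simplicity.  Samples collected while beta_k = 1 follow p approximately."""
    rng = rng or np.random.default_rng()
    x, k, samples = rng.standard_normal(d), 0, []
    for _ in range(n_iter):
        x = x + eta * betas[k] * grad_log_p(x) \
              + np.sqrt(2 * eta) * rng.standard_normal(d)
        k2 = k + rng.choice([-1, 1])          # propose an adjacent level
        if 0 <= k2 < len(betas) and \
           np.log(rng.random()) < (betas[k2] - betas[k]) * log_p(x):
            k = k2
        if k == len(betas) - 1:
            samples.append(x.copy())
    return np.array(samples)

# Mixture of two equal-variance Gaussians at -4 and 4: plain Langevin started
# in one mode essentially never crosses; tempering moves between modes.
mus = np.array([-4.0, 4.0])
log_p = lambda x: np.logaddexp(-0.5 * (x - mus[0]) ** 2,
                               -0.5 * (x - mus[1]) ** 2).sum()
def grad_log_p(x):
    w = np.exp(-0.5 * (x - mus) ** 2); w /= w.sum()   # mode responsibilities
    return (w * (mus - x)).sum(keepdims=True)
S = simulated_tempering_langevin(log_p, grad_log_p, 1, betas=[0.1, 0.3, 1.0])
print((S > 0).mean())   # both modes visited: fraction near 0.5
```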
no code implementations • 22 Feb 2017 • Holden Lee, Rong Ge, Tengyu Ma, Andrej Risteski, Sanjeev Arora
We take a first cut at explaining the expressivity of multilayer nets by giving a sufficient criterion for a function to be approximable by a neural network with $n$ hidden layers.