no code implementations • 17 Feb 2022 • Frederic Koehler, Holden Lee, Andrej Risteski
We consider Ising models on the hypercube with a general interaction matrix $J$, and give a polynomial time sampling algorithm when all but $O(1)$ eigenvalues of $J$ lie in an interval of length one, a situation which occurs in many models of interest.
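For orientation, here is a minimal sketch of Glauber dynamics, the classical single-site sampler for such Ising models, with distribution $p(x)\propto \exp(x^\top J x + h^\top x)$ on $\{-1,1\}^d$. This is only a baseline for intuition, not the paper's algorithm (which succeeds in regimes where Glauber dynamics can mix slowly).

```python
import numpy as np

def glauber_step(x, J, h, rng):
    """One Glauber update for p(x) ∝ exp(x^T J x + h^T x), x in {-1,1}^d.

    J is assumed symmetric; x is modified in place.
    """
    d = len(x)
    i = rng.integers(d)
    # local field on coordinate i from all other spins
    field = J[i] @ x - J[i, i] * x[i]
    # log-odds of x_i = +1 vs -1 is 4*field + 2*h[i]
    p_plus = 1.0 / (1.0 + np.exp(-4.0 * field - 2.0 * h[i]))
    x[i] = 1 if rng.random() < p_plus else -1
    return x
```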
no code implementations • NeurIPS 2021 • Holden Lee, Chirag Pabbaraju, Anish Prasad Sevekari, Andrej Risteski
As ill-conditioned Jacobians are an obstacle for likelihood-based training, the fundamental question remains: which distributions can be approximated using well-conditioned affine coupling flows?
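A minimal sketch of a single affine coupling layer (with placeholder networks `s_fn` and `t_fn`, which are assumptions here) makes the conditioning question concrete: the Jacobian is triangular, and its diagonal is controlled by the scale output `s`.

```python
import numpy as np

def affine_coupling(x, s_fn, t_fn):
    """One affine coupling layer: identity on the first half of the
    coordinates, elementwise affine map on the second half."""
    d = x.shape[-1] // 2
    x1, x2 = x[..., :d], x[..., d:]
    s, t = s_fn(x1), t_fn(x1)
    y2 = x2 * np.exp(s) + t
    # The Jacobian is triangular with diagonal (1, ..., 1, exp(s)),
    # so log|det| = s.sum(); keeping the layer well-conditioned
    # requires the entries of s (and the off-diagonal block) bounded.
    log_det = s.sum(axis=-1)
    return np.concatenate([x1, y2], axis=-1), log_det
```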
1 code implementation • NeurIPS 2021 • Holden Lee
Identification of a linear time-invariant dynamical system from partial observations is a fundamental problem in control theory.
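As a hedged illustration of the standard pipeline (not necessarily this paper's method), one can estimate the system's Markov parameters by regressing outputs on a window of past inputs; a state-space realization can then be recovered with, e.g., the Ho-Kalman algorithm.

```python
import numpy as np

def estimate_markov_params(U, Y, L):
    """Least-squares estimate of the first L Markov parameters of an LTI
    system from an input/output trajectory (U: (T, m), Y: (T, p))."""
    T, p = Y.shape
    m = U.shape[1]
    # regressors: stacked past inputs [u_t, u_{t-1}, ..., u_{t-L+1}]
    X = np.hstack([U[L - 1 - k : T - k] for k in range(L)])  # (T-L+1, L*m)
    G, *_ = np.linalg.lstsq(X, Y[L - 1 :], rcond=None)
    # G[:, 0] ≈ D, G[:, k] ≈ C A^{k-1} B for k >= 1
    return G.T.reshape(p, L, m)
```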
no code implementations • 30 Sep 2020 • Rong Ge, Holden Lee, Jianfeng Lu, Andrej Risteski
We give an algorithm for exact sampling from the Bingham distribution $p(x)\propto \exp(x^\top A x)$ on the sphere $\mathcal S^{d-1}$ with expected runtime of $\operatorname{poly}(d, \lambda_{\max}(A)-\lambda_{\min}(A))$.
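For contrast, here is the naive exact rejection sampler with a uniform proposal; it is correct but its acceptance rate can be exponentially small in general, which is what a $\operatorname{poly}(d, \lambda_{\max}(A)-\lambda_{\min}(A))$-time algorithm must avoid.

```python
import numpy as np

def bingham_rejection(A, rng, max_tries=100_000):
    """Naive exact rejection sampler for p(x) ∝ exp(x^T A x) on the sphere.

    Correct because x^T A x <= λ_max(A) for unit x, so the acceptance
    probability exp(x^T A x - λ_max) lies in (0, 1]; can be very slow.
    """
    d = A.shape[0]
    lam_max = np.linalg.eigvalsh(A)[-1]
    for _ in range(max_tries):
        x = rng.normal(size=d)
        x /= np.linalg.norm(x)  # uniform direction on S^{d-1}
        if rng.random() < np.exp(x @ A @ x - lam_max):
            return x
    raise RuntimeError("acceptance rate too low")
```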
no code implementations • 6 Feb 2020 • Udaya Ghai, Holden Lee, Karan Singh, Cyril Zhang, Yi Zhang
This requires a refined regret analysis, including a structural lemma showing the current state of the system to be a small linear combination of past states, even if the state grows polynomially.
no code implementations • 8 Nov 2019 • Rong Ge, Holden Lee, Jianfeng Lu
Estimating the normalizing constant of an unnormalized probability distribution has important applications in computer science, statistical physics, machine learning, and statistics.
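A standard reference point is the annealing (telescoping-product) estimator sketched below, based on the identity $Z_{\beta'}/Z_\beta = \mathbb{E}_{x\sim p_\beta}[e^{-(\beta'-\beta)f(x)}]$ for $Z_\beta = \int e^{-\beta f(x)}\,dx$. Here `sampler` is an assumed black box producing approximate samples at each inverse temperature.

```python
import numpy as np

def log_partition_ratio(f, sampler, betas, n_samples, rng):
    """Annealing estimator for log(Z_{betas[-1]} / Z_{betas[0]}).

    sampler(beta, n, rng) must return n approximate samples from
    p_beta(x) ∝ exp(-beta * f(x)).
    """
    log_ratio = 0.0
    for bk, bk1 in zip(betas[:-1], betas[1:]):
        xs = sampler(bk, n_samples, rng)
        w = np.exp(-(bk1 - bk) * np.array([f(x) for x in xs]))
        log_ratio += np.log(w.mean())  # one telescoping factor
    return log_ratio
```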
1 code implementation • NeurIPS 2019 • Rohith Kuditipudi, Xiang Wang, Holden Lee, Yi Zhang, Zhiyuan Li, Wei Hu, Sanjeev Arora, Rong Ge
Mode connectivity is a surprising phenomenon in the loss landscape of deep nets.
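The phenomenon is easy to probe empirically: evaluate the loss along the straight segment between two independently trained solutions (a sketch with placeholder `loss_fn` and flattened parameter vectors). Mode connectivity asserts that some low-loss path exists even when this straight segment shows a barrier.

```python
import numpy as np

def path_losses(loss_fn, theta_a, theta_b, n_points=21):
    """Loss along the linear interpolation between two parameter vectors."""
    ts = np.linspace(0.0, 1.0, n_points)
    return np.array([loss_fn((1 - t) * theta_a + t * theta_b) for t in ts])
```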
no code implementations • 23 May 2019 • Holden Lee, Cyril Zhang
The optimal predictor for a linear dynamical system (with hidden state and Gaussian noise) takes the form of an autoregressive linear filter, namely the Kalman filter.
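Since the optimal predictor is an autoregressive linear filter, the natural improper learner is least-squares regression of each observation on a window of past observations. A minimal sketch for scalar observations (window length `L` is an assumption, not a tuned parameter):

```python
import numpy as np

def fit_ar_filter(Y, L):
    """Fit a length-L autoregressive predictor y_t ≈ sum_k c_k y_{t-k}
    by least squares from a scalar observation sequence Y of length T."""
    T = len(Y)
    # regressors: [y_{t-1}, ..., y_{t-L}] for t = L, ..., T-1
    X = np.column_stack([Y[L - k : T - k] for k in range(1, L + 1)])
    c, *_ = np.linalg.lstsq(X, Y[L:], rcond=None)
    return c
```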
1 code implementation • NeurIPS 2019 • Holden Lee, Oren Mangoubi, Nisheeth K. Vishnoi
Given a sequence of convex functions $f_0, f_1, \ldots, f_T$, we study the problem of sampling from the Gibbs distribution $\pi_t \propto e^{-\sum_{k=0}^tf_k}$ for each epoch $t$ in an online manner.
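A minimal warm-started Langevin sketch illustrates the online setting: at each epoch, run a few unadjusted Langevin steps on the running sum of functions, initialized at the previous epoch's iterate. The step size `eta` and step count are illustrative assumptions, not the paper's tuned parameters.

```python
import numpy as np

def online_langevin(grad_fs, d, eta, n_steps, rng):
    """Warm-started unadjusted Langevin for pi_t ∝ exp(-sum_{k<=t} f_k).

    grad_fs[k](x) returns the gradient of f_k at x (shape (d,)).
    """
    x = rng.normal(size=d)
    samples = []
    grad_sum = lambda z, t: sum(g(z) for g in grad_fs[: t + 1])
    for t in range(len(grad_fs)):
        for _ in range(n_steps):
            x = x - eta * grad_sum(x, t) + np.sqrt(2 * eta) * rng.normal(size=d)
        samples.append(x.copy())  # approximate sample from pi_t
    return samples
```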
no code implementations • 29 Nov 2018 • Rong Ge, Holden Lee, Andrej Risteski
Previous approaches rely on decomposing the state space as a partition of sets, while our approach can be thought of as decomposing the stationary measure as a mixture of distributions (a "soft partition").
no code implementations • NeurIPS 2018 • Elad Hazan, Holden Lee, Karan Singh, Cyril Zhang, Yi Zhang
We give a polynomial-time algorithm for learning latent-state linear dynamical systems without system identification, and without assumptions on the spectral radius of the system's transition matrix.
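For the symmetric case, the earlier spectral-filtering approach featurizes the input history with the top eigenvectors of a fixed Hankel matrix; a sketch of those filters follows (the extension to general systems in this paper uses a more involved filter bank).

```python
import numpy as np

def spectral_filters(T, k):
    """Top-k eigenpairs of the Hankel matrix Z_{ij} = 2 / ((i+j)^3 - (i+j))
    (1-indexed), used to convolve input history into features without
    identifying the system's transition matrix."""
    s = np.arange(1, T + 1)[:, None] + np.arange(1, T + 1)[None, :]
    Z = 2.0 / (s ** 3 - s)
    vals, vecs = np.linalg.eigh(Z)       # ascending order
    return vals[-k:], vecs[:, -k:]       # largest eigenpairs
```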
no code implementations • ICLR 2018 • Sanjeev Arora, Elad Hazan, Holden Lee, Karan Singh, Cyril Zhang, Yi Zhang
We study the control of symmetric linear dynamical systems with unknown dynamics and a hidden state.
no code implementations • NeurIPS 2018 • Rong Ge, Holden Lee, Andrej Risteski
We analyze this Markov chain for the canonical multimodal distribution: a mixture of Gaussians (of equal variance).
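A sketch of one simulated-tempering sweep, simplified by assuming equal level weights (a full implementation reweights levels by estimated partition functions): a Langevin move at the current inverse temperature, then a Metropolis proposal to an adjacent temperature level.

```python
import numpy as np

def tempering_step(x, i, logp, grad_logp, betas, eta, rng):
    """One sweep of simulated tempering for levels p(x)^{betas[i]},
    assuming equal level weights (a simplifying assumption)."""
    # Langevin move at inverse temperature betas[i]
    x = x + eta * betas[i] * grad_logp(x) \
          + np.sqrt(2 * eta) * rng.normal(size=x.shape)
    # Metropolis proposal to a neighboring level; acceptance ratio is
    # p(x)^{beta_j} / p(x)^{beta_i} under the equal-weights assumption
    j = i + int(rng.choice([-1, 1]))
    if 0 <= j < len(betas) and np.log(rng.random()) < (betas[j] - betas[i]) * logp(x):
        i = j
    return x, i
```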
no code implementations • 22 Feb 2017 • Holden Lee, Rong Ge, Tengyu Ma, Andrej Risteski, Sanjeev Arora
We take a first cut at explaining the expressivity of multilayer nets by giving a sufficient criterion for a function to be approximable by a neural network with $n$ hidden layers.