no code implementations • 6 Mar 2024 • Arun Jambulapati, Syamantak Kumar, Jerry Li, Shourya Pandey, Ankit Pensia, Kevin Tian
The $k$-principal component analysis ($k$-PCA) problem is a fundamental algorithmic primitive that is widely used in data analysis and dimensionality reduction applications.
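For reference, a minimal NumPy sketch of the exact $k$-PCA baseline (the top-$k$ eigenvectors of the empirical second-moment matrix) that approximate and robust variants of the problem relax; an illustrative baseline under standard definitions, not the paper's reductions:

```python
import numpy as np

def k_pca(X, k):
    """Exact k-PCA baseline: top-k eigenvectors of the empirical
    second-moment matrix of the rows of X (uncentered, for simplicity)."""
    M = X.T @ X / X.shape[0]           # empirical d x d second-moment matrix
    vals, vecs = np.linalg.eigh(M)     # eigenvalues in ascending order
    return vecs[:, -k:][:, ::-1]       # d x k, leading eigenvector first

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8)) @ np.diag([3.0, 2.0] + [1.0] * 6)
U = k_pca(X, 2)                        # estimated top-2 principal subspace
```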
1 code implementation • 20 Feb 2024 • Lunjia Hu, Kevin Tian, Chutong Yang
Motivated by [BGHN23], which proposed a rigorous framework for measuring distances to calibration, we initiate the algorithmic study of calibration through the lens of property testing.
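For intuition, a short sketch of the standard binned expected calibration error (ECE), one of the simpler calibration measures whose shortcomings motivated the distances of [BGHN23]; this is an illustrative baseline measure, not the paper's tester:

```python
import numpy as np

def binned_ece(preds, labels, n_bins=10):
    """Binned expected calibration error of predictions in [0, 1] against
    binary labels: sum over bins of (bin mass) * |avg pred - avg label|."""
    preds, labels = np.asarray(preds, float), np.asarray(labels, float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (preds >= lo) & ((preds < hi) | (hi == 1.0))
        if mask.any():
            ece += mask.mean() * abs(preds[mask].mean() - labels[mask].mean())
    return ece
```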
no code implementations • 7 Aug 2023 • Jonathan A. Kelner, Jerry Li, Allen Liu, Aaron Sidford, Kevin Tian
In the well-studied setting where $\mathbf{M}$ has incoherent row and column spans, our algorithms complete $\mathbf{M}$ to high precision from $mr^{2+o(1)}$ observations in $mr^{3 + o(1)}$ time (omitting logarithmic factors in problem parameters), improving upon the prior state-of-the-art [JN15] which used $\approx mr^5$ samples and $\approx mr^7$ time.
no code implementations • 13 Feb 2023 • Sivakanth Gopi, Yin Tat Lee, Daogao Liu, Ruoqi Shen, Kevin Tian
The development of efficient sampling algorithms catering to non-Euclidean geometries has been a challenging endeavor, as discretization techniques which succeed in the Euclidean setting do not readily carry over to more general settings.
no code implementations • 1 Jan 2023 • Yair Carmon, Arun Jambulapati, Yujia Jin, Yin Tat Lee, Daogao Liu, Aaron Sidford, Kevin Tian
We give a parallel algorithm obtaining optimization error $\epsilon_{\text{opt}}$ with $d^{1/3}\epsilon_{\text{opt}}^{-2/3}$ gradient oracle query depth and $d^{1/3}\epsilon_{\text{opt}}^{-2/3} + \epsilon_{\text{opt}}^{-2}$ gradient queries in total, assuming access to a bounded-variance stochastic gradient estimator.
no code implementations • 18 Jul 2022 • Sivakanth Gopi, Yin Tat Lee, Daogao Liu, Ruoqi Shen, Kevin Tian
We propose a new framework for differentially private optimization of convex functions which are Lipschitz in an arbitrary norm $\|\cdot\|$.
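As a schematic of the familiar Euclidean special case that such frameworks generalize, noisy gradient descent perturbs each step with Gaussian noise; the noise scale sigma is left abstract here (in standard analyses it is calibrated to the Lipschitz constant and privacy budget), so this is a sketch of the mechanism's shape rather than a private implementation:

```python
import numpy as np

def noisy_gd(grad_f, x0, eta, sigma, n_iters, rng):
    """Noisy gradient descent, the Euclidean prototype of private convex
    optimization: add isotropic Gaussian noise to every gradient step."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        g = grad_f(x)              # assumed bounded by the Lipschitz constant
        x = x - eta * (g + sigma * rng.normal(size=x.shape))
    return x
```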
no code implementations • 8 Mar 2022 • Jonathan A. Kelner, Jerry Li, Allen Liu, Aaron Sidford, Kevin Tian
We design a new iterative method tailored to the geometry of sparse recovery which is provably robust to our semi-random model.
no code implementations • 9 Feb 2022 • Yujia Jin, Aaron Sidford, Kevin Tian
We generalize our algorithms for minimax and finite sum optimization to solve a natural family of minimax finite sum optimization problems at an accelerated rate, encapsulating both above results up to a logarithmic factor.
no code implementations • NeurIPS 2021 • Arun Jambulapati, Jerry Li, Tselil Schramm, Kevin Tian
For the general case of smooth GLMs (e.g., logistic regression), we show that the robust gradient descent framework of Prasad et al.
no code implementations • 16 Jun 2021 • Ilias Diakonikolas, Daniel M. Kane, Daniel Kongsgaard, Jerry Li, Kevin Tian
We leverage this result, together with additional techniques, to obtain the first almost-linear time algorithms for clustering mixtures of $k$ separated well-behaved distributions, nearly-matching the statistical guarantees of spectral methods.
no code implementations • NeurIPS 2021 • Yin Tat Lee, Ruoqi Shen, Kevin Tian
We give lower bounds on the performance of two of the most popular sampling methods in practice, the Metropolis-adjusted Langevin algorithm (MALA) and multi-step Hamiltonian Monte Carlo (HMC) with a leapfrog integrator, when applied to well-conditioned distributions.
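For concreteness, a minimal NumPy sketch of the MALA transition being analyzed, for a target density proportional to $\exp(-f(x))$ with $f$ and its gradient assumed given:

```python
import numpy as np

def mala_step(x, f, grad_f, h, rng):
    """One Metropolis-adjusted Langevin step targeting exp(-f(x)):
    a Langevin proposal followed by a Metropolis accept/reject."""
    prop = x - h * grad_f(x) + np.sqrt(2.0 * h) * rng.normal(size=x.shape)

    def log_q(y, z):
        # log density (up to constants) of proposing y from z
        return -np.sum((y - z + h * grad_f(z)) ** 2) / (4.0 * h)

    log_accept = -f(prop) + f(x) + log_q(x, prop) - log_q(prop, x)
    return prop if np.log(rng.uniform()) < log_accept else x
```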
no code implementations • NeurIPS 2021 • Ilias Diakonikolas, Daniel M. Kane, Daniel Kongsgaard, Jerry Li, Kevin Tian
Our algorithm runs in time $\widetilde{O}(ndk)$ for all $k$ with $k = O(\sqrt{d})$ or $k = \Omega(d)$, where $n$ is the size of the dataset.
no code implementations • 12 Nov 2020 • Michael B. Cohen, Aaron Sidford, Kevin Tian
We show that standard extragradient methods (i.e., mirror prox and dual extrapolation) recover optimal accelerated rates for first-order minimization of smooth convex functions.
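A minimal sketch of the (unaccelerated) Euclidean extragradient template specialized to minimization, i.e., run on the operator $\nabla f$; the paper's point is that such methods, suitably applied, recover accelerated rates:

```python
import numpy as np

def extragradient(grad_f, x0, eta, n_iters):
    """Euclidean extragradient on the operator grad_f: take an
    extrapolation half-step, then update using the half-step gradient."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        x_half = x - eta * grad_f(x)      # extrapolation step
        x = x - eta * grad_f(x_half)      # update at the extrapolated point
    return x

# sanity check on a smooth convex quadratic, f(x) = ||x||^2 / 2
x = extragradient(lambda x: x, np.ones(5), eta=0.1, n_iters=200)
```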
no code implementations • 7 Oct 2020 • Yin Tat Lee, Ruoqi Shen, Kevin Tian
For composite densities $\exp(-f(x) - g(x))$, where $f$ has condition number $\kappa$ and convex (but possibly non-smooth) $g$ admits an RGO, we obtain a mixing time of $O(\kappa d \log^3\frac{\kappa d}{\epsilon})$, matching the state-of-the-art non-composite bound; no composite samplers with better mixing than general-purpose logconcave samplers were previously known.
no code implementations • 17 Sep 2020 • Yair Carmon, Yujia Jin, Aaron Sidford, Kevin Tian
For linear regression with an elementwise nonnegative $m \times n$ matrix $A$, our guarantees improve on exact gradient methods by a factor of $\sqrt{\mathrm{nnz}(A)/(m+n)}$.
no code implementations • 4 Aug 2020 • Arun Jambulapati, Jerry Li, Christopher Musco, Aaron Sidford, Kevin Tian
In this paper, we revisit the decades-old problem of how to best improve $\mathbf{A}$'s condition number by left or right diagonal rescaling.
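As a baseline, the simple column-norm (Jacobi-style) right rescaling heuristic against which principled methods for this problem are compared; a sketch of the classical heuristic, not the paper's algorithm:

```python
import numpy as np

def column_rescale(A):
    """Right diagonal rescaling baseline: replace A by A @ diag(w) with
    w_j = 1 / ||A[:, j]||, normalizing every column to unit norm."""
    w = 1.0 / np.linalg.norm(A, axis=0)
    return A * w[None, :], w

A = np.array([[1.0, 0.0], [0.0, 1e3]])
B, w = column_rescale(A)
print(np.linalg.cond(A), np.linalg.cond(B))   # 1000.0 -> 1.0
```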
no code implementations • NeurIPS 2020 • Arun Jambulapati, Jerry Li, Kevin Tian
We develop two methods for the following fundamental statistical task: given an $\epsilon$-corrupted set of $n$ samples from a $d$-dimensional sub-Gaussian distribution, return an approximate top eigenvector of the covariance matrix.
no code implementations • 10 Jun 2020 • Ruoqi Shen, Kevin Tian, Yin Tat Lee
We consider sampling from composite densities on $\mathbb{R}^d$ of the form $d\pi(x) \propto \exp(-f(x) - g(x))dx$ for well-conditioned $f$ and convex (but possibly non-smooth) $g$, a family generalizing restrictions to a convex set, through the abstraction of a restricted Gaussian oracle.
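Schematically, the abstraction yields an alternating outer loop of the following shape, with the two per-iteration samplers passed in as black boxes; a structural sketch under that oracle assumption, not the paper's full algorithm or its parameter choices:

```python
import numpy as np

def composite_sampler(rgo_g, gaussian_oracle_f, x0, n_iters):
    """Alternating scheme for exp(-f - g): resample an auxiliary point
    against f, then invoke the restricted Gaussian oracle (RGO) for g
    centered there. Both oracles (and the coupling width eta) are given."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        y = gaussian_oracle_f(x)  # draw y ~ exp(-f(y) - ||y - x||^2 / (2 eta))
        x = rgo_g(y)              # draw x ~ exp(-g(x) - ||x - y||^2 / (2 eta))
    return x
```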
no code implementations • 10 Feb 2020 • Yin Tat Lee, Ruoqi Shen, Kevin Tian
We show that the gradient norm $\|\nabla f(x)\|$ for $x \sim \exp(-f(x))$, where $f$ is strongly convex and smooth, concentrates tightly around its mean.
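A quick numerical illustration in the Gaussian case $f(x) = \|x\|^2/2$, where $\nabla f(x) = x$: the norm has mean roughly $\sqrt{d}$ but dimension-free fluctuations:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 400
x = rng.normal(size=(20_000, d))     # samples from density exp(-||x||^2 / 2)
norms = np.linalg.norm(x, axis=1)    # here ||grad f(x)|| = ||x||
print(norms.mean(), norms.std())     # mean ~ sqrt(d) = 20, std ~ 0.71
```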
no code implementations • NeurIPS 2019 • Arun Jambulapati, Aaron Sidford, Kevin Tian
Optimal transportation, or computing the Wasserstein or "earth mover's" distance between two $n$-dimensional distributions, is a fundamental primitive which arises in many learning and statistical settings.
no code implementations • NeurIPS 2019 • Yair Carmon, Yujia Jin, Aaron Sidford, Kevin Tian
We present a randomized primal-dual algorithm that solves the problem $\min_{x} \max_{y} y^\top A x$ to additive error $\epsilon$ in time $\mathrm{nnz}(A) + \sqrt{\mathrm{nnz}(A)n}/\epsilon$, for a matrix $A$ with $\mathrm{nnz}(A)$ nonzero entries whose larger dimension is $n$.
no code implementations • 3 Jun 2019 • Arun Jambulapati, Aaron Sidford, Kevin Tian
Optimal transportation, or computing the Wasserstein or "earth mover's" distance between two distributions, is a fundamental primitive which arises in many learning and statistical settings.
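For context, a compact NumPy sketch of the entropically regularized Sinkhorn baseline that near-linear-time OT methods like this one are typically compared against (not the paper's algorithm); costs are assumed scaled to $[0, 1]$ so the kernel does not underflow:

```python
import numpy as np

def sinkhorn(C, r, c, eta=50.0, n_iters=500):
    """Entropic OT between marginals r and c with cost matrix C:
    alternately rescale rows and columns of the kernel K = exp(-eta*C)."""
    K = np.exp(-eta * C)
    u = np.ones_like(r)
    for _ in range(n_iters):
        v = c / (K.T @ u)                # match column marginals
        u = r / (K @ v)                  # match row marginals
    P = u[:, None] * K * v[None, :]      # approximate transport plan
    return np.sum(P * C)                 # approximate OT cost
```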
no code implementations • 7 Mar 2019 • Yair Carmon, John C. Duchi, Aaron Sidford, Kevin Tian
We show that a simple randomized sketch of the matrix multiplicative weight (MMW) update enjoys (in expectation) the same regret bounds as MMW, up to a small constant factor.
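For comparison, a direct (unsketched) implementation of the MMW update; the paper's contribution is a randomized sketch with matching expected regret that avoids forming the full matrix exponentials below:

```python
import numpy as np
from scipy.linalg import expm

def mmw_iterates(gains, eta):
    """Matrix multiplicative weights: at each round, play the density
    matrix proportional to exp(eta * sum of past symmetric gain matrices)."""
    d = gains[0].shape[0]
    S = np.zeros((d, d))
    for G in gains:
        E = expm(eta * S)
        yield E / np.trace(E)    # trace-1 PSD iterate for the current round
        S = S + G                # accumulate observed gains
```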
1 code implementation • ICML 2018 • Kevin Tian, Teng Zhang, James Zou
However, in addition to the text data itself, we often have additional covariates associated with individual corpus documents (e.g., the demographics of the author, or the time and venue of publication), and we would like the embedding to naturally capture this information.
no code implementations • ICLR 2018 • Kevin Tian, Teng Zhang, James Zou
In addition to the text data itself, we often have additional covariates associated with individual documents in the corpus (e.g., the demographics of the author, or the time and venue of publication), and we would like the embedding to naturally capture the information in the covariates.
no code implementations • NeurIPS 2017 • Kevin Tian, Weihao Kong, Gregory Valiant
Consider the following estimation problem: there are $n$ entities, each with an unknown parameter $p_i \in [0, 1]$, and we observe $n$ independent random variables, $X_1, \ldots, X_n$, with $X_i \sim \mathrm{Binomial}(t, p_i)$.
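To make the setting concrete, here is the naive per-entity plug-in estimate, whose coarseness for small $t$ is what makes the problem interesting (parameter choices here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, t = 10_000, 5
p = rng.beta(2.0, 5.0, size=n)    # unknown per-entity parameters p_i
X = rng.binomial(t, p)            # one Binomial(t, p_i) observation each
p_hat = X / t                     # naive plug-in estimates
# with t = 5, p_hat takes only 6 distinct values, so its histogram is a
# poor estimate of the population distribution of the p_i
```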