no code implementations • 5 Sep 2024 • Sitan Chen, Jaume de Dios Pont, Jun-Ting Hsieh, Hsin-Yuan Huang, Jane Lange, Jerry Li
Previously, Huang, Chen, and Preskill proved the surprising result that even if $E$ is arbitrary, this task can be solved in time roughly $n^{O(\log(1/\epsilon))}$, where $\epsilon$ is the target prediction error.
no code implementations • 18 Jul 2024 • Sitan Chen, Jerry Li, Allen Liu
We give the first tight sample complexity bounds for shadow tomography and classical shadows in the regime where the target error is below some sufficiently small inverse polynomial in the dimension of the Hilbert space.
no code implementations • 6 Mar 2024 • Arun Jambulapati, Syamantak Kumar, Jerry Li, Shourya Pandey, Ankit Pensia, Kevin Tian
For an alternative well-studied approximation notion we term cPCA (correlation PCA), we tightly characterize the parameter regimes where deflation methods are feasible.
no code implementations • 26 Feb 2024 • Sitan Chen, Jerry Li, Allen Liu
In this work, we study tomography in the natural setting where one can make measurements of $t$ copies at a time.
1 code implementation • 24 Oct 2023 • Marah I Abdin, Suriya Gunasekar, Varun Chandrasekaran, Jerry Li, Mert Yuksekgonul, Rahee Ghosh Peshawaria, Ranjita Naik, Besmira Nushi
Motivated by rising concerns around factual incorrectness and hallucinations of LLMs, we present KITAB, a new dataset for measuring constraint satisfaction abilities of language models.
no code implementations • 7 Aug 2023 • Jonathan A. Kelner, Jerry Li, Allen Liu, Aaron Sidford, Kevin Tian
In the well-studied setting where $\mathbf{M}$ has incoherent row and column spans, our algorithms complete $\mathbf{M}$ to high precision from $mr^{2+o(1)}$ observations in $mr^{3 + o(1)}$ time (omitting logarithmic factors in problem parameters), improving upon the prior state-of-the-art [JN15] which used $\approx mr^5$ samples and $\approx mr^7$ time.
4 code implementations • 4 May 2023 • Reid Pryzant, Dan Iter, Jerry Li, Yin Tat Lee, Chenguang Zhu, Michael Zeng
Large Language Models (LLMs) have shown impressive performance as general-purpose agents, but their abilities remain highly dependent on prompts that are hand-written with onerous trial-and-error effort.
no code implementations • 5 Apr 2023 • Sinho Chewi, Jaume de Dios Pont, Jerry Li, Chen Lu, Shyam Narayanan
Log-concave sampling has witnessed remarkable algorithmic advances in recent years, but the corresponding problem of proving lower bounds for this task has remained elusive, with lower bounds previously known only in dimension one.
1 code implementation • ICCV 2023 • Nabeel Hingun, Chawin Sitawarin, Jerry Li, David Wagner
In this work, we propose the REAP (REalistic Adversarial Patch) benchmark, a digital benchmark that allows the user to evaluate patch attacks on real images, and under real-world conditions.
no code implementations • 11 Nov 2022 • Jerry Luo, Cosmin Paduraru, Octavian Voicu, Yuri Chervonyi, Scott Munns, Jerry Li, Crystal Qian, Praneet Dutta, Jared Quincy Davis, Ningjia Wu, Xingwei Yang, Chu-Ming Chang, Ted Li, Rob Rose, Mingyan Fan, Hootan Nakhost, Tinglin Liu, Brian Kirkman, Frank Altamura, Lee Cline, Patrick Tonker, Joel Gouker, Dave Uden, Warren Buddy Bryan, Jason Law, Deeni Fatiha, Neil Satra, Juliet Rothenberg, Mandeep Waraich, Molly Carlin, Satish Tallapaka, Sims Witherspoon, David Parish, Peter Dolan, Chenyu Zhao, Daniel J. Mankowitz
This paper is a technical overview of DeepMind and Google's recent work on reinforcement learning for controlling commercial cooling systems.
no code implementations • 13 Oct 2022 • Sitan Chen, Jordan Cotler, Hsin-Yuan Huang, Jerry Li
The recent proliferation of NISQ devices has made it imperative to understand their computational power.
no code implementations • 22 Sep 2022 • Sitan Chen, Sinho Chewi, Jerry Li, Yuanzhi Li, Adil Salim, Anru R. Zhang
We provide theoretical convergence guarantees for score-based generative models (SGMs) such as denoising diffusion probabilistic models (DDPMs), which constitute the backbone of large-scale real-world generative models such as DALL$\cdot$E 2.
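For readers unfamiliar with DDPMs, the object being analyzed is the discretized reverse (denoising) process. Below is a minimal ancestral-sampling sketch under standard assumptions (a linear beta schedule and a hypothetical pretrained noise predictor `eps_model`); it illustrates the kind of sampler such guarantees apply to, not the paper's proof technique.

```python
import numpy as np

# Minimal DDPM ancestral sampling sketch (illustrative only; `eps_model(x, t)` is a
# hypothetical pretrained noise-prediction network).
T = 1000
betas = np.linspace(1e-4, 0.02, T)       # standard linear noise schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def sample(eps_model, dim, rng=np.random.default_rng(0)):
    x = rng.standard_normal(dim)         # start the reverse process from pure noise
    for t in reversed(range(T)):
        eps_hat = eps_model(x, t)        # predicted noise (proportional to the score)
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps_hat) / np.sqrt(alphas[t])
        noise = rng.standard_normal(dim) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x
```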
no code implementations • 10 Jun 2022 • Sitan Chen, Brice Huang, Jerry Li, Allen Liu, Mark Sellke
We give an adaptive algorithm that outputs a state which is $\gamma$-close in infidelity to $\rho$ using only $\tilde{O}(d^3/\gamma)$ copies, which is optimal for incoherent measurements.
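For context, the infidelity here is one minus the standard quantum fidelity; restated in the usual form (my notation, not necessarily the paper's):
\[ 1 - F(\rho, \hat{\rho}), \qquad F(\rho, \sigma) = \left(\operatorname{tr}\sqrt{\sqrt{\rho}\,\sigma\,\sqrt{\rho}}\right)^{2}. \]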
no code implementations • 31 May 2022 • Sitan Chen, Jerry Li, Yuanzhi Li
Motivated by the recent empirical successes of deep generative models, we study the computational complexity of the following unsupervised learning problem.
no code implementations • 14 Apr 2022 • Sitan Chen, Brice Huang, Jerry Li, Allen Liu
When $\sigma$ is the maximally mixed state $\frac{1}{d} I_d$, this is known as mixedness testing.
no code implementations • 8 Apr 2022 • Sitan Chen, Jerry Li, Yuanzhi Li, Anru R. Zhang
Our first main result is a polynomial-time algorithm for learning quadratic transformations of Gaussians in a smoothed setting.
no code implementations • 8 Mar 2022 • Jonathan A. Kelner, Jerry Li, Allen Liu, Aaron Sidford, Kevin Tian
We design a new iterative method tailored to the geometry of sparse recovery which is provably robust to our semi-random model.
no code implementations • ICLR 2022 • Sitan Chen, Jerry Li, Yuanzhi Li, Raghu Meka
Arguably the most fundamental question in the theory of generative adversarial networks (GANs) is to understand to what extent GANs can actually learn the underlying distribution.
no code implementations • 31 Dec 2021 • Sung Min Park, Kuo-An Wei, Kai Xiao, Jerry Li, Aleksander Madry
We identify properties of universal adversarial perturbations (UAPs) that distinguish them from standard adversarial perturbations.
no code implementations • 1 Dec 2021 • Jerry Li, Allen Liu
We give the first algorithm which runs in polynomial time, and which almost matches this guarantee.
1 code implementation • 1 Dec 2021 • Hsin-Yuan Huang, Michael Broughton, Jordan Cotler, Sitan Chen, Jerry Li, Masoud Mohseni, Hartmut Neven, Ryan Babbush, Richard Kueng, John Preskill, Jarrod R. McClean
Quantum technology has the potential to revolutionize how we acquire and process experimental data to learn about the physical world.
no code implementations • 10 Nov 2021 • Sitan Chen, Jordan Cotler, Hsin-Yuan Huang, Jerry Li
We study the power of quantum memory for learning properties of quantum systems and dynamics, which is of great importance in physics and chemistry.
no code implementations • 10 Nov 2021 • Sitan Chen, Jordan Cotler, Hsin-Yuan Huang, Jerry Li
We prove that given the ability to make entangled measurements on at most $k$ replicas of an $n$-qubit state $\rho$ simultaneously, there is a property of $\rho$ which requires at least order $2^n$ measurements to learn.
no code implementations • 25 Jun 2021 • Clément L. Canonne, Ayush Jain, Gautam Kamath, Jerry Li
Specifically, we show the sample complexity to be \[\tilde \Theta\left(\frac{\sqrt{n}}{\varepsilon_2^{2}} + \frac{n}{\log n} \cdot \max \left\{\frac{\varepsilon_1}{\varepsilon_2^2},\left(\frac{\varepsilon_1}{\varepsilon_2^2}\right)^{\!\! 2}\right\}\right),\] providing a smooth tradeoff between the two previously known cases.
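As a back-of-the-envelope check of how this bound interpolates, evaluating the displayed formula at the two endpoint regimes recovers the familiar rates (taking $\varepsilon_2$ bounded away from $1$ in the second case):
\[ \varepsilon_1 \to 0:\ \tilde \Theta\!\left(\frac{\sqrt{n}}{\varepsilon_2^{2}}\right) \ \text{(non-tolerant identity testing)}, \qquad \varepsilon_1 = \Theta(\varepsilon_2):\ \tilde \Theta\!\left(\frac{n}{\varepsilon_2^{2}\log n}\right) \ \text{(fully tolerant regime)}. \]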
no code implementations • NeurIPS 2021 • Arun Jambulapati, Jerry Li, Tselil Schramm, Kevin Tian
For the general case of smooth GLMs (e.g., logistic regression), we show that the robust gradient descent framework of Prasad et al.
no code implementations • 16 Jun 2021 • Ilias Diakonikolas, Daniel M. Kane, Daniel Kongsgaard, Jerry Li, Kevin Tian
We leverage this result, together with additional techniques, to obtain the first almost-linear time algorithms for clustering mixtures of $k$ separated well-behaved distributions, nearly-matching the statistical guarantees of spectral methods.
no code implementations • 5 Jun 2021 • Jerry Li, Allen Liu, Ankur Moitra
Given $\textsf{poly}(k/\epsilon)$ samples from a distribution that is $\epsilon$-close in TV distance to a GMM with $k$ components, we can construct a GMM with $\widetilde{O}(k)$ components that approximates the distribution to within $\widetilde{O}(\epsilon)$ in $\textsf{poly}(k/\epsilon)$ time.
no code implementations • 25 Feb 2021 • Sitan Chen, Jerry Li, Ryan O'Donnell
We revisit the basic problem of quantum state certification: given copies of an unknown mixed state $\rho\in\mathbb{C}^{d\times d}$ and the description of a mixed state $\sigma$, decide whether $\sigma = \rho$ or $\|\sigma - \rho\|_{\mathsf{tr}} \ge \epsilon$.
no code implementations • ICLR 2021 • Dan Hendrycks, Collin Burns, Steven Basart, Andrew Critch, Jerry Li, Dawn Song, Jacob Steinhardt
We show how to assess a language model’s knowledge of basic concepts of morality.
no code implementations • 1 Jan 2021 • Sung Min Park, Kuo-An Wei, Kai Yuanqing Xiao, Jerry Li, Aleksander Madry
We study universal adversarial perturbations and demonstrate that the above picture is more nuanced.
no code implementations • ICLR 2021 • Zeyuan Allen-Zhu, Faeze Ebrahimian, Jerry Li, Dan Alistarh
We study adversary-resilient stochastic distributed optimization, in which $m$ machines can independently compute stochastic gradients, and cooperate to jointly optimize over their local objective functions.
1 code implementation • NeurIPS 2020 • Caglar Gulcehre, Ziyu Wang, Alexander Novikov, Thomas Paine, Sergio Gómez, Konrad Zolna, Rishabh Agarwal, Josh S. Merel, Daniel J. Mankowitz, Cosmin Paduraru, Gabriel Dulac-Arnold, Jerry Li, Mohammad Norouzi, Matthew Hoffman, Nicolas Heess, Nando de Freitas
We hope that our suite of benchmarks will increase the reproducibility of experiments and make it possible to study challenging tasks with a limited computational budget, thus making RL research both more systematic and more accessible across the community.
no code implementations • NeurIPS 2021 • Ilias Diakonikolas, Daniel M. Kane, Daniel Kongsgaard, Jerry Li, Kevin Tian
Our algorithm runs in time $\widetilde{O}(ndk)$ for all $k = O(\sqrt{d}) \cup \Omega(d)$, where $n$ is the size of the dataset.
no code implementations • 13 Sep 2020 • Matthew Brennan, Guy Bresler, Samuel B. Hopkins, Jerry Li, Tselil Schramm
Researchers currently use a number of approaches to predict and substantiate information-computation gaps in high-dimensional statistical estimation problems.
2 code implementations • 5 Aug 2020 • Dan Hendrycks, Collin Burns, Steven Basart, Andrew Critch, Jerry Li, Dawn Song, Jacob Steinhardt
We show how to assess a language model's knowledge of basic concepts of morality.
no code implementations • 4 Aug 2020 • Arun Jambulapati, Jerry Li, Christopher Musco, Aaron Sidford, Kevin Tian
In this paper, we revisit the decades-old problem of how to best improve $\mathbf{A}$'s condition number by left or right diagonal rescaling.
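To see the effect diagonal rescaling can have on the condition number, here is a toy numpy check using symmetric Jacobi scaling (dividing row and column $i$ by $\sqrt{A_{ii}}$); this classical heuristic is shown only for illustration, not as the algorithm developed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# A positive definite matrix with a well-conditioned "core" but badly mismatched scales.
B = rng.standard_normal((n, n))
core = B @ B.T + n * np.eye(n)
scales = 10.0 ** rng.uniform(-3, 3, size=n)
A = np.diag(scales) @ core @ np.diag(scales)

def cond(M):
    s = np.linalg.svd(M, compute_uv=False)
    return s[0] / s[-1]

# Symmetric Jacobi rescaling: D^{-1/2} A D^{-1/2} with D = diag(A).
d = 1.0 / np.sqrt(np.diag(A))
A_scaled = np.diag(d) @ A @ np.diag(d)

print(f"condition number before rescaling: {cond(A):.3e}")
print(f"condition number after rescaling : {cond(A_scaled):.3e}")
```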
no code implementations • NeurIPS 2020 • Samuel B. Hopkins, Jerry Li, Fred Zhang
In this paper, we provide a meta-problem and a duality theorem that lead to a new unified view on robust and heavy-tailed mean estimation in high dimensions.
no code implementations • 13 Jul 2020 • Ivan Evtimov, Weidong Cui, Ece Kamar, Emre Kiciman, Tadayoshi Kohno, Jerry Li
Machine learning (ML) models deployed in many safety- and business-critical systems are vulnerable to exploitation through adversarial examples.
2 code implementations • 24 Jun 2020 • Caglar Gulcehre, Ziyu Wang, Alexander Novikov, Tom Le Paine, Sergio Gomez Colmenarejo, Konrad Zolna, Rishabh Agarwal, Josh Merel, Daniel Mankowitz, Cosmin Paduraru, Gabriel Dulac-Arnold, Jerry Li, Mohammad Norouzi, Matt Hoffman, Ofir Nachum, George Tucker, Nicolas Heess, Nando de Freitas
We hope that our suite of benchmarks will increase the reproducibility of experiments and make it possible to study challenging tasks with a limited computational budget, thus making RL research both more systematic and more accessible across the community.
no code implementations • NeurIPS 2020 • Jerry Li, Guanghao Ye
Previous work of Cheng et al. demonstrated an algorithm that, given $N = \Omega (d^2 / \varepsilon^2)$ samples, achieved a near-optimal error of $O(\varepsilon \log 1 / \varepsilon)$; moreover, their algorithm ran in time $\widetilde{O}(T(N, d) \log \kappa / \mathrm{poly} (\varepsilon))$, where $T(N, d)$ is the time it takes to multiply a $d \times N$ matrix by its transpose, and $\kappa$ is the condition number of $\Sigma$.
no code implementations • NeurIPS 2020 • Arun Jambulapati, Jerry Li, Kevin Tian
We develop two methods for the following fundamental statistical task: given an $\epsilon$-corrupted set of $n$ samples from a $d$-dimensional sub-Gaussian distribution, return an approximate top eigenvector of the covariance matrix.
1 code implementation • 24 Mar 2020 • Gabriel Dulac-Arnold, Nir Levine, Daniel J. Mankowitz, Jerry Li, Cosmin Paduraru, Sven Gowal, Todd Hester
We believe that an approach that addresses our set of proposed challenges would be readily deployable in a large number of real world problems.
1 code implementation • 24 Mar 2020 • Ilias Diakonikolas, Jerry Li, Anastasia Voloshinov
We study the fundamental problem of fixed design {\em multidimensional segmented regression}: Given noisy samples from a function $f$, promised to be piecewise linear on an unknown set of $k$ rectangles, we want to recover $f$ up to a desired accuracy in mean-squared error.
1 code implementation • NeurIPS 2020 • Sitan Chen, Jerry Li, Ankur Moitra
We revisit the problem of learning from untrusted batches introduced by Qiao and Valiant [QV17].
1 code implementation • ICML 2020 • Greg Yang, Tony Duan, J. Edward Hu, Hadi Salman, Ilya Razenshteyn, Jerry Li
Randomized smoothing is the current state-of-the-art defense with provable robustness against $\ell_2$ adversarial attacks.
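As background for the randomized-smoothing entries in this list: the smoothed classifier takes the majority vote of the base classifier under Gaussian input noise, and the standard certificate (Cohen, Rosenfeld, and Kolter, 2019) gives an $\ell_2$ radius of roughly $\sigma\,\Phi^{-1}(p_{\text{top}})$. A minimal Monte Carlo sketch, with `base_classifier` a hypothetical placeholder and the confidence-interval/abstention bookkeeping omitted:

```python
import numpy as np
from scipy.stats import norm

def smoothed_predict_and_radius(base_classifier, x, sigma, n_samples=1000,
                                rng=np.random.default_rng(0)):
    """Predict with the smoothed classifier g(x) = argmax_c P[f(x + N(0, sigma^2 I)) = c]
    and return the point-estimate certified L2 radius sigma * Phi^{-1}(p_top)."""
    counts = {}
    for _ in range(n_samples):
        label = base_classifier(x + sigma * rng.standard_normal(x.shape))
        counts[label] = counts.get(label, 0) + 1
    top_class, top_count = max(counts.items(), key=lambda kv: kv[1])
    p_top = min(top_count / n_samples, 1 - 1e-6)   # clamp to avoid an infinite radius
    radius = sigma * norm.ppf(p_top) if p_top > 0.5 else 0.0
    return top_class, radius
```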
no code implementations • 16 Dec 2019 • Sitan Chen, Jerry Li, Zhao Song
In this paper, we give the first algorithm for learning an MLR that runs in time which is sub-exponential in $k$.
no code implementations • 5 Nov 2019 • Sitan Chen, Jerry Li, Ankur Moitra
When $k = 1$ this is the standard robust univariate density estimation setting and it is well-understood that $\Omega (\epsilon)$ error is unavoidable.
1 code implementation • NeurIPS 2019 • Yihe Dong, Samuel B. Hopkins, Jerry Li
In robust mean estimation the goal is to estimate the mean $\mu$ of a distribution on $\mathbb{R}^d$ given $n$ independent samples, an $\varepsilon$-fraction of which have been corrupted by a malicious adversary.
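A toy illustration of this contamination model, and of why the naive empirical mean fails, is below; the coordinate-wise median shown is only a simple baseline for comparison, not the estimator developed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, eps = 200, 10_000, 0.1
mu = np.zeros(d)

# Inliers drawn from N(mu, I); an eps-fraction replaced by a (non-worst-case) cluster far away.
X = rng.standard_normal((n, d)) + mu
X[: int(eps * n)] = 10.0

naive_mean = X.mean(axis=0)
coord_median = np.median(X, axis=0)

print("error of empirical mean       :", np.linalg.norm(naive_mean - mu))
print("error of coordinatewise median:", np.linalg.norm(coord_median - mu))
```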
3 code implementations • NeurIPS 2019 • Hadi Salman, Greg Yang, Jerry Li, Pengchuan Zhang, Huan Zhang, Ilya Razenshteyn, Sebastien Bubeck
In this paper, we employ adversarial training to improve the performance of randomized smoothing.
no code implementations • 14 May 2019 • Yonina C. Eldar, Jerry Li, Cameron Musco, Christopher Musco
In addition to results that hold for any Toeplitz $T$, we further study the important setting when $T$ is close to low-rank, which is often the case in practice.
1 code implementation • NeurIPS 2018 • Brandon Tran, Jerry Li, Aleksander Madry
In this paper, we identify a new property of all known backdoor attacks, which we call \emph{spectral signatures}.
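At a high level, the detection step scores each training example by its squared correlation with the top singular vector of the centered learned representations and removes the highest-scoring examples. A minimal sketch of that scoring rule (`reps` is a hypothetical array of per-example representations for a single label):

```python
import numpy as np

def spectral_signature_scores(reps):
    """reps: (n_examples, feature_dim) learned representations for one label.
    Returns one outlier score per example: the squared projection onto the top
    right singular vector of the centered representation matrix."""
    centered = reps - reps.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return (centered @ vt[0]) ** 2

# Usage sketch: flag the examples with the largest scores in each class
# (e.g. slightly more than the suspected poisoning fraction) and retrain without them.
```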
no code implementations • 26 Aug 2018 • Jerry Li
We present a framework for translating unlabeled images from one domain into analog images in another domain.
no code implementations • 1 May 2018 • Gautam Kamath, Jerry Li, Vikrant Singhal, Jonathan Ullman
We present novel, computationally efficient, and differentially private algorithms for two fundamental high-dimensional learning problems: learning a multivariate Gaussian and learning a product distribution over the Boolean hypercube in total variation distance.
no code implementations • NeurIPS 2018 • Dan Alistarh, Zeyuan Allen-Zhu, Jerry Li
This paper studies the problem of distributed stochastic optimization in an adversarial setting where, out of the $m$ machines which allegedly compute stochastic gradients every iteration, an $\alpha$-fraction are Byzantine, and can behave arbitrarily and adversarially.
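For intuition about what resilience means here, the sketch below replaces naive gradient averaging with a coordinate-wise median; this is a standard robust-aggregation baseline used purely for illustration, not the aggregation rule analyzed in this paper.

```python
import numpy as np

def robust_aggregate(gradients):
    """gradients: list of m gradient vectors, an alpha-fraction of which may be arbitrary
    (Byzantine). Coordinate-wise median is a simple robust alternative to the mean."""
    return np.median(np.stack(gradients), axis=0)

def sgd_step(w, gradients, lr=0.1):
    return w - lr * robust_aggregate(gradients)
```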
1 code implementation • 7 Mar 2018 • Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Jacob Steinhardt, Alistair Stewart
In high dimensions, most machine learning methods are brittle to even a small fraction of structured outliers.
no code implementations • 23 Feb 2018 • Ilias Diakonikolas, Jerry Li, Ludwig Schmidt
We give an algorithm for this learning problem that uses $n = \tilde{O}_d(k/\epsilon^2)$ samples and runs in time $\tilde{O}_d(n)$.
no code implementations • ICLR 2018 • Jerry Li, Aleksander Madry, John Peebles, Ludwig Schmidt
This suggests that such usage of the first-order approximation of the discriminator, which is a de facto standard in all existing GAN dynamics, might be one of the factors that make GAN training so challenging in practice.
no code implementations • NeurIPS 2017 • Ilias Diakonikolas, Elena Grigorescu, Jerry Li, Abhiram Natarajan, Krzysztof Onak, Ludwig Schmidt
For the case of structured distributions, such as k-histograms and monotone distributions, we design distributed learning algorithms that achieve significantly better communication guarantees than the naive ones, and obtain tight upper and lower bounds in several regimes.
no code implementations • ICML 2017 • Hantian Zhang, Jerry Li, Kaan Kara, Dan Alistarh, Ji Liu, Ce Zhang
We examine training at reduced precision, from both a theoretical and a practical perspective, and ask: is it possible to train models end-to-end at low precision with provable guarantees?
no code implementations • ICML 2018 • Jerry Li, Aleksander Madry, John Peebles, Ludwig Schmidt
While Generative Adversarial Networks (GANs) have demonstrated promising performance on multiple vision tasks, their learning dynamics are not yet well understood, both in theory and in practice.
1 code implementation • 13 Jun 2017 • Dan Alistarh, Justin Kopinsky, Jerry Li, Giorgi Nadiradze
We answer this question, showing that this strategy provides surprisingly strong guarantees: although the single-choice process, in which we always insert into and remove from a single randomly chosen queue, has cost that degrades without bound as the number of steps increases, in the two-choice process the expected rank of a removed element is $O(n)$ while the expected worst-case cost is $O(n \log n)$.
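The process is easy to simulate directly; the sketch below inserts elements into a random queue and removes from the better of two randomly sampled queues, recording the rank of each removed element (the number of smaller elements still present). This is an illustrative simulation of the two-choice process, not the paper's analysis.

```python
import heapq
import random

def simulate_two_choice(n_queues=16, steps=50_000, seed=0):
    """Insert into a uniformly random queue; to remove, sample two queues and pop the
    smaller minimum. Returns the average rank of removed elements."""
    rng = random.Random(seed)
    queues = [[] for _ in range(n_queues)]
    next_priority, ranks = 0, []
    for _ in range(steps):
        if rng.random() < 0.5 or all(len(q) == 0 for q in queues):
            heapq.heappush(queues[rng.randrange(n_queues)], next_priority)
            next_priority += 1
        else:
            picks = [queues[rng.randrange(n_queues)] for _ in range(2)]
            nonempty = [q for q in picks if q]
            if not nonempty:
                continue                      # both sampled queues were empty
            removed = heapq.heappop(min(nonempty, key=lambda q: q[0]))
            # Rank = number of strictly smaller elements still in the system.
            ranks.append(sum(1 for q in queues for x in q if x < removed))
    return sum(ranks) / max(1, len(ranks))

print("average rank of removed elements:", simulate_two_choice())
```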
no code implementations • 12 Apr 2017 • Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Ankur Moitra, Alistair Stewart
We give robust estimators that achieve estimation error $O(\varepsilon)$ in the total variation distance, which is optimal up to a universal constant that is independent of the dimension.
2 code implementations • ICML 2017 • Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Ankur Moitra, Alistair Stewart
Robust estimation is much more challenging in high dimensions than it is in one dimension: Most techniques either lead to intractable optimization problems or estimators that can tolerate only a tiny fraction of errors.
no code implementations • 20 Feb 2017 • Jerry Li
In this paper we initiate the study of whether or not sparse estimation tasks can be performed efficiently in high dimensions, in the robust setting where an $\epsilon$-fraction of samples are corrupted adversarially.
1 code implementation • 16 Nov 2016 • Hantian Zhang, Jerry Li, Kaan Kara, Dan Alistarh, Ji Liu, Ce Zhang
When applied to linear models together with double sampling, we save up to another 1.7x in data movement compared with uniform quantization.
2 code implementations • NeurIPS 2017 • Dan Alistarh, Demjan Grubic, Jerry Li, Ryota Tomioka, Milan Vojnovic
In this paper, we propose Quantized SGD (QSGD), a family of compression schemes which allow the compression of gradient updates at each node, while guaranteeing convergence under standard assumptions.
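The core QSGD quantizer is easy to state: normalize each coordinate by the gradient's Euclidean norm, then stochastically round its magnitude to one of $s$ uniform levels so that the quantized vector is an unbiased estimate of the original. A minimal sketch of that scheme (the lossless encoding of the resulting low-precision values is omitted):

```python
import numpy as np

def qsgd_quantize(v, s=4, rng=np.random.default_rng(0)):
    """Unbiased stochastic quantization: Q(v_i) = ||v||_2 * sign(v_i) * xi_i, where
    xi_i in {0, 1/s, ..., 1} rounds |v_i| / ||v||_2 stochastically so that E[Q(v)] = v."""
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return np.zeros_like(v)
    ratio = np.abs(v) / norm * s                 # in [0, s]
    lower = np.floor(ratio)
    xi = (lower + (rng.random(v.shape) < ratio - lower)) / s
    return norm * np.sign(v) * xi
```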
no code implementations • 14 Jul 2016 • Jayadev Acharya, Ilias Diakonikolas, Jerry Li, Ludwig Schmidt
We study the fixed design segmented regression problem: Given noisy samples from a piecewise linear function $f$, we want to recover $f$ up to a desired accuracy in mean-squared error.
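For intuition, the one-dimensional version of the problem has a simple exact baseline: dynamic programming over breakpoints, where each contiguous segment is fit by ordinary least squares. The sketch below is that slow baseline (roughly cubic time), shown only to make the problem concrete; it is not the fast algorithm from the paper.

```python
import numpy as np

def segment_cost(x, y, i, j):
    """Least-squares error of fitting a single line to points i..j-1 (x assumed sorted)."""
    A = np.column_stack([x[i:j], np.ones(j - i)])
    coef, *_ = np.linalg.lstsq(A, y[i:j], rcond=None)
    return float(np.sum((A @ coef - y[i:j]) ** 2))

def piecewise_linear_fit(x, y, k):
    """Best partition of the (sorted) design points into k contiguous segments via DP."""
    n = len(x)
    cost = [[segment_cost(x, y, i, j) if j > i else 0.0 for j in range(n + 1)]
            for i in range(n)]
    INF = float("inf")
    dp = [[INF] * (n + 1) for _ in range(k + 1)]
    prev = [[0] * (n + 1) for _ in range(k + 1)]
    dp[0][0] = 0.0
    for seg in range(1, k + 1):
        for j in range(1, n + 1):
            for i in range(j):
                c = dp[seg - 1][i] + cost[i][j]
                if c < dp[seg][j]:
                    dp[seg][j], prev[seg][j] = c, i
    # Recover the segment boundaries.
    segments, j = [], n
    for seg in range(k, 0, -1):
        i = prev[seg][j]
        segments.append((i, j))
        j = i
    return dp[k][n], segments[::-1]
```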
2 code implementations • 21 Apr 2016 • Ilias Diakonikolas, Gautam Kamath, Daniel Kane, Jerry Li, Ankur Moitra, Alistair Stewart
We study high-dimensional distribution learning in an agnostic setting where an adversary is allowed to arbitrarily corrupt an $\varepsilon$-fraction of the samples.
no code implementations • 3 Jun 2015 • Jerry Li, Ludwig Schmidt
One notion of learning a GMM is proper learning: here, the goal is to find a mixture of $k$ Gaussians $\mathcal{M}$ that is close to the density $f$ of the unknown distribution from which we draw samples.
no code implementations • 1 Jun 2015 • Jayadev Acharya, Ilias Diakonikolas, Jerry Li, Ludwig Schmidt
Let $f$ be the density function of an arbitrary univariate distribution, and suppose that $f$ is $\mathrm{OPT}$-close in $L_1$-distance to an unknown piecewise polynomial function with $t$ interval pieces and degree $d$.
no code implementations • 26 Sep 2013 • Paul Beame, Jerry Li, Sudeepa Roy, Dan Suciu
The best current methods for exactly computing the number of satisfying assignments, or the satisfying probability, of Boolean formulas can be seen, either directly or indirectly, as building 'decision-DNNF' (decision decomposable negation normal form) representations of the input Boolean formulas.