no code implementations • 19 Sep 2024 • Aayush Karan, Kulin Shah, Sitan Chen, Yonina C. Eldar
In recent years, algorithm unrolling has emerged as deep learning's answer to an age-old question in statistical inference: design a neural network whose layers can in principle simulate iterations of inference algorithms, and train it on data generated by the unknown prior.
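For intuition, here is a minimal sketch of the unrolling idea (not the specific architecture studied in this paper): each layer of the network mimics one iteration of a classical iterative inference algorithm, here ISTA for sparse linear inverse problems, with per-layer step sizes and thresholds treated as parameters that would be trained on data from the unknown prior.

```python
import numpy as np

def soft_threshold(x, lam):
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def unrolled_ista_forward(y, A, step_sizes, thresholds):
    """Forward pass through L 'layers', each simulating one ISTA iteration.
    In an unrolled network, step_sizes and thresholds (or richer per-layer
    parameters) are learned from data rather than fixed by hand."""
    x = np.zeros(A.shape[1])
    for eta, lam in zip(step_sizes, thresholds):
        x = soft_threshold(x - eta * A.T @ (A @ x - y), lam)  # one ISTA step
    return x

# Illustrative usage with hand-picked (untrained) per-layer parameters.
rng = np.random.default_rng(0)
A = rng.normal(size=(30, 100)) / np.sqrt(30)
x_true = np.zeros(100); x_true[:5] = 1.0
y = A @ x_true
x_hat = unrolled_ista_forward(y, A, step_sizes=[0.9] * 10, thresholds=[0.05] * 10)
```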
no code implementations • 19 Sep 2024 • Muthu Chidambaram, Khashayar Gatmiry, Sitan Chen, Holden Lee, Jianfeng Lu
The use of guidance in diffusion models was originally motivated by the premise that the guidance-modified score is that of the data distribution tilted by a conditional likelihood raised to some power.
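For reference, that premise can be written out explicitly (this states the premise itself, not the paper's findings): tilting the data density by a conditional likelihood raised to a power $w$ and differentiating the log-density gives the familiar guidance combination of scores,

$$
p_w(x \mid c) \;\propto\; p(x)\, p(c \mid x)^{w}
\quad\Longrightarrow\quad
\nabla_x \log p_w(x \mid c) = (1 - w)\,\nabla_x \log p(x) + w\,\nabla_x \log p(x \mid c),
$$

where the last equality uses Bayes' rule and is the form typically plugged into guided samplers.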
no code implementations • 5 Sep 2024 • Sitan Chen, Jaume de Dios Pont, Jun-Ting Hsieh, Hsin-Yuan Huang, Jane Lange, Jerry Li
Previously, Huang, Chen, and Preskill proved a surprising result that even if $E$ is arbitrary, this task can be solved in time roughly $n^{O(\log(1/\epsilon))}$, where $\epsilon$ is the target prediction error.
no code implementations • 13 Aug 2024 • Sitan Chen, Weiyuan Gong, Qi Ye, Zhihan Zhang
We study the task of agnostic tomography: given copies of an unknown $n$-qubit state $\rho$ which has fidelity $\tau$ with some state in a given class $C$, find a state which has fidelity $\ge \tau - \epsilon$ with $\rho$.
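As a small numerical illustration of the objective, assuming for simplicity that the class $C$ consists of pure states, in which case the fidelity reduces to $\langle\psi|\rho|\psi\rangle$:

```python
import numpy as np

def pure_state_fidelity(rho, psi):
    # Fidelity of a density matrix rho with a pure state |psi>.
    return float(np.real(np.conj(psi) @ rho @ psi))

# Example: rho is a noisy version of |0><0| on one qubit; C = {|0>, |1>}.
rho = 0.9 * np.outer([1.0, 0.0], [1.0, 0.0]) + 0.1 * np.eye(2) / 2
candidates = {"|0>": np.array([1.0, 0.0]), "|1>": np.array([0.0, 1.0])}
fidelities = {name: pure_state_fidelity(rho, psi) for name, psi in candidates.items()}
# Agnostic tomography asks for a state whose fidelity is within epsilon of the best in C.
```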
no code implementations • 18 Jul 2024 • Sitan Chen, Jerry Li, Allen Liu
We give the first tight sample complexity bounds for shadow tomography and classical shadows in the regime where the target error is below some sufficiently small inverse polynomial in the dimension of the Hilbert space.
no code implementations • 3 Jun 2024 • Shivam Gupta, Linda Cai, Sitan Chen
As a byproduct of our methods, for the well-studied problem of log-concave sampling in total variation distance, we give an algorithm and simple analysis achieving dimension dependence $\widetilde O(d^{5/12})$ compared to $\widetilde O(\sqrt{d})$ from prior work.
no code implementations • 29 Apr 2024 • Sitan Chen, Vasilis Kontonis, Kulin Shah
Prior works for this problem either (i) required exponential runtime in the dimension $d$, (ii) placed strong assumptions on the instance (e.g., spherical covariances or clusterability), or (iii) had doubly exponential dependence on the number of components $k$.
no code implementations • 3 Mar 2024 • Marvin Li, Sitan Chen
Additionally, preliminary experiments on Stable Diffusion suggest critical windows may serve as a useful tool for diagnosing fairness and privacy violations in real-world diffusion models.
no code implementations • 26 Feb 2024 • Sitan Chen, Jerry Li, Allen Liu
In this work, we study tomography in the natural setting where one can make measurements of $t$ copies at a time.
no code implementations • 6 Feb 2024 • Sitan Chen, Yuanzhi Li
In this work, we initiate the study of provably learning a multi-head attention layer from random examples and give the first nontrivial upper and lower bounds for this problem. Provided $\{\mathbf{W}_i, \mathbf{\Theta}_i\}$ satisfy certain non-degeneracy conditions, we give a $(dk)^{O(m^3)}$-time algorithm that learns $F$ to small error given random labeled examples drawn uniformly from $\{\pm 1\}^{k\times d}$.
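A hedged sketch of the function class in question, using the standard parameterization suggested by the notation $\{\mathbf{W}_i, \mathbf{\Theta}_i\}$ (the paper's exact normalization may differ): $F(\mathbf{X}) = \sum_i \mathrm{softmax}(\mathbf{X}\mathbf{\Theta}_i\mathbf{X}^\top)\,\mathbf{X}\mathbf{W}_i$ on inputs $\mathbf{X} \in \{\pm 1\}^{k\times d}$.

```python
import numpy as np

def row_softmax(A):
    A = A - A.max(axis=1, keepdims=True)   # numerical stability
    E = np.exp(A)
    return E / E.sum(axis=1, keepdims=True)

def multi_head_attention(X, Thetas, Ws):
    """X: (k, d); Thetas[i]: (d, d); Ws[i]: (d, d_out)."""
    return sum(row_softmax(X @ Theta @ X.T) @ X @ W for Theta, W in zip(Thetas, Ws))

# A random labeled example drawn uniformly from {+-1}^{k x d}, as in the learning setup.
rng = np.random.default_rng(0)
k, d, m = 6, 8, 2
X = rng.choice([-1.0, 1.0], size=(k, d))
Thetas = [rng.normal(size=(d, d)) / d for _ in range(m)]
Ws = [rng.normal(size=(d, d)) for _ in range(m)]
Y = multi_head_attention(X, Thetas, Ws)   # the label observed by the learner
```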
no code implementations • 25 Sep 2023 • Sitan Chen, Weiyuan Gong
Prior work (Chen et al., 2022) proved no-go theorems for this task in the practical regime where one has a limited amount of quantum memory, e.g., any protocol with $\le 0.99n$ ancilla qubits of quantum memory must make exponentially many measurements, provided it is non-concatenating.
no code implementations • 24 Jul 2023 • Sitan Chen, Shyam Narayanan
We revisit the well-studied problem of learning a linear combination of $k$ ReLU activations given labeled examples drawn from the standard $d$-dimensional Gaussian measure.
no code implementations • 20 Apr 2023 • Sitan Chen, Zehao Dou, Surbhi Goel, Adam R. Klivans, Raghu Meka
We consider the well-studied problem of learning a linear combination of $k$ ReLU activations with respect to a Gaussian distribution on inputs in $d$ dimensions.
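A minimal sketch of the data model in the two entries above: examples $(x, y)$ with $x$ drawn from a standard Gaussian and $y$ a linear combination of $k$ ReLU activations (noise conventions differ between the two papers; none is added here).

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sample_relu_combination(n, d, k, rng):
    a = rng.normal(size=k)                             # combination coefficients
    W = rng.normal(size=(k, d))
    W /= np.linalg.norm(W, axis=1, keepdims=True)      # unit-norm hidden directions
    X = rng.normal(size=(n, d))                        # x ~ N(0, I_d)
    y = relu(X @ W.T) @ a                              # y = sum_i a_i ReLU(<w_i, x>)
    return X, y, (a, W)

rng = np.random.default_rng(0)
X, y, params = sample_relu_combination(n=1000, d=20, k=3, rng=rng)
```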
no code implementations • 6 Mar 2023 • Sitan Chen, Giannis Daras, Alexandros G. Dimakis
We develop a framework for non-asymptotic analysis of deterministic samplers used for diffusion generative modeling.
1 code implementation • 26 Oct 2022 • Hsin-Yuan Huang, Sitan Chen, John Preskill
We present an efficient machine learning (ML) algorithm for predicting any unknown quantum process $\mathcal{E}$ over $n$ qubits.
no code implementations • 13 Oct 2022 • Sitan Chen, Jordan Cotler, Hsin-Yuan Huang, Jerry Li
The recent proliferation of NISQ devices has made it imperative to understand their computational power.
no code implementations • 22 Sep 2022 • Sitan Chen, Sinho Chewi, Jerry Li, Yuanzhi Li, Adil Salim, Anru R. Zhang
We provide theoretical convergence guarantees for score-based generative models (SGMs) such as denoising diffusion probabilistic models (DDPMs), which constitute the backbone of large-scale real-world generative models such as DALL$\cdot$E 2.
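For concreteness, a heavily simplified sketch of the kind of sampler such guarantees concern: a discretization of the time-reversed Ornstein-Uhlenbeck process driven by a score estimate. The exact score of a standard Gaussian data distribution is used as a stand-in so the snippet runs end to end; this is not the paper's analysis or algorithm.

```python
import numpy as np

def score(x, t):
    # Stand-in score estimate: exact when the data distribution is N(0, I),
    # in which case p_t = N(0, I) for all t and grad log p_t(x) = -x.
    return -x

def reverse_sampler(d, T=5.0, n_steps=500, rng=None):
    rng = rng or np.random.default_rng(0)
    h = T / n_steps
    y = rng.normal(size=d)                    # initialize at the stationary N(0, I)
    for k in range(n_steps):
        t = T - k * h                         # time remaining in the forward process
        drift = y + 2.0 * score(y, t)         # reverse drift for dX = -X dt + sqrt(2) dB
        y = y + h * drift + np.sqrt(2.0 * h) * rng.normal(size=d)
    return y

sample = reverse_sampler(d=10)
```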
no code implementations • 10 Jun 2022 • Sitan Chen, Brice Huang, Jerry Li, Allen Liu, Mark Sellke
We give an adaptive algorithm that outputs a state which is $\gamma$-close in infidelity to $\rho$ using only $\tilde{O}(d^3/\gamma)$ copies, which is optimal for incoherent measurements.
no code implementations • 31 May 2022 • Sitan Chen, Jerry Li, Yuanzhi Li
Motivated by the recent empirical successes of deep generative models, we study the computational complexity of the following unsupervised learning problem.
no code implementations • 14 Apr 2022 • Sitan Chen, Brice Huang, Jerry Li, Allen Liu
When $\sigma$ is the maximally mixed state $\frac{1}{d} I_d$, this is known as mixedness testing.
no code implementations • 8 Apr 2022 • Sitan Chen, Jerry Li, Yuanzhi Li, Anru R. Zhang
Our first main result is a polynomial-time algorithm for learning quadratic transformations of Gaussians in a smoothed setting.
no code implementations • 10 Feb 2022 • Sitan Chen, Aravind Gollakota, Adam R. Klivans, Raghu Meka
We give superpolynomial statistical query (SQ) lower bounds for learning two-hidden-layer ReLU networks with respect to Gaussian inputs in the standard (noise-free) model.
no code implementations • ICLR 2022 • Sitan Chen, Jerry Li, Yuanzhi Li, Raghu Meka
Arguably the most fundamental question in the theory of generative adversarial networks (GANs) is to understand to what extent GANs can actually learn the underlying distribution.
no code implementations • NeurIPS 2021 • Sitan Chen, Adam Klivans, Raghu Meka
While the problem of PAC learning neural networks from samples has received considerable attention in recent years, in certain settings like model extraction attacks, it is reasonable to imagine having more than just the ability to observe random labeled examples.
1 code implementation • 1 Dec 2021 • Hsin-Yuan Huang, Michael Broughton, Jordan Cotler, Sitan Chen, Jerry Li, Masoud Mohseni, Hartmut Neven, Ryan Babbush, Richard Kueng, John Preskill, Jarrod R. McClean
Quantum technology has the potential to revolutionize how we acquire and process experimental data to learn about the physical world.
no code implementations • 11 Nov 2021 • Sitan Chen, Frederic Koehler, Ankur Moitra, Morris Yau
In a pioneering work, Schick and Mitter gave provable guarantees when the measurement noise is a known infinitesimal perturbation of a Gaussian and raised the important question of whether one can get similar guarantees for large and unknown perturbations.
no code implementations • 10 Nov 2021 • Sitan Chen, Jordan Cotler, Hsin-Yuan Huang, Jerry Li
We prove that given the ability to make entangled measurements on at most $k$ replicas of an $n$-qubit state $\rho$ simultaneously, there is a property of $\rho$ which requires at least order $2^n$ measurements to learn.
no code implementations • 10 Nov 2021 • Sitan Chen, Jordan Cotler, Hsin-Yuan Huang, Jerry Li
We study the power of quantum memory for learning properties of quantum systems and dynamics, which is of great importance in physics and chemistry.
no code implementations • 8 Nov 2021 • Sitan Chen, Adam R. Klivans, Raghu Meka
In this work, we give the first polynomial-time algorithm for learning arbitrary one-hidden-layer neural networks given black-box access to the network.
no code implementations • 25 Feb 2021 • Sitan Chen, Jerry Li, Ryan O'Donnell
We revisit the basic problem of quantum state certification: given copies of unknown mixed state $\rho\in\mathbb{C}^{d\times d}$ and the description of a mixed state $\sigma$, decide whether $\sigma = \rho$ or $\|\sigma - \rho\|_{\mathsf{tr}} \ge \epsilon$.
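A small numerical illustration of the distance being tested (note that some references include an extra factor of $1/2$ in the trace distance):

```python
import numpy as np

def trace_norm(A):
    # For Hermitian A, the trace norm is the sum of the absolute eigenvalues.
    return float(np.sum(np.abs(np.linalg.eigvalsh(A))))

# Example on one qubit: sigma is the maximally mixed state, rho is slightly polarized.
sigma = np.eye(2) / 2
rho = np.array([[0.6, 0.0], [0.0, 0.4]])
dist = trace_norm(sigma - rho)   # = 0.2 here; certification compares this to epsilon
```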
no code implementations • 2 Feb 2021 • Sitan Chen, Zhao Song, Runzhou Tao, Ruizhe Zhang
As this problem is hard in the worst case, we study a natural average-case variant that arises in the context of these reconstruction attacks: $\mathbf{M} = \mathbf{W}\mathbf{W}^{\top}$ for $\mathbf{W}$ a random Boolean matrix with $k$-sparse rows, and the goal is to recover $\mathbf{W}$ up to column permutation.
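A minimal sketch of that average-case instance, with the product taken over the integers (the intended arithmetic is an assumption of this sketch):

```python
import numpy as np

def sample_instance(n, m, k, rng):
    W = np.zeros((n, m), dtype=int)
    for i in range(n):
        support = rng.choice(m, size=k, replace=False)   # a k-sparse Boolean row
        W[i, support] = 1
    return W, W @ W.T                                    # observation M = W W^T

rng = np.random.default_rng(0)
W, M = sample_instance(n=50, m=100, k=3, rng=rng)
# The reconstruction task: recover W from M up to a permutation of its columns.
```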
no code implementations • ICLR 2021 • Sitan Chen, Xiaoxiao Li, Zhao Song, Danyang Zhuo
In this work, we examine the security of InstaHide, a scheme recently proposed by [Huang, Song, Li and Arora, ICML'20] for preserving the security of private datasets in the context of distributed learning.
no code implementations • NeurIPS 2020 • Sitan Chen, Frederic Koehler, Ankur Moitra, Morris Yau
In this paper, we revisit the problem of distribution-independently learning halfspaces under Massart noise with rate $\eta$.
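A minimal sketch of the Massart noise model referenced here: the clean label $\mathrm{sign}(\langle w^*, x\rangle)$ is flipped independently with an instance-dependent probability $\eta(x) \le \eta$. The particular flip probabilities below are one arbitrary, illustrative choice; in the model they may be adversarial.

```python
import numpy as np

def sample_massart_halfspace(n, d, eta, rng):
    w_star = rng.normal(size=d)
    w_star /= np.linalg.norm(w_star)
    X = rng.normal(size=(n, d))                  # any distribution is allowed; Gaussian here
    clean = np.sign(X @ w_star)
    flip_prob = eta * np.abs(np.sin(X[:, 0]))    # some eta(x) <= eta
    flips = rng.random(n) < flip_prob
    y = np.where(flips, -clean, clean)
    return X, y, w_star

rng = np.random.default_rng(0)
X, y, w_star = sample_massart_halfspace(n=2000, d=10, eta=0.2, rng=rng)
```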
no code implementations • 23 Nov 2020 • Sitan Chen, Xiaoxiao Li, Zhao Song, Danyang Zhuo
In this work, we examine the security of InstaHide, a scheme recently proposed by [Huang, Song, Li and Arora, ICML'20] for preserving the security of private datasets in the context of distributed learning.
no code implementations • 8 Oct 2020 • Sitan Chen, Frederic Koehler, Ankur Moitra, Morris Yau
Our approach is based on a novel alternating minimization scheme that interleaves ordinary least-squares with a simple convex program that finds the optimal reweighting of the distribution under a spectral constraint.
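A heavily simplified sketch of that alternating structure: weighted least squares interleaved with a reweighting step that downweights points with large residuals. The paper's reweighting is a convex program under a spectral constraint; the residual-truncation rule below is only a runnable stand-in.

```python
import numpy as np

def alternating_minimization(X, y, n_iters=10, keep_frac=0.9):
    n = len(y)
    w = np.ones(n)
    for _ in range(n_iters):
        # (i) weighted least squares under the current reweighting
        Xw = X * w[:, None]
        beta = np.linalg.lstsq(Xw.T @ X, Xw.T @ y, rcond=None)[0]
        # (ii) reweight: keep the points with the smallest residuals
        resid = np.abs(y - X @ beta)
        threshold = np.quantile(resid, keep_frac)
        w = (resid <= threshold).astype(float)
    return beta, w

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
beta_true = np.ones(5)
y = X @ beta_true + 0.1 * rng.normal(size=500)
y[:50] += 10.0                      # a fraction of corrupted responses
beta_hat, weights = alternating_minimization(X, y)
```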
no code implementations • 28 Sep 2020 • Sitan Chen, Adam R. Klivans, Raghu Meka
These results provably cannot be obtained using gradient-based methods and give the first example of a class of efficiently learnable neural networks that gradient descent will fail to learn.
1 code implementation • 8 Jun 2020 • Sitan Chen, Frederic Koehler, Ankur Moitra, Morris Yau
In particular, we study the problem of learning halfspaces under Massart noise with rate $\eta$.
no code implementations • 28 Apr 2020 • Sitan Chen, Raghu Meka
We give an algorithm that learns the polynomial within accuracy $\epsilon$ with sample complexity that is roughly $N = O_{r, d}(n \log^2(1/\epsilon) (\log n)^d)$ and runtime $O_{r, d}(N n^2)$.
1 code implementation • NeurIPS 2020 • Sitan Chen, Jerry Li, Ankur Moitra
We revisit the problem of learning from untrusted batches introduced by Qiao and Valiant [QV17].
no code implementations • 16 Dec 2019 • Sitan Chen, Jerry Li, Zhao Song
In this paper, we give the first algorithm for learning an MLR that runs in time which is sub-exponential in $k$.
no code implementations • 5 Nov 2019 • Sitan Chen, Jerry Li, Ankur Moitra
When $k = 1$ this is the standard robust univariate density estimation setting and it is well-understood that $\Omega (\epsilon)$ error is unavoidable.
no code implementations • 17 Mar 2018 • Sitan Chen, Ankur Moitra
In contrast, as we will show, mixtures of $k$ subcubes are uniquely determined by their degree $2 \log k$ moments and hence provide a useful abstraction for simultaneously achieving the polynomial dependence on $1/\epsilon$ of the classic Occam algorithms for decision trees and the flexibility of the low-degree algorithm in being able to accommodate stochastic transitions.
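A minimal sampler for the mixture-of-subcubes model (with illustrative parameters): each mixture component fixes a subset of the $n$ coordinates to constants and sets the remaining coordinates to independent uniform bits.

```python
import numpy as np

def sample_mixture_of_subcubes(n_samples, n, components, weights, rng):
    """components: list of dicts {coordinate: fixed bit}; weights: mixing weights."""
    samples = rng.integers(0, 2, size=(n_samples, n))        # free coordinates are uniform
    which = rng.choice(len(components), size=n_samples, p=weights)
    for i, c in enumerate(which):
        for coord, bit in components[c].items():
            samples[i, coord] = bit                          # fixed coordinates
    return samples

rng = np.random.default_rng(0)
components = [{0: 1, 1: 0}, {2: 1}]          # k = 2 subcubes of {0,1}^5
X = sample_mixture_of_subcubes(1000, n=5, components=components, weights=[0.5, 0.5], rng=rng)
```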