Search Results for author: Sitan Chen

Found 42 papers, 4 papers with code

Unrolled denoising networks provably learn optimal Bayesian inference

no code implementations • 19 Sep 2024 • Aayush Karan, Kulin Shah, Sitan Chen, Yonina C. Eldar

In recent years, algorithm unrolling has emerged as deep learning's answer to this age-old question: design a neural network whose layers can in principle simulate iterations of inference algorithms and train on data generated by the unknown prior.

Bayesian Inference • Denoising
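As a concrete illustration of the unrolling idea described above, here is a minimal sketch for a linear inverse problem $y = Ax + \text{noise}$ (not the architecture from the paper; all names and dimensions are illustrative):

```python
# Minimal sketch of an unrolled denoising network: each layer mimics one
# iteration of a gradient-plus-denoise inference algorithm, with the step
# sizes and per-layer denoisers learned from data generated by the unknown prior.
import torch
import torch.nn as nn

class UnrolledDenoiser(nn.Module):
    def __init__(self, dim, n_layers=8):
        super().__init__()
        self.step = nn.Parameter(torch.full((n_layers,), 0.1))  # learned step sizes
        self.denoisers = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
             for _ in range(n_layers)]
        )

    def forward(self, y, A):
        # y: measurements, A: measurement matrix with A.shape[1] == dim.
        x = torch.zeros(A.shape[1], device=y.device)
        for step, denoise in zip(self.step, self.denoisers):
            r = x - step * A.T @ (A @ x - y)  # gradient step on the data-fit term
            x = denoise(r)                    # learned denoiser plays the role of the prior
        return x
```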

What does guidance do? A fine-grained analysis in a simple setting

no code implementations • 19 Sep 2024 • Muthu Chidambaram, Khashayar Gatmiry, Sitan Chen, Holden Lee, Jianfeng Lu

The use of guidance in diffusion models was originally motivated by the premise that the guidance-modified score is that of the data distribution tilted by a conditional likelihood raised to some power.
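For reference, that premise can be written out in standard guidance notation (not taken from this paper): tilting the data distribution by a conditional likelihood raised to a guidance power $w$ and taking the score gives

```latex
% Tilted distribution and its score; w is the guidance strength, w = 0 recovers the unguided score.
p_w(x \mid y) \;\propto\; p(x)\, p(y \mid x)^{w},
\qquad
\nabla_x \log p_w(x \mid y) \;=\; \nabla_x \log p(x) + w\, \nabla_x \log p(y \mid x).
```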

Predicting quantum channels over general product distributions

no code implementations • 5 Sep 2024 • Sitan Chen, Jaume de Dios Pont, Jun-Ting Hsieh, Hsin-Yuan Huang, Jane Lange, Jerry Li

Previously, Huang, Chen, and Preskill proved the surprising result that even if $E$ is arbitrary, this task can be solved in time roughly $n^{O(\log(1/\epsilon))}$, where $\epsilon$ is the target prediction error.

Stabilizer bootstrapping: A recipe for efficient agnostic tomography and magic estimation

no code implementations • 13 Aug 2024 • Sitan Chen, Weiyuan Gong, Qi Ye, Zhihan Zhang

We study the task of agnostic tomography: given copies of an unknown $n$-qubit state $\rho$ which has fidelity $\tau$ with some state in a given class $C$, find a state which has fidelity $\ge \tau - \epsilon$ with $\rho$.

Optimal high-precision shadow estimation

no code implementations • 18 Jul 2024 • Sitan Chen, Jerry Li, Allen Liu

We give the first tight sample complexity bounds for shadow tomography and classical shadows in the regime where the target error is below some sufficiently small inverse polynomial in the dimension of the Hilbert space.

Dimensionality Reduction

Faster Diffusion-based Sampling with Randomized Midpoints: Sequential and Parallel

no code implementations • 3 Jun 2024 • Shivam Gupta, Linda Cai, Sitan Chen

As a byproduct of our methods, for the well-studied problem of log-concave sampling in total variation distance, we give an algorithm and simple analysis achieving dimension dependence $\widetilde O(d^{5/12})$ compared to $\widetilde O(\sqrt{d})$ from prior work.
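To make the randomized-midpoint idea concrete, here is an illustrative single step of a randomized-midpoint discretization of overdamped Langevin dynamics $dX_t = -\nabla V(X_t)\,dt + \sqrt{2}\,dB_t$ for log-concave sampling. This is only a sketch (the paper's samplers, and in particular the coupling of the Brownian increments, differ), and `grad_V` is an assumed callable returning $\nabla V$.

```python
# Illustrative randomized-midpoint step for overdamped Langevin dynamics.
import numpy as np

def randomized_midpoint_step(x, grad_V, h, rng):
    alpha = rng.uniform()  # random fraction of the step at which to evaluate the drift
    # Euler move to the random midpoint time alpha * h.
    x_mid = x - alpha * h * grad_V(x) + np.sqrt(2 * alpha * h) * rng.standard_normal(x.shape)
    # Full step of length h with the drift evaluated at the random midpoint;
    # averaging over alpha makes this drift an unbiased estimate of the integral
    # of grad_V along the continuous-time path. (A faithful implementation couples
    # the two Gaussian draws; independent draws are used here for simplicity.)
    return x - h * grad_V(x_mid) + np.sqrt(2 * h) * rng.standard_normal(x.shape)
```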

Learning general Gaussian mixtures with efficient score matching

no code implementations • 29 Apr 2024 • Sitan Chen, Vasilis Kontonis, Kulin Shah

Prior works for this problem either (i) required exponential runtime in the dimension $d$, (ii) placed strong assumptions on the instance (e.g., spherical covariances or clusterability), or (iii) had doubly exponential dependence on the number of components $k$.

Critical windows: non-asymptotic theory for feature emergence in diffusion models

no code implementations • 3 Mar 2024 • Marvin Li, Sitan Chen

Additionally, preliminary experiments on Stable Diffusion suggest critical windows may serve as a useful tool for diagnosing fairness and privacy violations in real-world diffusion models.

Fairness • Image Generation

An optimal tradeoff between entanglement and copy complexity for state tomography

no code implementations • 26 Feb 2024 • Sitan Chen, Jerry Li, Allen Liu

In this work, we study tomography in the natural setting where one can make measurements of $t$ copies at a time.

Provably learning a multi-head attention layer

no code implementations • 6 Feb 2024 • Sitan Chen, Yuanzhi Li

In this work, we initiate the study of provably learning a multi-head attention layer from random examples and give the first nontrivial upper and lower bounds for this problem: provided $\{\mathbf{W}_i, \mathbf{\Theta}_i\}$ satisfy certain non-degeneracy conditions, we give a $(dk)^{O(m^3)}$-time algorithm that learns $F$ to small error given random labeled examples drawn uniformly from $\{\pm 1\}^{k\times d}$.
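To fix notation, a hedged sketch of the target function class $F(\mathbf{X}) = \sum_i \mathrm{softmax}(\mathbf{X}\mathbf{\Theta}_i\mathbf{X}^\top)\mathbf{X}\mathbf{W}_i$ on inputs $\mathbf{X} \in \{\pm 1\}^{k\times d}$ (the paper's exact parameterization and normalization may differ):

```python
# Hedged sketch of a multi-head attention layer on sign-valued inputs.
import numpy as np

def softmax_rows(A):
    A = A - A.max(axis=-1, keepdims=True)
    E = np.exp(A)
    return E / E.sum(axis=-1, keepdims=True)

def attention_layer(X, thetas, weights):
    # X: (k, d); thetas, weights: one (d, d) matrix per head.
    out = np.zeros_like(X)
    for Theta, W in zip(thetas, weights):
        attn = softmax_rows(X @ Theta @ X.T)  # (k, k) attention pattern
        out += attn @ X @ W                   # this head's contribution
    return out

# A random labeled example: X drawn uniformly from {±1}^{k x d}.
rng = np.random.default_rng(0)
k, d, m = 4, 8, 2
X = rng.choice([-1.0, 1.0], size=(k, d))
thetas = [rng.standard_normal((d, d)) / d for _ in range(m)]
weights = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(m)]
Y = attention_layer(X, thetas, weights)
```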

Efficient Pauli channel estimation with logarithmic quantum memory

no code implementations • 25 Sep 2023 • Sitan Chen, Weiyuan Gong

Prior work (Chen et al., 2022) proved no-go theorems for this task in the practical regime where one has a limited amount of quantum memory, e.g., any protocol with $\le 0.99n$ ancilla qubits of quantum memory must make exponentially many measurements, provided it is non-concatenating.

Benchmarking

A faster and simpler algorithm for learning shallow networks

no code implementations • 24 Jul 2023 • Sitan Chen, Shyam Narayanan

We revisit the well-studied problem of learning a linear combination of $k$ ReLU activations given labeled examples drawn from the standard $d$-dimensional Gaussian measure.
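For concreteness, a minimal sketch of this data model (the unit-norm hidden directions and Gaussian coefficients below are illustrative assumptions, not taken from the paper):

```python
# Labels are a linear combination of k ReLU activations of a standard
# d-dimensional Gaussian input: y = sum_i a_i * relu(<w_i, x>).
import numpy as np

def sample_relu_combination(n, d, k, rng):
    W = rng.standard_normal((k, d))
    W /= np.linalg.norm(W, axis=1, keepdims=True)  # hidden directions (unit norm)
    a = rng.standard_normal(k)                     # combination coefficients
    X = rng.standard_normal((n, d))                # examples from N(0, I_d)
    Y = np.maximum(X @ W.T, 0.0) @ a               # k ReLUs, linearly combined
    return X, Y, (W, a)

X, Y, _ = sample_relu_combination(n=1000, d=20, k=3, rng=np.random.default_rng(1))
```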

Learning Narrow One-Hidden-Layer ReLU Networks

no code implementations • 20 Apr 2023 • Sitan Chen, Zehao Dou, Surbhi Goel, Adam R Klivans, Raghu Meka

We consider the well-studied problem of learning a linear combination of $k$ ReLU activations with respect to a Gaussian distribution on inputs in $d$ dimensions.

Restoration-Degradation Beyond Linear Diffusions: A Non-Asymptotic Analysis For DDIM-Type Samplers

no code implementations • 6 Mar 2023 • Sitan Chen, Giannis Daras, Alexandros G. Dimakis

We develop a framework for non-asymptotic analysis of deterministic samplers used for diffusion generative modeling.

Denoising

Learning to predict arbitrary quantum processes

1 code implementation • 26 Oct 2022 • Hsin-Yuan Huang, Sitan Chen, John Preskill

We present an efficient machine learning (ML) algorithm for predicting any unknown quantum process $\mathcal{E}$ over $n$ qubits.

The Complexity of NISQ

no code implementations • 13 Oct 2022 • Sitan Chen, Jordan Cotler, Hsin-Yuan Huang, Jerry Li

The recent proliferation of NISQ devices has made it imperative to understand their computational power.

Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions

no code implementations • 22 Sep 2022 • Sitan Chen, Sinho Chewi, Jerry Li, Yuanzhi Li, Adil Salim, Anru R. Zhang

We provide theoretical convergence guarantees for score-based generative models (SGMs) such as denoising diffusion probabilistic models (DDPMs), which constitute the backbone of large-scale real-world generative models such as DALL$\cdot$E 2.

Denoising

When Does Adaptivity Help for Quantum State Learning?

no code implementations • 10 Jun 2022 • Sitan Chen, Brice Huang, Jerry Li, Allen Liu, Mark Sellke

We give an adaptive algorithm that outputs a state which is $\gamma$-close in infidelity to $\rho$ using only $\tilde{O}(d^3/\gamma)$ copies, which is optimal for incoherent measurements.

Open-Ended Question Answering

Learning (Very) Simple Generative Models Is Hard

no code implementations • 31 May 2022 • Sitan Chen, Jerry Li, Yuanzhi Li

Motivated by the recent empirical successes of deep generative models, we study the computational complexity of the following unsupervised learning problem.

Tight Bounds for Quantum State Certification with Incoherent Measurements

no code implementations • 14 Apr 2022 • Sitan Chen, Brice Huang, Jerry Li, Allen Liu

When $\sigma$ is the maximally mixed state $\frac{1}{d} I_d$, this is known as mixedness testing.

Learning Polynomial Transformations

no code implementations • 8 Apr 2022 • Sitan Chen, Jerry Li, Yuanzhi Li, Anru R. Zhang

Our first main result is a polynomial-time algorithm for learning quadratic transformations of Gaussians in a smoothed setting.

Tensor Decomposition

Hardness of Noise-Free Learning for Two-Hidden-Layer Neural Networks

no code implementations • 10 Feb 2022 • Sitan Chen, Aravind Gollakota, Adam R. Klivans, Raghu Meka

We give superpolynomial statistical query (SQ) lower bounds for learning two-hidden-layer ReLU networks with respect to Gaussian inputs in the standard (noise-free) model.

PAC learning • Vocal Bursts Valence Prediction

Minimax Optimality (Probably) Doesn't Imply Distribution Learning for GANs

no code implementations • ICLR 2022 • Sitan Chen, Jerry Li, Yuanzhi Li, Raghu Meka

Arguably the most fundamental question in the theory of generative adversarial networks (GANs) is to understand to what extent GANs can actually learn the underlying distribution.

Efficiently Learning One Hidden Layer ReLU Networks From Queries

no code implementations • NeurIPS 2021 • Sitan Chen, Adam Klivans, Raghu Meka

While the problem of PAC learning neural networks from samples has received considerable attention in recent years, in certain settings like model extraction attacks, it is reasonable to imagine having more than just the ability to observe random labeled examples.

Model extraction • PAC learning

Quantum advantage in learning from experiments

1 code implementation • 1 Dec 2021 • Hsin-Yuan Huang, Michael Broughton, Jordan Cotler, Sitan Chen, Jerry Li, Masoud Mohseni, Hartmut Neven, Ryan Babbush, Richard Kueng, John Preskill, Jarrod R. McClean

Quantum technology has the potential to revolutionize how we acquire and process experimental data to learn about the physical world.

Kalman Filtering with Adversarial Corruptions

no code implementations • 11 Nov 2021 • Sitan Chen, Frederic Koehler, Ankur Moitra, Morris Yau

In a pioneering work, Schick and Mitter gave provable guarantees when the measurement noise is a known infinitesimal perturbation of a Gaussian and raised the important question of whether one can get similar guarantees for large and unknown perturbations.

A Hierarchy for Replica Quantum Advantage

no code implementations • 10 Nov 2021 • Sitan Chen, Jordan Cotler, Hsin-Yuan Huang, Jerry Li

We prove that given the ability to make entangled measurements on at most $k$ replicas of an $n$-qubit state $\rho$ simultaneously, there is a property of $\rho$ which requires at least order $2^n$ measurements to learn.

Exponential separations between learning with and without quantum memory

no code implementations • 10 Nov 2021 • Sitan Chen, Jordan Cotler, Hsin-Yuan Huang, Jerry Li

We study the power of quantum memory for learning properties of quantum systems and dynamics, which is of great importance in physics and chemistry.

Open-Ended Question Answering

Efficiently Learning Any One Hidden Layer ReLU Network From Queries

no code implementations • 8 Nov 2021 • Sitan Chen, Adam R Klivans, Raghu Meka

In this work we give the first polynomial-time algorithm for learning arbitrary one hidden layer neural networks, provided black-box access to the network.

Model extraction

Toward Instance-Optimal State Certification With Incoherent Measurements

no code implementations • 25 Feb 2021 • Sitan Chen, Jerry Li, Ryan O'Donnell

We revisit the basic problem of quantum state certification: given copies of unknown mixed state $\rho\in\mathbb{C}^{d\times d}$ and the description of a mixed state $\sigma$, decide whether $\sigma = \rho$ or $\|\sigma - \rho\|_{\mathsf{tr}} \ge \epsilon$.

Symmetric Sparse Boolean Matrix Factorization and Applications

no code implementations • 2 Feb 2021 • Sitan Chen, Zhao Song, Runzhou Tao, Ruizhe Zhang

As this problem is hard in the worst case, we study a natural average-case variant that arises in the context of these reconstruction attacks: $\mathbf{M} = \mathbf{W}\mathbf{W}^{\top}$ for $\mathbf{W}$ a random Boolean matrix with $k$-sparse rows, and the goal is to recover $\mathbf{W}$ up to column permutation.

Tensor Decomposition
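An illustrative instance of this average-case model (whether the product is taken over the integers or the Boolean OR/AND semiring is a detail of the paper's setting; the sketch below uses the Boolean version):

```python
# Generate W, a random Boolean matrix with k-sparse rows, and M = W W^T.
import numpy as np

def random_instance(n, m, k, rng):
    W = np.zeros((n, m), dtype=bool)
    for i in range(n):
        W[i, rng.choice(m, size=k, replace=False)] = True  # k-sparse row
    M = (W.astype(int) @ W.T.astype(int)) > 0              # Boolean product W W^T
    return W, M

W, M = random_instance(n=50, m=30, k=3, rng=np.random.default_rng(0))
# Goal: recover W from M, up to a permutation of W's columns.
```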

What Can Phase Retrieval Tell Us About Private Distributed Learning?

no code implementations • ICLR 2021 • Sitan Chen, Xiaoxiao Li, Zhao Song, Danyang Zhuo

In this work, we examine the security of InstaHide, a scheme recently proposed by [Huang, Song, Li and Arora, ICML'20] for preserving the security of private datasets in the context of distributed learning.

Retrieval

On InstaHide, Phase Retrieval, and Sparse Matrix Factorization

no code implementations • 23 Nov 2020 • Sitan Chen, Xiaoxiao Li, Zhao Song, Danyang Zhuo

In this work, we examine the security of InstaHide, a scheme recently proposed by [Huang, Song, Li and Arora, ICML'20] for preserving the security of private datasets in the context of distributed learning.

Retrieval

Online and Distribution-Free Robustness: Regression and Contextual Bandits with Huber Contamination

no code implementations • 8 Oct 2020 • Sitan Chen, Frederic Koehler, Ankur Moitra, Morris Yau

Our approach is based on a novel alternating minimization scheme that interleaves ordinary least-squares with a simple convex program that finds the optimal reweighting of the distribution under a spectral constraint.

Adversarial Robustness • Multi-Armed Bandits • +1
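A heavily simplified sketch of that alternating pattern: interleave weighted least squares with a reweighting step that downplays points with large residuals. The reweighting below is only a residual-truncation stand-in (the paper's step is a convex program under a spectral constraint), and all names are illustrative.

```python
# Alternating minimization sketch for regression with contaminated responses.
import numpy as np

def robust_regression(X, y, n_iters=10, keep=0.9):
    n = len(y)
    w = np.ones(n)
    for _ in range(n_iters):
        # (i) ordinary least squares on the currently weighted data
        Xw = X * w[:, None]
        beta = np.linalg.solve(Xw.T @ X + 1e-8 * np.eye(X.shape[1]), Xw.T @ y)
        # (ii) reweighting: keep the fraction `keep` of points with the
        #      smallest residuals (stand-in for the paper's convex program)
        resid = np.abs(y - X @ beta)
        w = (resid <= np.quantile(resid, keep)).astype(float)
    return beta
```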

Learning Deep ReLU Networks Is Fixed-Parameter Tractable

no code implementations • 28 Sep 2020 • Sitan Chen, Adam R. Klivans, Raghu Meka

These results provably cannot be obtained using gradient-based methods and give the first example of a class of efficiently learnable neural networks that gradient descent will fail to learn.

Learning Polynomials of Few Relevant Dimensions

no code implementations • 28 Apr 2020 • Sitan Chen, Raghu Meka

We give an algorithm that learns the polynomial within accuracy $\epsilon$ with sample complexity that is roughly $N = O_{r, d}(n \log^2(1/\epsilon) (\log n)^d)$ and runtime $O_{r, d}(N n^2)$.

Retrieval

Learning Structured Distributions From Untrusted Batches: Faster and Simpler

1 code implementation • NeurIPS 2020 • Sitan Chen, Jerry Li, Ankur Moitra

We revisit the problem of learning from untrusted batches introduced by Qiao and Valiant [QV17].

Learning Mixtures of Linear Regressions in Subexponential Time via Fourier Moments

no code implementations • 16 Dec 2019 • Sitan Chen, Jerry Li, Zhao Song

In this paper, we give the first algorithm for learning an MLR that runs in time which is sub-exponential in $k$.

Clustering • Density Estimation

Efficiently Learning Structured Distributions from Untrusted Batches

no code implementations • 5 Nov 2019 • Sitan Chen, Jerry Li, Ankur Moitra

When $k = 1$ this is the standard robust univariate density estimation setting and it is well-understood that $\Omega (\epsilon)$ error is unavoidable.

Density Estimation

Beyond the Low-Degree Algorithm: Mixtures of Subcubes and Their Applications

no code implementations • 17 Mar 2018 • Sitan Chen, Ankur Moitra

In contrast, as we will show, mixtures of $k$ subcubes are uniquely determined by their degree $2 \log k$ moments and hence provide a useful abstraction for simultaneously achieving the polynomial dependence on $1/\epsilon$ of the classic Occam algorithms for decision trees and the flexibility of the low-degree algorithm in being able to accommodate stochastic transitions.

Learning Theory
