Online Learning of Quantum States

Suppose we have many copies of an unknown $n$-qubit state $\rho$. We measure some copies of $\rho$ using a known two-outcome measurement $E_{1}$, then other copies using a measurement $E_{2}$, and so on. At each stage $t$, we generate a current hypothesis $\sigma_{t}$ about the state $\rho$, using the outcomes of the previous measurements. We show that it is possible to do this in a way that guarantees that $|\operatorname{Tr}(E_{i} \sigma_{t}) - \operatorname{Tr}(E_{i}\rho) |$, the error in our prediction for the next measurement, is at least $\varepsilon$ at most $\operatorname{O}\!\left(n / \varepsilon^2 \right) $ times. Even in the "non-realizable" setting---where there could be arbitrary noise in the measurement outcomes---we show how to output hypothesis states that do significantly worse than the best possible states at most $\operatorname{O}\!\left(\sqrt {Tn}\right) $ times on the first $T$ measurements. These results generalize a 2007 theorem by Aaronson on the PAC-learnability of quantum states, to the online and regret-minimization settings. We give three different ways to prove our results---using convex optimization, quantum postselection, and sequential fat-shattering dimension---which have different advantages in terms of parameters and portability.

PDF Abstract NeurIPS 2018 PDF NeurIPS 2018 Abstract
No code implementations yet. Submit your code now

Tasks


Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here