no code implementations • 10 Oct 2023 • Davin Choo, Joy Qiping Yang, Arnab Bhattacharyya, Clément L. Canonne
We establish finite-sample guarantees for efficient proper learning of bounded-degree polytrees, a rich class of high-dimensional probability distributions and a subclass of Bayesian networks, a widely-studied type of graphical model.
no code implementations • 2 Sep 2023 • Yiyang Huang, Clément L. Canonne
We consider the formulation of "machine unlearning" of Sekhari, Acharya, Kamath, and Suresh (NeurIPS 2021), which formalizes the so-called "right to be forgotten" by requiring that a trained model, upon request, should be able to "unlearn" a number of points from the training data, as if they had never been included in the first place.
no code implementations • 13 Apr 2023 • Vipul Arora, Arnab Bhattacharyya, Clément L. Canonne, Joy Qiping Yang
This paper considers the problem of testing the maximum in-degree of the Bayes net underlying an unknown probability distribution $P$ over $\{0, 1\}^n$, given sample access to $P$.
no code implementations • 14 Feb 2023 • Clément L. Canonne, Ziteng Sun, Ananda Theertha Suresh
We study the problem of discrete distribution estimation in KL divergence and provide concentration bounds for the Laplace estimator.
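The Laplace estimator mentioned here is the classical add-one smoothing rule $\hat p_i = (N_i + 1)/(n + k)$, which keeps every estimated probability strictly positive so that the KL divergence stays finite. A minimal sketch (the function names are ours, not from the paper):

```python
import random
from math import log

def laplace_estimator(samples, k):
    """Add-one (Laplace) smoothing over a domain {0, ..., k-1}:
    p_hat_i = (count_i + 1) / (n + k), always strictly positive."""
    counts = [0] * k
    for x in samples:
        counts[x] += 1
    n = len(samples)
    return [(c + 1) / (n + k) for c in counts]

def kl_divergence(p, q):
    """KL(p || q); finite whenever q is strictly positive where p is."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Example: estimate a biased 3-outcome distribution from 1000 draws.
random.seed(0)
p = [0.5, 0.3, 0.2]
samples = random.choices(range(3), weights=p, k=1000)
print(kl_divergence(p, laplace_estimator(samples, 3)))
```

The smoothing is exactly what makes KL-divergence guarantees possible: the unsmoothed empirical estimator assigns probability zero to unseen symbols, making the KL divergence infinite.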
no code implementations • 14 Jul 2022 • Clément L. Canonne, Ilias Diakonikolas, Daniel M. Kane, Sihan Liu
We investigate the problem of testing whether a discrete probability distribution over an ordered domain is a histogram on a specified number of bins.
no code implementations • 8 Jul 2022 • Praneeth Vepakomma, Mohammad Mohammadi Amiri, Clément L. Canonne, Ramesh Raskar, Alex Pentland
We introduce $\pi$-test, a privacy-preserving algorithm for testing statistical independence between data distributed across multiple parties.
no code implementations • 16 May 2022 • Anand Jerry George, Clément L. Canonne
Our results show that the complexity of testing in these two settings significantly increases under robustness constraints.
no code implementations • 19 Apr 2022 • Arnab Bhattacharyya, Clément L. Canonne, Joy Qiping Yang
We study the following independence testing problem: given access to samples from a distribution $P$ over $\{0, 1\}^n$, decide whether $P$ is a product distribution or whether it is $\varepsilon$-far in total variation distance from any product distribution.
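For intuition only, here is the naive plug-in baseline for this problem (not the paper's sample-efficient tester, which avoids the exponential cost below): compare the empirical joint distribution against the product of its empirical marginals in total variation distance. This is only tractable for very small $n$, which is precisely why better algorithms are needed.

```python
import itertools

def empirical_product_gap(samples, n):
    """TV distance between the empirical joint distribution of bit-tuples
    and the product of its empirical marginals, over {0,1}^n.
    Exponential in n -- a plug-in baseline, not a practical tester."""
    m = len(samples)
    joint = {}
    for x in samples:
        joint[x] = joint.get(x, 0.0) + 1.0 / m
    marg = [sum(x[i] for x in samples) / m for i in range(n)]
    tv = 0.0
    for x in itertools.product((0, 1), repeat=n):
        prod = 1.0
        for i, b in enumerate(x):
            prod *= marg[i] if b else 1.0 - marg[i]
        tv += abs(joint.get(x, 0.0) - prod)
    return tv / 2.0

four = [(0, 0), (0, 1), (1, 0), (1, 1)] * 50  # independent-looking sample
pair = [(0, 0), (1, 1)] * 100                 # perfectly correlated bits
print(empirical_product_gap(four, 2))
print(empirical_product_gap(pair, 2))
```

On the first sample the gap is essentially zero; on the perfectly correlated one it is $1/2$, the TV distance from the nearest product distribution.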
no code implementations • 14 Mar 2022 • Jayadev Acharya, Clément L. Canonne, Ziteng Sun, Himanshu Tyagi
Without sparsity assumptions, it has been established that interactivity cannot improve the minimax rates of estimation under these information constraints.
no code implementations • 20 Aug 2021 • Clément L. Canonne, Hongyi Lyu
Uniformity testing, or testing whether independent observations are uniformly distributed, is the prototypical question in distribution testing.
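One classical approach to uniformity testing (a standard textbook technique, sketched here for illustration; the paper itself surveys and refines such testers) counts collisions among sample pairs: the collision probability equals $\sum_i p_i^2$, which is minimized, at $1/k$, exactly by the uniform distribution. The acceptance threshold below is an assumed illustrative choice, not tuned constants from the literature.

```python
from itertools import combinations

def collision_statistic(samples):
    """Fraction of unordered sample pairs that collide; its expectation
    is sum_i p_i^2, minimized (= 1/k) by the uniform distribution."""
    m = len(samples)
    collisions = sum(1 for a, b in combinations(samples, 2) if a == b)
    return collisions / (m * (m - 1) / 2)

def is_uniform(samples, k, eps):
    # Distributions eps-far from uniform in TV have collision
    # probability bounded away from 1/k by Theta(eps^2)/k; the
    # constant 2 here is an illustrative choice.
    return collision_statistic(samples) <= (1 + 2 * eps * eps) / k
```

With $O(\sqrt{k}/\varepsilon^2)$ samples this statistic concentrates well enough to distinguish uniform from $\varepsilon$-far, which is the optimal sample complexity for this problem.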
no code implementations • 25 Jun 2021 • Clément L. Canonne, Ayush Jain, Gautam Kamath, Jerry Li
Specifically, we show the sample complexity to be \[\tilde \Theta\left(\frac{\sqrt{n}}{\varepsilon_2^{2}} + \frac{n}{\log n} \cdot \max \left\{\frac{\varepsilon_1}{\varepsilon_2^2},\left(\frac{\varepsilon_1}{\varepsilon_2^2}\right)^{\!\! 2}\right\}\right),\] providing a smooth tradeoff between the two previously known cases.
no code implementations • 21 Jul 2020 • Jayadev Acharya, Clément L. Canonne, Yu-Han Liu, Ziteng Sun, Himanshu Tyagi
We study the role of interactivity in distributed statistical inference under information constraints, e.g., communication constraints and local differential privacy.

2 code implementations • NeurIPS 2020 • Clément L. Canonne, Gautam Kamath, Thomas Steinke
Specifically, we theoretically and experimentally show that adding discrete Gaussian noise provides essentially the same privacy and accuracy guarantees as the addition of continuous Gaussian noise.
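A rough sketch of the idea: perturb an integer-valued statistic with noise from the discrete Gaussian $\mathcal{N}_{\mathbb{Z}}(0, \sigma^2)$, so the released value stays an integer. The truncated inverse-CDF sampler below is a simplification we introduce for illustration (the paper gives an exact rejection sampler with no truncation), and the $\sigma$ formula is the standard continuous Gaussian mechanism calibration, assumed here for concreteness.

```python
import math
import random

def sample_discrete_gaussian(sigma, rng=random, cutoff=10):
    """Naive inverse-CDF sampler for the discrete Gaussian N_Z(0, sigma^2),
    truncated at +/- cutoff*sigma. Illustration only: an exact sampler
    would avoid truncation entirely."""
    radius = int(math.ceil(cutoff * sigma))
    support = range(-radius, radius + 1)
    weights = [math.exp(-x * x / (2.0 * sigma * sigma)) for x in support]
    total = sum(weights)
    u = rng.random() * total
    acc = 0.0
    for x, w in zip(support, weights):
        acc += w
        if u <= acc:
            return x
    return radius

def private_count(true_count, sensitivity=1, epsilon=1.0, delta=1e-6):
    # sigma calibrated as for the classical continuous Gaussian
    # mechanism (valid for epsilon < 1); per the paper, the discrete
    # Gaussian gives essentially the same privacy/accuracy guarantees.
    sigma = sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon
    return true_count + sample_discrete_gaussian(sigma)
```

The practical appeal is that integer noise avoids the floating-point vulnerabilities known to break naive continuous Gaussian (and Laplace) implementations.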
no code implementations • 17 Nov 2019 • Clément L. Canonne, Xi Chen, Gautam Kamath, Amit Levi, Erik Waingarten
We give a nearly-optimal algorithm for testing uniformity of distributions supported on $\{-1, 1\}^n$, which makes $\tilde O (\sqrt{n}/\varepsilon^2)$ queries to a subcube conditional sampling oracle (Bhattacharyya and Chakraborty (2018)).
no code implementations • 20 Jul 2019 • Jayadev Acharya, Clément L. Canonne, Yanjun Han, Ziteng Sun, Himanshu Tyagi
We study goodness-of-fit of discrete distributions in the distributed setting, where samples are divided between multiple users who can only release a limited amount of information about their samples due to various information constraints.
no code implementations • 2 Jul 2019 • Clément L. Canonne, Anindya De, Rocco A. Servedio
We give a range of efficient algorithms and hardness results for this problem, focusing on the case when $f$ is a low-degree polynomial threshold function (PTF).
no code implementations • NeurIPS 2020 • Clément L. Canonne, Gautam Kamath, Audra McMillan, Jonathan Ullman, Lydia Zakynthinou
In this work we present novel differentially private identity (goodness-of-fit) testers for natural and widely studied classes of multivariate product distributions: Gaussians in $\mathbb{R}^d$ with known covariance and product distributions over $\{\pm 1\}^{d}$.
no code implementations • 20 May 2019 • Jayadev Acharya, Clément L. Canonne, Himanshu Tyagi
We propose a general-purpose simulate-and-infer strategy that uses only private-coin communication protocols and is sample-optimal for distribution learning.
no code implementations • 30 Dec 2018 • Jayadev Acharya, Clément L. Canonne, Himanshu Tyagi
Underlying our bounds is a characterization of the contraction in chi-square distances between the observed distributions of the samples when information constraints are placed.
no code implementations • 27 Nov 2018 • Clément L. Canonne, Gautam Kamath, Audra McMillan, Adam Smith, Jonathan Ullman
Specifically, we characterize this sample complexity up to constant factors in terms of the structure of $P$ and $Q$ and the privacy level $\varepsilon$, and show that this sample complexity is achieved by a certain randomized and clamped variant of the log-likelihood ratio test.
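The core statistic described here can be sketched as follows: clamping each per-sample log-likelihood ratio bounds the influence of any single sample, which is what allows the statistic to be released privately (e.g., with calibrated noise). This shows only the clamped statistic, not the paper's full randomized test or its thresholds.

```python
import math

def clamped_llr(samples, p, q, clamp):
    """Sum of per-sample log-likelihood ratios log(p(x)/q(x)), each
    clamped to [-clamp, clamp]. Clamping caps the sensitivity of the
    statistic at `clamp` per sample, enabling private release."""
    stat = 0.0
    for x in samples:
        llr = math.log(p[x] / q[x])
        stat += max(-clamp, min(clamp, llr))
    return stat
```

Large positive values of the statistic favor $P$, large negative values favor $Q$; the unclamped version is the classical Neyman-Pearson likelihood ratio test.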
no code implementations • 7 Aug 2018 • Jayadev Acharya, Clément L. Canonne, Cody Freitag, Himanshu Tyagi
We are concerned with two settings: first, when we must use an already deployed, general-purpose locally differentially private mechanism, such as the popular RAPPOR or the recently introduced Hadamard Response, to collect the data, and build our tests on the data gathered via this mechanism; and second, when no such restriction is imposed, and we can design a bespoke mechanism specifically for testing.
no code implementations • 19 Apr 2018 • Jayadev Acharya, Clément L. Canonne, Himanshu Tyagi
Nonetheless, we present a Las Vegas algorithm that simulates a single sample from the unknown distribution using $O(k/2^\ell)$ samples in expectation.
no code implementations • 1 Sep 2016 • Clément L. Canonne, Elena Grigorescu, Siyao Guo, Akash Kumar, Karl Wimmer
Our results include the following:
- We demonstrate a separation between testing $k$-monotonicity and testing monotonicity, on the hypercube domain $\{0, 1\}^d$, for $k\geq 3$;
- We demonstrate a separation between testing and learning on $\{0, 1\}^d$, for $k=\omega(\log d)$: testing $k$-monotonicity can be performed with $2^{O(\sqrt d \cdot \log d\cdot \log{1/\varepsilon})}$ queries, while learning $k$-monotone functions requires $2^{\Omega(k\cdot \sqrt d\cdot{1/\varepsilon})}$ queries (Blais et al. (RANDOM 2015)).
no code implementations • 26 Nov 2014 • Jayadev Acharya, Clément L. Canonne, Gautam Kamath
We answer a question of Chakraborty et al. (ITCS 2013) showing that non-adaptive uniformity testing indeed requires $\Omega(\log n)$ queries in the conditional model.
no code implementations • 30 Oct 2014 • Eric Blais, Clément L. Canonne, Igor C. Oliveira, Rocco A. Servedio, Li-Yang Tan
In this paper we study the structure of Boolean functions in terms of the minimum number of negations in any circuit computing them, a complexity measure that interpolates between monotone functions and the class of all functions.