Search Results for author: Gautam Kamath

Found 44 papers, 11 papers with code

Private Estimation with Public Data

no code implementations16 Aug 2022 Alex Bie, Gautam Kamath, Vikrant Singhal

We initiate the study of differentially private (DP) estimation with access to a small amount of public data.

Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent

no code implementations6 Jun 2022 Da Yu, Gautam Kamath, Janardhan Kulkarni, Tie-Yan Liu, Jian Yin, Huishuai Zhang

We propose an efficient algorithm to compute privacy guarantees for individual examples when releasing models trained by DP-SGD.
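The per-example view behind individual accounting is easy to sketch. Below is a minimal, hypothetical illustration (my own sketch, not the paper's algorithm): in DP-SGD each example's gradient is clipped to norm `C` before noisy aggregation, and an example whose gradients are consistently smaller than `C` incurs less individual privacy loss, which is what the per-example clipped norms returned here would feed into.

```python
import math
import random

def clip(grad, C):
    """Clip a gradient vector to L2 norm at most C; return it with its clipped norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, C / norm) if norm > 0 else 1.0
    return [g * scale for g in grad], min(norm, C)

def dp_sgd_step(per_example_grads, C=1.0, sigma=1.0, rng=None):
    """One DP-SGD step: clip each example's gradient, sum, add Gaussian noise.

    Returns the noisy average and the per-example clipped norms; individual
    accounting uses the observed norm (rather than the worst case C) when
    composing each example's Gaussian-mechanism guarantee.
    """
    rng = rng or random.Random(0)
    n, d = len(per_example_grads), len(per_example_grads[0])
    clipped, norms = [], []
    for g in per_example_grads:
        cg, cn = clip(g, C)
        clipped.append(cg)
        norms.append(cn)
    noisy = [(sum(cg[j] for cg in clipped) + rng.gauss(0, sigma * C)) / n
             for j in range(d)]
    return noisy, norms
```

The names and the noise calibration here are illustrative; the paper's contribution is the accounting on top of this loop, not the loop itself.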

New Lower Bounds for Private Estimation and a Generalized Fingerprinting Lemma

no code implementations17 May 2022 Gautam Kamath, Argyris Mouzakis, Vikrant Singhal

Additionally, using the private Assouad method of Acharya, Sun, and Zhang, we show a tight $\Omega(d/(\alpha^2 \varepsilon))$ lower bound for estimating the mean of a distribution with bounded covariance to $\alpha$-error in $\ell_2$-distance.

Indiscriminate Data Poisoning Attacks on Neural Networks

no code implementations19 Apr 2022 Yiwei Lu, Gautam Kamath, YaoLiang Yu

Data poisoning attacks, in which a malicious adversary aims to influence a model by injecting "poisoned" data into the training process, have attracted significant recent attention.

Data Poisoning

Efficient Mean Estimation with Pure Differential Privacy via a Sum-of-Squares Exponential Mechanism

no code implementations25 Nov 2021 Samuel B. Hopkins, Gautam Kamath, Mahbod Majid

The "Sum-of-Squares proofs to algorithms" paradigm is a key theme in numerous recent works in high-dimensional algorithmic statistics: estimators which apparently require exponential running time, but whose analysis can be captured by low-degree Sum-of-Squares proofs, can be automatically turned into polynomial-time algorithms with the same provable guarantees.

Robust Estimation for Random Graphs

no code implementations9 Nov 2021 Jayadev Acharya, Ayush Jain, Gautam Kamath, Ananda Theertha Suresh, Huanyu Zhang

We study the problem of robustly estimating the parameter $p$ of an Erdős–Rényi random graph on $n$ nodes, where a $\gamma$ fraction of nodes may be adversarially corrupted.

The Role of Adaptive Optimizers for Honest Private Hyperparameter Selection

no code implementations NeurIPS 2021 Shubhankar Mohapatra, Sajin Sasy, Xi He, Gautam Kamath, Om Thakkar

Hyperparameter optimization is a ubiquitous challenge in machine learning, and the performance of a trained model depends crucially upon effective hyperparameter selection.

BIG-bench Machine Learning
Hyperparameter Optimization

A Private and Computationally-Efficient Estimator for Unbounded Gaussians

no code implementations8 Nov 2021 Gautam Kamath, Argyris Mouzakis, Vikrant Singhal, Thomas Steinke, Jonathan Ullman

We give the first polynomial-time, polynomial-sample, differentially private estimator for the mean and covariance of an arbitrary Gaussian distribution $\mathcal{N}(\mu,\Sigma)$ in $\mathbb{R}^d$.

Differentially Private Fine-tuning of Language Models

2 code implementations ICLR 2022 Da Yu, Saurabh Naik, Arturs Backurs, Sivakanth Gopi, Huseyin A. Inan, Gautam Kamath, Janardhan Kulkarni, Yin Tat Lee, Andre Manoel, Lukas Wutschitz, Sergey Yekhanin, Huishuai Zhang

For example, on the MNLI dataset we achieve an accuracy of $87.8\%$ using RoBERTa-Large and $83.5\%$ using RoBERTa-Base with a privacy budget of $\epsilon = 6.7$.

Text Generation

The Price of Tolerance in Distribution Testing

no code implementations25 Jun 2021 Clément L. Canonne, Ayush Jain, Gautam Kamath, Jerry Li

Specifically, we show the sample complexity to be \[\tilde \Theta\left(\frac{\sqrt{n}}{\varepsilon_2^{2}} + \frac{n}{\log n} \cdot \max \left\{\frac{\varepsilon_1}{\varepsilon_2^2},\left(\frac{\varepsilon_1}{\varepsilon_2^2}\right)^{\!\! 2}\right\}\right),\] providing a smooth tradeoff between the two previously known cases.
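As a quick way to read that expression, here is a small helper (my own illustration, with constants and the polylog factors hidden by $\tilde\Theta$ suppressed) that evaluates the dominant terms for given $n$, $\varepsilon_1$, $\varepsilon_2$:

```python
import math

def tolerant_testing_bound(n, eps1, eps2):
    """Dominant terms of the tolerant-testing sample complexity quoted above,
    ignoring constants and polylog factors:
        sqrt(n)/eps2^2 + (n / log n) * max(r, r^2),  where r = eps1 / eps2^2.
    """
    r = eps1 / eps2 ** 2
    return math.sqrt(n) / eps2 ** 2 + (n / math.log(n)) * max(r, r * r)
```

When $\varepsilon_1 \ll \varepsilon_2^2$ the first term dominates and one recovers the familiar $\sqrt{n}/\varepsilon^2$ rate of non-tolerant testing; as $\varepsilon_1$ grows toward $\varepsilon_2$, the near-linear-in-$n$ term takes over, which is the "price of tolerance" in the title.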

Improved Rates for Differentially Private Stochastic Convex Optimization with Heavy-Tailed Data

no code implementations2 Jun 2021 Gautam Kamath, Xingtu Liu, Huanyu Zhang

Finally, we prove nearly-matching lower bounds for private stochastic convex optimization with strongly convex losses and mean estimation, showing new separations between pure and concentrated DP.

On the Sample Complexity of Privately Learning Unbounded High-Dimensional Gaussians

no code implementations19 Oct 2020 Ishaq Aden-Ali, Hassan Ashtiani, Gautam Kamath

These are the first finite sample upper bounds for general Gaussians which do not impose restrictions on the parameters of the distribution.

Enabling Fast Differentially Private SGD via Just-in-Time Compilation and Vectorization

2 code implementations NeurIPS 2021 Pranav Subramani, Nicholas Vadivelu, Gautam Kamath

We also rebuild core parts of TensorFlow Privacy, integrating features from TensorFlow 2 as well as XLA compilation, granting significant memory and runtime improvements over the current release version.

CoinPress: Practical Private Mean and Covariance Estimation

2 code implementations NeurIPS 2020 Sourav Biswas, Yihe Dong, Gautam Kamath, Jonathan Ullman

We present simple differentially private estimators for the mean and covariance of multivariate sub-Gaussian data that are accurate at small sample sizes.
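A stripped-down version of the idea (my sketch: one dimension, a single round, not the paper's multi-round CoinPress algorithm): clip samples to a known range and add Gaussian noise calibrated to the width of that range.

```python
import random

def private_mean_1d(xs, lo, hi, eps, rng=None):
    """Gaussian-mechanism mean estimate for data assumed to lie in [lo, hi].

    The clipped mean has sensitivity (hi - lo) / n to one sample; the noise
    scale below is illustrative rather than a tight calibration.
    """
    rng = rng or random.Random(0)
    n = len(xs)
    clipped = [min(max(x, lo), hi) for x in xs]
    sens = (hi - lo) / n
    return sum(clipped) / n + rng.gauss(0, sens / eps)
```

CoinPress's trick is to iterate this: each noisy estimate is used to shrink `[lo, hi]`, so later rounds add noise scaled to a much smaller refined range rather than the a-priori one, which is what makes it accurate at small sample sizes.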

A Primer on Private Statistics

no code implementations30 Apr 2020 Gautam Kamath, Jonathan Ullman

Differentially private statistical estimation has seen a flurry of developments over the last several years.

The Discrete Gaussian for Differential Privacy

1 code implementation NeurIPS 2020 Clément L. Canonne, Gautam Kamath, Thomas Steinke

Specifically, we theoretically and experimentally show that adding discrete Gaussian noise provides essentially the same privacy and accuracy guarantees as the addition of continuous Gaussian noise.
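For intuition, a discrete Gaussian can be sampled by weighting the integers by $e^{-z^2/2\sigma^2}$. The sketch below truncates the support as a practical approximation (my simplification; the paper gives an exact sampler with no truncation):

```python
import math
import random

def sample_discrete_gaussian(sigma, rng=None, tail=10):
    """Sample Z with P[Z = z] proportional to exp(-z^2 / (2 sigma^2)),
    truncated to |z| <= ceil(tail * sigma).

    Truncation at tail standard deviations is an approximation; the exact
    sampler in the paper avoids it entirely.
    """
    rng = rng or random.Random(0)
    bound = max(1, math.ceil(tail * sigma))
    support = list(range(-bound, bound + 1))
    weights = [math.exp(-z * z / (2 * sigma ** 2)) for z in support]
    return rng.choices(support, weights=weights)[0]
```

Because the noise is integer-valued, it can be added to integer-valued statistics (e.g. counts) without the floating-point representation issues that plague naive continuous-Gaussian implementations, which is a key motivation for the paper.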

PAPRIKA: Private Online False Discovery Rate Control

1 code implementation27 Feb 2020 Wanrong Zhang, Gautam Kamath, Rachel Cummings

In this work, we study False Discovery Rate (FDR) control in multiple hypothesis testing under the constraint of differential privacy for the sample.

Two-sample testing

Privately Learning Markov Random Fields

no code implementations ICML 2020 Huanyu Zhang, Gautam Kamath, Janardhan Kulkarni, Zhiwei Steven Wu

We consider the problem of learning Markov Random Fields (including the prototypical example, the Ising model) under the constraint of differential privacy.

Locally Private Hypothesis Selection

no code implementations21 Feb 2020 Sivakanth Gopi, Gautam Kamath, Janardhan Kulkarni, Aleksandar Nikolov, Zhiwei Steven Wu, Huanyu Zhang

Absent privacy constraints, this problem requires $O(\log k)$ samples from $p$, and it was recently shown that the same complexity is achievable under (central) differential privacy.

Two-sample testing

Private Mean Estimation of Heavy-Tailed Distributions

no code implementations21 Feb 2020 Gautam Kamath, Vikrant Singhal, Jonathan Ullman

We give new upper and lower bounds on the minimax sample complexity of differentially private mean estimation of distributions with bounded $k$-th moments.

Random Restrictions of High-Dimensional Distributions and Uniformity Testing with Subcube Conditioning

no code implementations17 Nov 2019 Clément L. Canonne, Xi Chen, Gautam Kamath, Amit Levi, Erik Waingarten

We give a nearly-optimal algorithm for testing uniformity of distributions supported on $\{-1, 1\}^n$, which makes $\tilde O (\sqrt{n}/\varepsilon^2)$ queries to a subcube conditional sampling oracle (Bhattacharyya and Chakraborty, 2018).

Differentially Private Algorithms for Learning Mixtures of Separated Gaussians

no code implementations NeurIPS 2019 Gautam Kamath, Or Sheffet, Vikrant Singhal, Jonathan Ullman

Learning the parameters of Gaussian mixture models is a fundamental and widely studied problem with numerous applications.

Private Hypothesis Selection

no code implementations NeurIPS 2019 Mark Bun, Gautam Kamath, Thomas Steinke, Zhiwei Steven Wu

The sample complexity of our basic algorithm is $O\left(\frac{\log m}{\alpha^2} + \frac{\log m}{\alpha \varepsilon}\right)$, representing a minimal cost for privacy when compared to the non-private algorithm.

PAC learning

Private Identity Testing for High-Dimensional Distributions

no code implementations NeurIPS 2020 Clément L. Canonne, Gautam Kamath, Audra McMillan, Jonathan Ullman, Lydia Zakynthinou

In this work we present novel differentially private identity (goodness-of-fit) testers for natural and widely studied classes of multivariate product distributions: Gaussians in $\mathbb{R}^d$ with known covariance and product distributions over $\{\pm 1\}^{d}$.

The Structure of Optimal Private Tests for Simple Hypotheses

no code implementations27 Nov 2018 Clément L. Canonne, Gautam Kamath, Audra McMillan, Adam Smith, Jonathan Ullman

Specifically, we characterize this sample complexity up to constant factors in terms of the structure of $P$ and $Q$ and the privacy level $\varepsilon$, and show that this sample complexity is achieved by a certain randomized and clamped variant of the log-likelihood ratio test.
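The clamped log-likelihood ratio statistic is easy to state. Here is a small discrete-distribution sketch (my illustration: the noise addition and the paper's specific clamping thresholds are omitted):

```python
import math

def clamped_llr(samples, p, q, clamp=1.0):
    """Sum over samples of log(p(x)/q(x)), with each term clamped to
    [-clamp, clamp]. Clamping bounds any single sample's influence on the
    statistic, which is what lets a noisy version of it satisfy DP.
    p and q are dicts mapping outcomes to probabilities.
    """
    total = 0.0
    for x in samples:
        llr = math.log(p[x] / q[x])
        total += max(-clamp, min(clamp, llr))
    return total
```

A large positive value favors $P$ and a large negative value favors $Q$; the private test of the paper thresholds a randomized, noise-perturbed version of this statistic, with noise scale governed by the clamp level and $\varepsilon$.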

Change Point Detection
Generalization Bounds

Anaconda: A Non-Adaptive Conditional Sampling Algorithm for Distribution Testing

no code implementations17 Jul 2018 Gautam Kamath, Christos Tzamos

This is an exponential improvement over the previous best upper bound, and demonstrates that the complexity of the problem in this model is intermediate between its complexity in the standard sampling model and in the adaptive conditional sampling model.

Privately Learning High-Dimensional Distributions

no code implementations1 May 2018 Gautam Kamath, Jerry Li, Vikrant Singhal, Jonathan Ullman

We present novel, computationally efficient, and differentially private algorithms for two fundamental high-dimensional learning problems: learning a multivariate Gaussian and learning a product distribution over the Boolean hypercube in total variation distance.

Sever: A Robust Meta-Algorithm for Stochastic Optimization

1 code implementation7 Mar 2018 Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Jacob Steinhardt, Alistair Stewart

In high dimensions, most machine learning methods are brittle to even a small fraction of structured outliers.

Stochastic Optimization

INSPECTRE: Privately Estimating the Unseen

1 code implementation ICML 2018 Jayadev Acharya, Gautam Kamath, Ziteng Sun, Huanyu Zhang

We develop differentially private methods for estimating various distributional properties.
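One of the properties studied is entropy; as a point of reference, a crude (hypothetical) private baseline simply adds Laplace noise to the plug-in entropy estimate, whose sensitivity to changing one of $n$ samples is $O(\log n / n)$:

```python
import math
import random

def private_entropy(samples, eps, rng=None):
    """Plug-in Shannon entropy (in nats) of the empirical distribution,
    plus Laplace noise calibrated to a loose O(log n / n) sensitivity bound.
    A crude baseline for illustration; the paper's estimators are sharper.
    """
    rng = rng or random.Random(0)
    n = len(samples)
    counts = {}
    for x in samples:
        counts[x] = counts.get(x, 0) + 1
    h = -sum((c / n) * math.log(c / n) for c in counts.values())
    sens = 2 * math.log(n) / n  # loose bound on one-sample sensitivity
    u = rng.random() - 0.5      # Laplace(sens/eps) noise via inverse CDF
    noise = -(sens / eps) * math.copysign(math.log(1 - 2 * abs(u)), u)
    return h + noise
```

The point of the baseline is that the low sensitivity of such properties makes privacy cheap at moderate sample sizes; the interesting regime, which the paper targets, is the sublinear-sample one.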

Actively Avoiding Nonsense in Generative Models

no code implementations20 Feb 2018 Steve Hanneke, Adam Kalai, Gautam Kamath, Christos Tzamos

A generative model may generate utter nonsense when it is fit to maximize the likelihood of observed data.

Which Distribution Distances are Sublinearly Testable?

no code implementations31 Jul 2017 Constantinos Daskalakis, Gautam Kamath, John Wright

Given samples from an unknown distribution $p$ and a description of a distribution $q$, are $p$ and $q$ close or far?

Robustly Learning a Gaussian: Getting Optimal Error, Efficiently

no code implementations12 Apr 2017 Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Ankur Moitra, Alistair Stewart

We give robust estimators that achieve estimation error $O(\varepsilon)$ in the total variation distance, which is optimal up to a universal constant that is independent of the dimension.

Priv'IT: Private and Sample Efficient Identity Testing

1 code implementation29 Mar 2017 Bryan Cai, Constantinos Daskalakis, Gautam Kamath

We develop differentially private hypothesis testing methods for the small sample regime.

Two-sample testing

Being Robust (in High Dimensions) Can Be Practical

2 code implementations ICML 2017 Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Ankur Moitra, Alistair Stewart

Robust estimation is much more challenging in high dimensions than it is in one dimension: Most techniques either lead to intractable optimization problems or estimators that can tolerate only a tiny fraction of errors.

Testing Ising Models

no code implementations9 Dec 2016 Constantinos Daskalakis, Nishanth Dikkala, Gautam Kamath

Given samples from an unknown multivariate distribution $p$, is it possible to distinguish whether $p$ is the product of its marginals versus $p$ being far from every product distribution?

Robust Estimators in High Dimensions without the Computational Intractability

2 code implementations21 Apr 2016 Ilias Diakonikolas, Gautam Kamath, Daniel Kane, Jerry Li, Ankur Moitra, Alistair Stewart

We study high-dimensional distribution learning in an agnostic setting where an adversary is allowed to arbitrarily corrupt an $\varepsilon$-fraction of the samples.
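To make "tolerating an $\varepsilon$-fraction of corruptions" concrete, here is the classical one-dimensional baseline, a trimmed mean (my sketch, not the paper's algorithm, which is a spectral filter):

```python
def trimmed_mean(xs, trim_frac=0.1):
    """Discard the trim_frac smallest and largest values, average the rest.

    In one dimension this tolerates a small fraction of arbitrary outliers.
    The paper's contribution is achieving dimension-independent error in high
    dimensions, where naive coordinate-wise trimming loses a factor growing
    with the dimension.
    """
    xs = sorted(xs)
    k = int(len(xs) * trim_frac)
    core = xs[k:len(xs) - k] if k > 0 else xs
    return sum(core) / len(core)
```

A single planted outlier of value 1000 moves the untrimmed mean of ten points by about 100, but barely moves the trimmed mean.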

A Size-Free CLT for Poisson Multinomials and its Applications

no code implementations11 Nov 2015 Constantinos Daskalakis, Anindya De, Gautam Kamath, Christos Tzamos

Finally, leveraging the structural properties of the Fourier spectrum of PMDs we show that these distributions can be learned from $O_k(1/\varepsilon^2)$ samples in ${\rm poly}_k(1/\varepsilon)$-time, removing the quasi-polynomial dependence of the running time on $1/\varepsilon$ from the algorithm of Daskalakis, Kamath, and Tzamos.

Optimal Testing for Properties of Distributions

no code implementations NeurIPS 2015 Jayadev Acharya, Constantinos Daskalakis, Gautam Kamath

Given samples from an unknown distribution $p$, is it possible to distinguish whether $p$ belongs to some class of distributions $\mathcal{C}$ versus $p$ being far from every distribution in $\mathcal{C}$?

On the Structure, Covering, and Learning of Poisson Multinomial Distributions

no code implementations30 Apr 2015 Constantinos Daskalakis, Gautam Kamath, Christos Tzamos

We prove a structural characterization of these distributions, showing that, for all $\varepsilon >0$, any $(n, k)$-Poisson multinomial random vector is $\varepsilon$-close, in total variation distance, to the sum of a discretized multidimensional Gaussian and an independent $(\text{poly}(k/\varepsilon), k)$-Poisson multinomial random vector.

A Chasm Between Identity and Equivalence Testing with Conditional Queries

no code implementations26 Nov 2014 Jayadev Acharya, Clément L. Canonne, Gautam Kamath

We answer a question of Chakraborty et al. (ITCS 2013) showing that non-adaptive uniformity testing indeed requires $\Omega(\log n)$ queries in the conditional model.

Faster and Sample Near-Optimal Algorithms for Proper Learning Mixtures of Gaussians

no code implementations4 Dec 2013 Constantinos Daskalakis, Gautam Kamath

The algorithm requires ${O}(\log{N}/\varepsilon^2)$ samples from the unknown distribution and ${O}(N \log N/\varepsilon^2)$ time, which improves previous such results (such as the Scheffé estimator) from a quadratic dependence of the running time on $N$ to quasilinear.
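The Scheffé estimator mentioned above compares two candidate distributions on the set where one dominates the other. A small discrete-case sketch (my illustration):

```python
def scheffe_choose(samples, p, q):
    """Return whichever candidate distribution (dicts x -> prob) better
    matches the samples, via the Scheffe set A = {x : p(x) > q(x)}:
    compare the empirical mass of A against p(A) and q(A)."""
    support = set(p) | set(q)
    A = {x for x in support if p.get(x, 0.0) > q.get(x, 0.0)}
    emp = sum(1 for s in samples if s in A) / len(samples)
    pA = sum(p.get(x, 0.0) for x in A)
    qA = sum(q.get(x, 0.0) for x in A)
    return p if abs(pA - emp) <= abs(qA - emp) else q
```

With $N$ candidates, running this pairwise test as a tournament takes time quadratic in $N$; the paper's speedup to quasilinear time is what makes proper learning of Gaussian mixtures efficient here.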
