no code implementations • 7 Nov 2024 • Raef Bassily, Cristóbal Guzmán, Michael Menart
We first consider Lipschitz convex-concave stochastic saddle point problems (SSPs) in the $\ell_p/\ell_q$ setup, $p, q\in[1, 2]$.
no code implementations • 6 Mar 2024 • Enayat Ullah, Michael Menart, Raef Bassily, Cristóbal Guzmán, Raman Arora
We also study public-data-assisted differentially private (PA-DP) supervised learning with \textit{unlabeled} public samples.
no code implementations • 29 Feb 2024 • Xinyu Zhou, Raef Bassily
We first present a new algorithm that achieves excess worst-group population risk of $\tilde{O}(\frac{p\sqrt{d}}{K\epsilon} + \sqrt{\frac{p}{K}})$, where $K$ is the total number of samples drawn from all groups and $d$ is the problem dimension.
no code implementations • 22 Nov 2023 • Michael Menart, Enayat Ullah, Raman Arora, Raef Bassily, Cristóbal Guzmán
We further show that, without assuming the Kurdyka-Łojasiewicz (KL) condition, the same gradient descent algorithm can achieve fast convergence to a stationary point when the gradient stays sufficiently large throughout the run of the algorithm.
no code implementations • 15 Jun 2023 • Raef Bassily, Corinna Cortes, Anqi Mao, Mehryar Mohri
This is the modern problem of supervised domain adaptation from a public source to a private target domain.
no code implementations • 24 Feb 2023 • Raef Bassily, Cristóbal Guzmán, Michael Menart
We show that convex-concave Lipschitz stochastic saddle point problems (also known as stochastic minimax optimization) can be solved under the constraint of $(\epsilon,\delta)$-differential privacy with \emph{strong (primal-dual) gap} rate of $\tilde O\big(\frac{1}{\sqrt{n}} + \frac{\sqrt{d}}{n\epsilon}\big)$, where $n$ is the dataset size and $d$ is the dimension of the problem.
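For intuition only, here is a minimal sketch of the generic mechanism behind private saddle point results of this kind: noisy stochastic gradient descent-ascent with per-sample clipping and Gaussian noise, run on a toy convex-concave objective. The objective, clipping threshold, step size, and noise multiplier are illustrative assumptions, not the algorithm or calibration of the paper.

```python
# Illustrative sketch (not the paper's algorithm): noisy stochastic gradient
# descent-ascent on the toy convex-concave objective
#   f_i(x, y) = 0.5*||x||^2 + y*(a_i . x - b_i) - 0.5*y^2,
# with per-sample gradient clipping and Gaussian noise in the style of DP-SGD.
import numpy as np

def dp_sgda(A, b, steps=500, clip=1.0, noise_mult=1.0, lr=0.05, seed=0):
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x, y = np.zeros(d), 0.0
    for _ in range(steps):
        i = rng.integers(n)                                   # sample one record
        gx = x + y * A[i]                                     # gradient in the min variable x
        gy = (A[i] @ x - b[i]) - y                            # gradient in the max variable y
        g = np.concatenate([gx, [gy]])
        g *= min(1.0, clip / (np.linalg.norm(g) + 1e-12))     # clip to bound per-sample sensitivity
        g += noise_mult * clip * rng.normal(size=g.shape)     # Gaussian noise for privacy
        x -= lr * g[:d]                                       # descend on x
        y += lr * g[d]                                        # ascend on y
    return x, y

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A, b = rng.normal(size=(200, 5)), rng.normal(size=200)
    print(dp_sgda(A, b))
```

In practice the noise multiplier would be chosen by a privacy accountant to meet a target $(\epsilon,\delta)$; the sketch leaves that calibration out.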
no code implementations • 12 Aug 2022 • Raef Bassily, Mehryar Mohri, Ananda Theertha Suresh
A key problem in a variety of applications is that of domain adaptation from a public source domain, for which a relatively large amount of labeled data with no privacy constraints is at one's disposal, to a private target domain, for which only a private sample with little or no labeled data is available.
no code implementations • 2 Jun 2022 • Raman Arora, Raef Bassily, Tomás González, Cristóbal Guzmán, Michael Menart, Enayat Ullah
We provide a new efficient algorithm that finds an $\tilde{O}\big(\big[\frac{\sqrt{d}}{n\varepsilon}\big]^{2/3}\big)$-stationary point in the finite-sum setting, where $n$ is the number of samples.
no code implementations • 6 May 2022 • Raman Arora, Raef Bassily, Cristóbal Guzmán, Michael Menart, Enayat Ullah
For this case, we close the gap in the existing work and show that the optimal rate is (up to log factors) $\Theta\left(\frac{\Vert w^*\Vert}{\sqrt{n}} + \min\left\{\frac{\Vert w^*\Vert}{\sqrt{n\epsilon}},\frac{\sqrt{\text{rank}}\Vert w^*\Vert}{n\epsilon}\right\}\right)$, where $\text{rank}$ is the rank of the design matrix.
no code implementations • 21 Apr 2022 • Raef Bassily, Mehryar Mohri, Ananda Theertha Suresh
For the family of linear hypotheses, we give a pure DP learning algorithm that benefits from relative deviation margin guarantees, as well as an efficient DP learning algorithm with margin guarantees.
no code implementations • NeurIPS 2021 • Raef Bassily, Cristóbal Guzmán, Michael Menart
For the $\ell_1$-case with smooth losses and a polyhedral constraint set, we provide the first nearly dimension-independent rate, $\tilde O\big(\frac{\log^{2/3} d}{(n\varepsilon)^{1/3}}\big)$, in linear time.
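Nearly dimension-independent $\ell_1$ rates of this kind are typically obtained with Frank-Wolfe-style methods that privately select a vertex of the polytope at each step. The sketch below illustrates the general idea with report-noisy-min over the vertices of the $\ell_1$ ball on a least-squares objective; the noise scale, step count, and objective are assumptions for illustration, not the paper's algorithm or its privacy calibration.

```python
# Illustrative sketch (not the paper's method): a private Frank-Wolfe step over
# the l1 ball picks, via report-noisy-min, the vertex +/- radius*e_j with the
# smallest inner product with the empirical gradient.
import numpy as np

def private_frank_wolfe(A, b, steps=100, noise_scale=0.1, radius=1.0, seed=0):
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w = np.zeros(d)
    for t in range(steps):
        grad = A.T @ (A @ w - b) / n                     # gradient of (1/2n)*||Aw - b||^2
        scores = radius * np.concatenate([grad, -grad])  # <grad, v> for the 2d vertices v
        noisy = scores + rng.laplace(scale=noise_scale, size=scores.shape)
        k = int(np.argmin(noisy))                        # report-noisy-min
        v = np.zeros(d)
        v[k % d] = radius if k < d else -radius
        gamma = 2.0 / (t + 2.0)                          # standard Frank-Wolfe step size
        w = (1 - gamma) * w + gamma * v
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.normal(size=(500, 50))
    w_true = np.zeros(50); w_true[:3] = [0.5, -0.3, 0.2]
    b = A @ w_true + 0.01 * rng.normal(size=500)
    print(np.round(private_frank_wolfe(A, b)[:10], 2))
```

The only data-dependent decision released per step is which of the $2d$ vertices wins the noisy minimum, and private selection among $2d$ candidates costs only logarithmic factors in $d$, which is the intuition behind the mild dimension dependence.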
no code implementations • 1 Mar 2021 • Raef Bassily, Cristóbal Guzmán, Anupama Nandi
For $2 < p \leq \infty$, we show that existing linear-time constructions for the Euclidean setup attain a nearly optimal excess risk in the low-dimensional regime.
no code implementations • NeurIPS 2020 • Raef Bassily, Shay Moran, Anupama Nandi
Inspired by the above example, we consider a model in which the population $\mathcal{D}$ is a mixture of two sub-populations: a private sub-population $\mathcal{D}_{\sf priv}$ of private and sensitive data, and a public sub-population $\mathcal{D}_{\sf pub}$ of data with no privacy concerns.
no code implementations • NeurIPS 2020 • Raef Bassily, Vitaly Feldman, Cristóbal Guzmán, Kunal Talwar
Our work is the first to address uniform stability of SGD on {\em nonsmooth} convex losses.
no code implementations • ICML 2020 • Raef Bassily, Albert Cheu, Shay Moran, Aleksandar Nikolov, Jonathan Ullman, Zhiwei Steven Wu
In comparison, with only private samples, this problem cannot be solved even for simple query classes with VC-dimension one, and without any private samples, a larger public sample of size $d/\alpha^2$ is needed.
no code implementations • NeurIPS 2019 • Noga Alon, Raef Bassily, Shay Moran
We consider learning problems where the training set consists of two types of examples: private and public.
no code implementations • NeurIPS 2019 • Raef Bassily, Vitaly Feldman, Kunal Talwar, Abhradeep Thakurta
A long line of existing work on private convex optimization focuses on the empirical loss and derives asymptotically tight bounds on the excess empirical loss.
no code implementations • 31 Jul 2019 • Anupama Nandi, Raef Bassily
We formally study this problem in the agnostic PAC model and derive a new upper bound on the private sample complexity.
no code implementations • NeurIPS 2018 • Raef Bassily, Abhradeep Guha Thakurta, Om Dipakbhai Thakkar
In the PAC model, we analyze our construction and prove upper bounds on the sample complexity for both the realizable and the non-realizable cases.
no code implementations • 6 Nov 2018 • Raef Bassily, Mikhail Belkin, Siyuan Ma
Large over-parametrized models learned via stochastic gradient descent (SGD) methods have become a key element in modern machine learning.
no code implementations • 5 Oct 2018 • Raef Bassily
We study the problem of estimating a set of $d$ linear queries with respect to some unknown distribution $\mathbf{p}$ over a domain $\mathcal{J}=[J]$ based on a sensitive data set of $n$ individuals under the constraint of local differential privacy.
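For concreteness, a standard baseline for this kind of task is a local frequency oracle: each user randomizes their own item, the server debiases the reported counts into an estimate of $\mathbf{p}$, and any linear query is then answered by an inner product with that estimate. The sketch below uses $k$-ary randomized response with assumed parameters; it is a generic baseline, not the protocol of the paper.

```python
# Illustrative baseline (not the paper's protocol): k-ary randomized response.
# Each user reports their true item with probability e^eps / (e^eps + J - 1)
# and a uniformly random other item otherwise; the server debiases the counts.
import numpy as np

def k_randomized_response(items, J, epsilon, rng):
    p_keep = np.exp(epsilon) / (np.exp(epsilon) + J - 1)   # prob. of reporting truthfully
    flip = rng.random(len(items)) >= p_keep
    offsets = rng.integers(1, J, size=len(items))          # uniform over the other J-1 items
    return np.where(flip, (items + offsets) % J, items), p_keep

def estimate_distribution(reports, J, p_keep):
    q = (1 - p_keep) / (J - 1)                             # prob. noise lands on a fixed wrong item
    freqs = np.bincount(reports, minlength=J) / len(reports)
    return (freqs - q) / (p_keep - q)                      # debiased estimate of p

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    J, n, eps = 10, 50_000, 1.0
    p = np.arange(1, J + 1, dtype=float); p /= p.sum()     # a skewed true distribution
    items = rng.choice(J, size=n, p=p)
    reports, p_keep = k_randomized_response(items, J, eps, rng)
    p_hat = estimate_distribution(reports, J, p_keep)
    query = rng.random(J)                                  # one linear query
    print(query @ p, query @ p_hat)                        # true vs. estimated answer
```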
no code implementations • 14 Mar 2018 • Raef Bassily, Om Thakkar, Abhradeep Thakurta
We provide a new technique to boost the average-case stability properties of learning algorithms to strong (worst-case) stability properties, and then exploit them to obtain private classification algorithms.
no code implementations • ICML 2018 • Siyuan Ma, Raef Bassily, Mikhail Belkin
We show that there is a critical batch size $m^*$ such that: (a) SGD iteration with mini-batch size $m\leq m^*$ is nearly equivalent to $m$ iterations of mini-batch size $1$ (\emph{linear scaling regime}).
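A toy numerical check of the linear scaling regime, on a simple over-parametrized least-squares problem rather than the paper's setting: SGD with mini-batch size $m$, learning rate scaled by $m$, and $1/m$ as many steps reaches a loss comparable to batch-size-$1$ SGD while $m$ stays small.

```python
# Toy illustration (not the paper's experiment): compare SGD runs on an
# interpolating least-squares problem with (batch size m, learning rate m*lr,
# total_steps/m iterations) for several m -- the "linear scaling" regime.
import numpy as np

def sgd_final_loss(A, b, batch, lr, steps, seed=0):
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w = np.zeros(d)
    for _ in range(steps):
        idx = rng.integers(n, size=batch)                       # sample a mini-batch
        w -= lr * A[idx].T @ (A[idx] @ w - b[idx]) / batch      # mini-batch gradient step
    return 0.5 * np.mean((A @ w - b) ** 2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 200, 400                                 # d > n: the model can interpolate
    A = rng.normal(size=(n, d)) / np.sqrt(d)
    b = A @ rng.normal(size=d)                      # noiseless targets, so the minimum loss is 0
    base_lr, total_steps = 0.5, 4000
    for m in (1, 4, 16):
        print(m, sgd_final_loss(A, b, batch=m, lr=base_lr * m, steps=total_steps // m))
```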
no code implementations • 14 Oct 2017 • Raef Bassily, Shay Moran, Ido Nachum, Jonathan Shafer, Amir Yehudayoff
We discuss an approach that allows us to prove upper bounds on the amount of information that algorithms reveal about their inputs, and also provide a lower bound by showing a simple concept class for which every (possibly randomized) empirical risk minimizer must reveal a lot of information.
no code implementations • 12 Apr 2016 • Raef Bassily, Yoav Freund
We show that typical stability can control generalization error in adaptive data analysis even when the samples in the dataset are not necessarily independent and the queries to be computed are not necessarily of bounded sensitivity, as long as the results of the queries over the dataset (i.e., the computed statistics) follow a distribution with a "light" tail.
no code implementations • 8 Nov 2015 • Raef Bassily, Kobbi Nissim, Adam Smith, Thomas Steinke, Uri Stemmer, Jonathan Ullman
Specifically, suppose there is an unknown distribution $\mathbf{P}$ and a set of $n$ independent samples $\mathbf{x}$ is drawn from $\mathbf{P}$.
no code implementations • 18 Apr 2015 • Raef Bassily, Adam Smith
Moreover, we show that this much error is necessary, regardless of computational efficiency, and even for the simple setting where only one item appears with significant frequency in the data set.
no code implementations • 16 Mar 2015 • Raef Bassily, Adam Smith, Thomas Steinke, Jonathan Ullman
However, generalization error is typically bounded in a non-adaptive model, where all questions are specified before the dataset is drawn.
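The gap between the adaptive and non-adaptive settings is easy to reproduce in a small simulation: an analyst first queries the empirical correlation of each feature with the label, then evaluates a classifier built from those answers on the same sample. With exact answers the classifier looks accurate even though the data are pure noise; with noisy answers (at roughly the scale this line of work suggests for this many queries) the spurious accuracy disappears. The data, query class, and noise level below are assumptions for illustration, not taken from the paper.

```python
# Toy demonstration (not from the paper): adaptive reuse of exact empirical
# answers overfits pure noise; adding noise to the answers removes the effect.
import numpy as np

def adaptive_demo(n=500, d=200, noise_std=0.0, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.choice([-1.0, 1.0], size=(n, d))       # features: independent coin flips
    y = rng.choice([-1.0, 1.0], size=n)            # labels: independent of the features
    # Round 1: d statistical queries "correlation of feature j with the label".
    answers = X.T @ y / n + noise_std * rng.normal(size=d)
    w = np.sign(answers)                           # round 2: a classifier built from the answers
    train_acc = np.mean(np.sign(X @ w) == y)       # evaluated on the SAME sample
    X_new = rng.choice([-1.0, 1.0], size=(n, d))   # fresh data: true accuracy is 1/2
    y_new = rng.choice([-1.0, 1.0], size=n)
    test_acc = np.mean(np.sign(X_new @ w) == y_new)
    return round(train_acc, 3), round(test_acc, 3)

if __name__ == "__main__":
    print("exact answers:", adaptive_demo(noise_std=0.0))   # overfits the sample
    print("noisy answers:", adaptive_demo(noise_std=0.2))   # overfitting largely gone
```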
1 code implementation • 27 May 2014 • Raef Bassily, Adam Smith, Abhradeep Thakurta
We provide a separate set of algorithms and matching lower bounds for the setting in which the loss functions are known to also be strongly convex.
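For context, a classical baseline in the strongly convex regime (not necessarily among the algorithms of this paper) is output perturbation: with $L$-Lipschitz per-sample losses and a $\lambda$-strongly convex objective, the exact empirical minimizer has $\ell_2$-sensitivity at most $2L/(n\lambda)$, so Gaussian noise calibrated to that sensitivity yields $(\epsilon,\delta)$-differential privacy. A minimal sketch under assumed parameters (regularized logistic regression, with rows clipped so $L \leq 1$):

```python
# Illustrative baseline (not necessarily this paper's algorithms): output
# perturbation for L2-regularized logistic regression.  The sensitivity bound
# 2L/(n*lam) applies to the exact regularized minimizer, so the inner solve
# should be run to high accuracy.
import numpy as np

def private_logreg(X, y, eps, delta, lam=0.1, iters=2000, lr=0.5, seed=0):
    n, d = X.shape
    X = X / np.maximum(1.0, np.linalg.norm(X, axis=1, keepdims=True))  # clip rows so L <= 1
    w = np.zeros(d)
    for _ in range(iters):                                  # (non-private) regularized ERM solve
        margins = np.clip(y * (X @ w), -30, 30)
        grad = -(X * (y / (1 + np.exp(margins)))[:, None]).mean(axis=0) + lam * w
        w -= lr * grad
    sensitivity = 2.0 / (n * lam)                           # 2L/(n*lam) with L = 1
    sigma = sensitivity * np.sqrt(2 * np.log(1.25 / delta)) / eps   # Gaussian mechanism scale
    return w + np.random.default_rng(seed).normal(scale=sigma, size=d)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(5000, 10))
    y = np.sign(X @ np.ones(10) + 0.5 * rng.normal(size=5000))
    w_priv = private_logreg(X, y, eps=1.0, delta=1e-5)
    print("accuracy:", np.mean(np.sign(X @ w_priv) == y))
```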