Search Results for author: Daogao Liu

Found 24 papers, 4 papers with code

Faster Algorithms for User-Level Private Stochastic Convex Optimization

no code implementations24 Oct 2024 Andrew Lowy, Daogao Liu, Hilal Asi

Existing algorithms for user-level DP SCO are impractical in many large-scale machine learning scenarios because: (i) they make restrictive assumptions on the smoothness parameter of the loss function and require the number of users to grow polynomially with the dimension of the parameter space; or (ii) they are prohibitively slow, requiring at least $(mn)^{3/2}$ gradient computations for smooth losses and $(mn)^3$ computations for non-smooth losses.

Adaptive Batch Size for Privately Finding Second-Order Stationary Points

no code implementations10 Oct 2024 Daogao Liu, Kunal Talwar

There is a gap between finding a first-order stationary point (FOSP) and a second-order stationary point (SOSP) under differential privacy constraints, and it remains unclear whether privately finding an SOSP is more challenging than finding an FOSP.
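
For reference, these notions are commonly formalized as follows (a hedged sketch using standard thresholds; the paper's exact parameters may differ): an $\alpha$-FOSP of $F$ satisfies $\|\nabla F(x)\| \le \alpha$, while an $\alpha$-SOSP additionally requires $\nabla^2 F(x) \succeq -\sqrt{\alpha}\, I$, ruling out saddle points with strongly negative curvature.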

Improved Sample Complexity for Private Nonsmooth Nonconvex Optimization

no code implementations8 Oct 2024 Guy Kornowski, Daogao Liu, Kunal Talwar

We study differentially private (DP) optimization algorithms for stochastic and empirical objectives which are neither smooth nor convex, and propose methods that return a Goldstein-stationary point with sample complexity bounds that improve on existing works.
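
For context, Goldstein stationarity is typically defined as follows (a standard definition, not necessarily the paper's exact parameterization): a point $x$ is a $(\delta, \epsilon)$-Goldstein stationary point of $f$ if

$\min \{ \|g\| : g \in \mathrm{conv}\{\nabla f(y) : \|y - x\| \le \delta\} \} \le \epsilon$,

i.e. some convex combination of gradients taken within a $\delta$-ball around $x$ has norm at most $\epsilon$.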

Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning

no code implementations20 Jun 2024 Lynn Chua, Badih Ghazi, Yangsibo Huang, Pritish Kamath, Ravi Kumar, Daogao Liu, Pasin Manurangsi, Amer Sinha, Chiyuan Zhang

Large language models (LLMs) have emerged as powerful tools for tackling complex tasks across diverse domains, but they also raise privacy concerns when fine-tuned on sensitive data due to potential memorization.

Language Modeling, Language Modelling +2
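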

Private Online Learning via Lazy Algorithms

no code implementations5 Jun 2024 Hilal Asi, Tomer Koren, Daogao Liu, Kunal Talwar

We propose a new transformation that converts lazy online learning algorithms into private algorithms.

Private Stochastic Convex Optimization with Heavy Tails: Near-Optimality from Simple Reductions

no code implementations4 Jun 2024 Hilal Asi, Daogao Liu, Kevin Tian

We study the problem of differentially private stochastic convex optimization (DP-SCO) with heavy-tailed gradients, where we assume a $k^{\text{th}}$-moment bound on the Lipschitz constants of sample functions rather than a uniform bound.
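
Concretely, the heavy-tailed assumption replaces the usual worst-case bound $\sup_s L(s) \le L$ with a moment bound of the form (a sketch of the standard $k^{\text{th}}$-moment condition; the paper's exact normalization may differ)

$\mathbb{E}_{s}\big[L(s)^k\big]^{1/k} \le L$,

where $L(s)$ denotes the Lipschitz constant of the sample function $f(\cdot\,; s)$.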

Private Gradient Descent for Linear Regression: Tighter Error Bounds and Instance-Specific Uncertainty Estimation

no code implementations21 Feb 2024 Gavin Brown, Krishnamurthy Dvijotham, Georgina Evans, Daogao Liu, Adam Smith, Abhradeep Thakurta

We provide an improved analysis of standard differentially private gradient descent for linear regression under the squared error loss.
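
A minimal sketch of the standard DP gradient descent template for squared-error linear regression (per-example clipping plus Gaussian noise); the function name and parameter choices are illustrative, and this is not the paper's exact algorithm or its tighter analysis:

import numpy as np

def dp_gradient_descent(X, y, steps=100, lr=0.1, clip=1.0, noise_std=1.0, seed=0):
    # Hedged sketch: noisy gradient descent (no projection) for the loss 0.5 * (x_i^T theta - y_i)^2.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(steps):
        residuals = X @ theta - y                      # shape (n,)
        per_example_grads = residuals[:, None] * X     # one gradient row per example
        norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
        clipped = per_example_grads / np.maximum(1.0, norms / clip)   # clip each row to norm <= clip
        noisy_sum = clipped.sum(axis=0) + rng.normal(0.0, noise_std * clip, size=d)
        theta -= lr * noisy_sum / n                    # step along the averaged noisy gradient
    return theta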

User-level Differentially Private Stochastic Convex Optimization: Efficient Algorithms with Optimal Rates

no code implementations7 Nov 2023 Hilal Asi, Daogao Liu

We study differentially private stochastic convex optimization (DP-SCO) under user-level privacy, where each user may hold multiple data items.

Detecting Pretraining Data from Large Language Models

1 code implementation25 Oct 2023 Weijia Shi, Anirudh Ajith, Mengzhou Xia, Yangsibo Huang, Daogao Liu, Terra Blevins, Danqi Chen, Luke Zettlemoyer

Min-K% Prob can be applied without any knowledge about the pretraining corpus or any additional training, departing from previous detection methods that require training a reference model on data that is similar to the pretraining data.

Machine Unlearning
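
A minimal sketch of the Min-K% Prob score described in the entry above, assuming per-token log-probabilities from the target model are already available (the helper name and default k are illustrative):

import numpy as np

def min_k_percent_score(token_log_probs, k=0.2):
    # Average log-probability of the k fraction of tokens the model finds least likely;
    # a higher score suggests the text was more likely seen during pretraining.
    log_probs = np.sort(np.asarray(token_log_probs))   # ascending: least likely tokens first
    n_lowest = max(1, int(len(log_probs) * k))
    return float(log_probs[:n_lowest].mean())

In use, a text is flagged as pretraining data when this score exceeds a threshold tuned on held-out examples; as the abstract notes, no reference model or additional training is required.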

Learning across Data Owners with Joint Differential Privacy

no code implementations25 May 2023 Yangsibo Huang, Haotian Jiang, Daogao Liu, Mohammad Mahdian, Jieming Mao, Vahab Mirrokni

In this paper, we study the setting in which data owners train machine learning models collaboratively under a privacy notion called joint differential privacy [Kearns et al., 2018].

Multi-class Classification

Algorithmic Aspects of the Log-Laplace Transform and a Non-Euclidean Proximal Sampler

no code implementations13 Feb 2023 Sivakanth Gopi, Yin Tat Lee, Daogao Liu, Ruoqi Shen, Kevin Tian

The development of efficient sampling algorithms catering to non-Euclidean geometries has been a challenging endeavor, as discretization techniques which succeed in the Euclidean setting do not readily carry over to more general settings.

ReSQueing Parallel and Private Stochastic Convex Optimization

no code implementations1 Jan 2023 Yair Carmon, Arun Jambulapati, Yujia Jin, Yin Tat Lee, Daogao Liu, Aaron Sidford, Kevin Tian

We give a parallel algorithm obtaining optimization error $\epsilon_{\text{opt}}$ with $d^{1/3}\epsilon_{\text{opt}}^{-2/3}$ gradient oracle query depth and $d^{1/3}\epsilon_{\text{opt}}^{-2/3} + \epsilon_{\text{opt}}^{-2}$ gradient queries in total, assuming access to a bounded-variance stochastic gradient estimator.

Augmentation with Projection: Towards an Effective and Efficient Data Augmentation Paradigm for Distillation

1 code implementation21 Oct 2022 Ziqi Wang, Yuexin Wu, Frederick Liu, Daogao Liu, Le Hou, Hongkun Yu, Jing Li, Heng Ji

However, these data augmentation methods either potentially cause shifts in decision boundaries (representation interpolation), are not expressive enough (token replacement), or introduce too much computational overhead (augmentation with models).

Data Augmentation, Diversity +1

Private Convex Optimization in General Norms

no code implementations18 Jul 2022 Sivakanth Gopi, Yin Tat Lee, Daogao Liu, Ruoqi Shen, Kevin Tian

We propose a new framework for differentially private optimization of convex functions which are Lipschitz in an arbitrary norm $\|\cdot\|$.

Private Convex Optimization via Exponential Mechanism

no code implementations1 Mar 2022 Sivakanth Gopi, Yin Tat Lee, Daogao Liu

Furthermore, we show how to implement this mechanism using $\widetilde{O}(n \min(d, n))$ queries to $f_i(x)$ for DP-SCO, where $n$ is the number of samples/users and $d$ is the ambient dimension.
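
For reference, the exponential mechanism here samples the output from a density proportional to a scaled negative empirical loss; a hedged sketch of the general template, with the exact scaling of $k$ by the privacy parameters omitted:

$\pi(x) \propto \exp(-k F(x))$, where $F(x) = \frac{1}{n}\sum_{i=1}^{n} f_i(x)$.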

Better Private Algorithms for Correlation Clustering

no code implementations22 Feb 2022 Daogao Liu

In machine learning, correlation clustering is an important problem whose goal is to partition the individuals into groups that correlate with their pairwise similarities as much as possible.

Clustering
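
For reference, a standard (min-disagreement) formulation of the correlation clustering objective from the entry above, which may differ from the exact variant studied in the paper: given pairs labeled similar ($E^{+}$) or dissimilar ($E^{-}$), find a partition $c$ minimizing

$\sum_{(u,v) \in E^{+}} \mathbf{1}[c(u) \neq c(v)] + \sum_{(u,v) \in E^{-}} \mathbf{1}[c(u) = c(v)]$.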

Private Non-smooth ERM and SCO in Subquadratic Steps

no code implementations NeurIPS 2021 Janardhan Kulkarni, Yin Tat Lee, Daogao Liu

We study the differentially private Empirical Risk Minimization (ERM) and Stochastic Convex Optimization (SCO) problems for non-smooth convex functions.

Tight lower bounds for Differentially Private ERM

no code implementations29 Sep 2021 Daogao Liu, Zhou Lu

We consider the lower bounds of differentially private ERM for general convex functions.

The Convergence Rate of SGD's Final Iterate: Analysis on Dimension Dependence

no code implementations28 Jun 2021 Daogao Liu, Zhou Lu

The best known lower bounds, however, are worse than the upper bounds by a factor of $\log T$.

Open-Ended Question Answering

The Power of Sampling: Dimension-free Risk Bounds in Private ERM

no code implementations28 May 2021 Yin Tat Lee, Daogao Liu, Zhou Lu

We further construct lower bounds, demonstrating that when gradients are full-rank, there is no separation between the constrained and unconstrained settings.

Private Non-smooth Empirical Risk Minimization and Stochastic Convex Optimization in Subquadratic Steps

no code implementations29 Mar 2021 Janardhan Kulkarni, Yin Tat Lee, Daogao Liu

More precisely, our differentially private algorithm requires $O(\frac{N^{3/2}}{d^{1/8}}+ \frac{N^2}{d})$ gradient queries for optimal excess empirical risk, which is achieved with the help of subsampling and smoothing the function via convolution.
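
The smoothing step referred to here is typically convolution with a uniform ball (randomized smoothing); a hedged sketch of the standard construction, with the radius $r$ set by the analysis:

$\hat{f}(x) = \mathbb{E}_{u \sim \mathcal{U}(B_r)}\big[f(x + u)\big]$,

which is smooth (with smoothness depending on $r$ and the Lipschitz constant) even when $f$ itself is non-smooth.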
