no code implementations • 2 Mar 2025 • Zhiqi Bu, Ruixuan Liu
Differential privacy (DP) is a privacy-preserving paradigm that protects the training data of deep learning models.
no code implementations • 20 Feb 2025 • Anil Ramakrishna, Yixin Wan, Xiaomeng Jin, Kai-Wei Chang, Zhiqi Bu, Bhanukiran Vinzamuri, Volkan Cevher, Mingyi Hong, Rahul Gupta
Unlearning aims to remove copyrighted, sensitive, or private content from large language models (LLMs) without full retraining.
no code implementations • 12 Jan 2025 • Shiyun Xu, Zhiqi Bu, Yiliang Zhang, Ian Barnett
Differential learning rate (DLR), a technique that applies different learning rates to different model parameters, has been widely used in deep learning and achieved empirical success via its various forms.
no code implementations • 29 Oct 2024 • Zhiqi Bu, Xiaomeng Jin, Bhanukiran Vinzamuri, Anil Ramakrishna, Kai-Wei Chang, Volkan Cevher, Mingyi Hong
Machine unlearning has been used to remove unwanted knowledge acquired by large language models (LLMs).
no code implementations • 4 Oct 2024 • Xinwei Zhang, Zhiqi Bu, Borja Balle, Mingyi Hong, Meisam Razaviyayn, Vahab Mirrokni
This approach led to the development of DP optimizers that achieve performance comparable to their non-private counterparts in fine-tuning tasks or in tasks with a small number of training parameters.
no code implementations • 24 Aug 2024 • Xinwei Zhang, Zhiqi Bu, Mingyi Hong, Meisam Razaviyayn
More specifically, by defining the "frequency domain" for both the gradient and differential privacy (DP) noise, we have developed a new component, called DOPPLER.
no code implementations • 3 Jul 2024 • Zhiqi Bu, Shiyun Xu
We propose the generalized Newton's method (GeN) -- a Hessian-informed approach that applies to any optimizer such as SGD and Adam, and covers the Newton-Raphson method as a sub-case.
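The Newton-Raphson idea behind a Hessian-informed learning rate can be sketched with a one-dimensional quadratic fit of the loss along the update direction: three loss evaluations give the directional first and second derivatives, whose ratio is the minimizing step size. This is a minimal illustration of the principle, not the paper's implementation; the function name and finite-difference scheme are illustrative assumptions.

```python
import numpy as np

def hessian_informed_lr(loss, theta, d, h=1e-3):
    """Fit a quadratic to loss(theta - eta * d) via three evaluations
    and return the step size eta that minimizes it (illustrative sketch)."""
    l0 = loss(theta)
    lp = loss(theta + h * d)
    lm = loss(theta - h * d)
    g_dot_d = (lp - lm) / (2 * h)          # directional derivative g.d
    d_H_d = (lp - 2 * l0 + lm) / (h ** 2)  # directional curvature d.H.d
    # eta* = (g.d) / (d.H.d) when the curvature is positive
    return g_dot_d / d_H_d if d_H_d > 0 else None
```

On a quadratic loss this recovers the exact Newton step; for SGD or Adam one would plug in the optimizer's own update direction as `d`.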
1 code implementation • 11 Jun 2024 • Lu Li, Tianyu Zhang, Zhiqi Bu, Suyuchen Wang, Huan He, Jie Fu, Yonghui Wu, Jiang Bian, Yong Chen, Yoshua Bengio
MAP efficiently identifies a Pareto set of scaling coefficients for merging multiple models, reflecting the trade-offs involved.
2 code implementations • 28 Feb 2024 • Zhiqi Bu, Xinwei Zhang, Mingyi Hong, Sheng Zha, George Karypis
The superior performance of large foundation models relies on the use of massive amounts of high-quality data, which often contain sensitive, private and copyrighted material that requires formal protection.
1 code implementation • 24 Nov 2023 • Xinwei Zhang, Zhiqi Bu, Zhiwei Steven Wu, Mingyi Hong
In our work, we propose a new error-feedback (EF) DP algorithm as an alternative to DPSGD-GC, which not only offers a diminishing utility bound without inducing a constant clipping bias but, more importantly, allows for an arbitrary choice of clipping threshold that is independent of the problem.
no code implementations • 20 Nov 2023 • Zhiqi Bu, Justin Chiu, Ruixuan Liu, Sheng Zha, George Karypis
Deep learning using large models has achieved great success in a wide range of domains.
no code implementations • 30 Oct 2023 • Zhiqi Bu, Ruixuan Liu, Yu-Xiang Wang, Sheng Zha, George Karypis
Recent advances have substantially improved the accuracy, memory cost, and training speed of differentially private (DP) deep learning, especially on large vision and language models with millions to billions of parameters.
no code implementations • 23 Oct 2023 • Yingyu Lin, Yi-An Ma, Yu-Xiang Wang, Rachel Redberg, Zhiqi Bu
Posterior sampling, i.e., the exponential mechanism applied to sampling from the posterior distribution, provides $\varepsilon$-pure differential privacy (DP) guarantees and does not suffer from the potentially unbounded privacy breach introduced by $(\varepsilon,\delta)$-approximate DP.
no code implementations • 2 Oct 2023 • Ruixuan Liu, Zhiqi Bu, Yu-Xiang Wang, Sheng Zha, George Karypis
The success of large neural networks is crucially determined by the availability of data.
no code implementations • 2 May 2023 • Zhiqi Bu, Zongyu Dai, Yiliang Zhang, Qi Long
Multiple imputation (MI) has been widely applied to missing value problems in biomedical, social and econometric research, in order to avoid improper inference in the downstream data analysis.
1 code implementation • 23 Nov 2022 • Zongyu Dai, Zhiqi Bu, Qi Long
Single imputation methods such as matrix completion methods do not adequately account for imputation uncertainty and hence would yield improper statistical inference.
no code implementations • 16 Nov 2022 • Yuan Zhang, Zhiqi Bu
Machine learning models have excelled in a variety of domains and attracted increasing attention from both the security and the privacy communities.
no code implementations • 9 Nov 2022 • Zhiqi Bu
Adversarial perturbation plays a significant role in the field of adversarial robustness, which solves a maximization problem over the input data.
2 code implementations • 30 Sep 2022 • Zhiqi Bu, Yu-Xiang Wang, Sheng Zha, George Karypis
We study the problem of differentially private (DP) fine-tuning of large pre-trained models -- a recent privacy-preserving approach suitable for solving downstream tasks with sensitive data.
2 code implementations • 30 Sep 2022 • Zhiqi Bu, Yu-Xiang Wang, Sheng Zha, George Karypis
Our implementation achieves state-of-the-art (SOTA) accuracy with very small extra cost: on GPT2 and at almost the same memory cost (<1% overhead), BK has 1.03X the time complexity of the standard training (0.83X training speed in practice), and 0.61X the time complexity of the most efficient DP implementation (1.36X training speed in practice).
no code implementations • 1 Jul 2022 • Changgee Chang, Zhiqi Bu, Qi Long
We provide a theoretical investigation of the asymptotic properties of the proposed method for statistical inference as well as differential privacy, and evaluate its performance in simulations and real data analyses in comparison with several recently developed methods.
1 code implementation • 21 May 2022 • Zhiqi Bu, Jialin Mao, Shiyun Xu
Large convolutional neural networks (CNN) can be difficult to train in the differentially private (DP) regime, since the optimization algorithms require a computationally expensive operation known as per-sample gradient clipping.
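Per-sample gradient clipping, the core operation of DP-SGD, bounds each example's gradient norm before aggregation and noise addition. The following is a minimal numpy sketch of that operation (names and the sum-then-noise convention are illustrative, not this paper's implementation):

```python
import numpy as np

def dp_sgd_step(per_sample_grads, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """Clip each per-example gradient to norm <= clip_norm, sum, add Gaussian noise."""
    rng = np.random.default_rng() if rng is None else rng
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    # rescale any gradient whose norm exceeds the clipping threshold
    clipped = per_sample_grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    noise = rng.normal(0.0, noise_multiplier * clip_norm,
                       size=per_sample_grads.shape[1])
    return clipped.sum(axis=0) + noise  # divide by batch size for a noisy mean
```

The cost referenced in the abstract comes from materializing `per_sample_grads`: unlike ordinary SGD, the per-example gradients cannot simply be averaged inside the backward pass before clipping.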
no code implementations • 25 Feb 2022 • Shiyun Xu, Zhiqi Bu, Pratik Chaudhari, Ian J. Barnett
In order to empower NAM with feature selection and improve generalization, we propose the sparse neural additive models (SNAM) that employ group sparsity regularization (e.g., Group LASSO), where each feature is learned by a sub-network whose trainable parameters are clustered as a group.
no code implementations • 21 Dec 2021 • Zongyu Dai, Zhiqi Bu, Qi Long
Missing data are present in most real-world problems and need careful handling to preserve the prediction accuracy and statistical consistency in the downstream analysis.
no code implementations • 29 Sep 2021 • Zhiqi Bu, Ping Li, Weijie Zhao
In this work, we propose the practical adversarial training with differential privacy (DP-Adv), to combine the backbones from both communities and deliver robust and private models with high accuracy.
no code implementations • 18 Jul 2021 • Qiyiwen Zhang, Zhiqi Bu, Kan Chen, Qi Long
Interestingly, we show a new equivalence between DP-SGD and DP-SGLD, implying that some non-Bayesian DP training naturally allows for uncertainty quantification.
no code implementations • 20 Jun 2021 • Matteo Sordello, Zhiqi Bu, Jinshuo Dong
We then analyze the online setting and provide a faster decaying scheme for the magnitude of the injected noise that also guarantees the convergence of privacy loss.
1 code implementation • 17 Jun 2021 • Harsha Nori, Rich Caruana, Zhiqi Bu, Judy Hanwen Shen, Janardhan Kulkarni
We show that adding differential privacy to Explainable Boosting Machines (EBMs), a recent method for training interpretable ML models, yields state-of-the-art accuracy while protecting privacy.
1 code implementation • 15 Jun 2021 • Zhiqi Bu, Hua Wang, Zongyu Dai, Qi Long
Differentially private (DP) training preserves data privacy, usually at the cost of slower convergence (and thus lower accuracy), as well as more severe mis-calibration than its non-private counterpart.
1 code implementation • 27 May 2021 • Zhiqi Bu, Jason Klusowski, Cynthia Rush, Weijie J. Su
Sorted l1 regularization has been incorporated into many methods for solving high-dimensional statistical estimation problems, including the SLOPE estimator in linear regression.
1 code implementation • 14 Feb 2021 • Yiliang Zhang, Zhiqi Bu
In this paper, we propose two efficient algorithms to design the possibly high-dimensional SLOPE penalty, in order to minimize the mean squared error.
no code implementations • NeurIPS 2021 • Zhiqi Bu, Sivakanth Gopi, Janardhan Kulkarni, Yin Tat Lee, Judy Hanwen Shen, Uthaipon Tantipongpipat
Unlike previous attempts to make DP-SGD faster, which work only on a subset of network architectures or rely on compiler techniques, we propose an algorithmic solution that works for any network in a black-box manner, which is the main contribution of this paper.
no code implementations • 1 Jan 2021 • Zhiqi Bu, Sivakanth Gopi, Janardhan Kulkarni, Yin Tat Lee, Uthaipon Tantipongpipat
Differentially Private-SGD (DP-SGD) of Abadi et al. (2016) and its variations are the only known algorithms for private training of large scale neural networks.
1 code implementation • 1 Nov 2020 • Shiyun Xu, Zhiqi Bu
Recent years have witnessed strong empirical performance of over-parameterized neural networks on various tasks and many advances in the theory, e.g., universal approximation and provable convergence to the global minimum.
1 code implementation • 25 Oct 2020 • Zhiqi Bu, Shiyun Xu, Kan Chen
When equipped with efficient optimization algorithms, over-parameterized neural networks have demonstrated a high level of performance even though the loss function is non-convex and non-smooth.
2 code implementations • NeurIPS 2020 • Hua Wang, Yachong Yang, Zhiqi Bu, Weijie J. Su
A fundamental problem in high-dimensional regression is to understand the tradeoff between type I and type II errors or, equivalently, false discovery rate (FDR) and power in variable selection.
3 code implementations • 26 Nov 2019 • Zhiqi Bu, Jinshuo Dong, Qi Long, Weijie J. Su
Leveraging the appealing properties of $f$-differential privacy in handling composition and subsampling, this paper derives analytically tractable expressions for the privacy guarantees of both stochastic gradient descent and Adam used in training deep neural networks, without the need to develop the sophisticated techniques of [3].
1 code implementation • NeurIPS 2019 • Zhiqi Bu, Jason Klusowski, Cynthia Rush, Weijie Su
SLOPE is a relatively new convex optimization procedure for high-dimensional linear regression via the sorted l1 penalty: the larger the rank of the fitted coefficient, the larger the penalty.
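The sorted l1 penalty described here pairs the largest fitted coefficient (in magnitude) with the largest regularization weight. A minimal sketch of the penalty value, assuming a nonincreasing weight sequence (function name illustrative):

```python
import numpy as np

def slope_penalty(beta, lam):
    """Sorted l1 penalty: sum_i lam_i * |beta|_(i), where |beta|_(1) >= |beta|_(2) >= ...
    lam must be sorted in nonincreasing order."""
    abs_sorted = np.sort(np.abs(beta))[::-1]  # magnitudes, largest first
    return float(np.dot(np.asarray(lam, dtype=float), abs_sorted))
```

With all lambdas equal this reduces to the ordinary LASSO penalty; a strictly decreasing sequence is what gives SLOPE its rank-dependent shrinkage.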