Dynamic Differential-Privacy Preserving SGD
The vanilla Differentially-Private Stochastic Gradient Descent (DP-SGD), including DP-Adam and other variants, ensures the privacy of training data by uniformly distributing privacy costs across training steps. The equivalent privacy costs controlled by maintaining the same gradient clipping thresholds and noise powers in each step result in unstable updates and a lower model accuracy when compared to the non-DP counterpart. In this paper, we propose the dynamic DP-SGD (along with dynamic DP-Adam, and others) to reduce the performance loss gap while maintaining privacy by dynamically adjusting clipping thresholds and noise powers while adhering to a total privacy budget constraint. Extensive experiments on a variety of deep learning tasks, including image classification, natural language processing, and federated learning, demonstrate that the proposed dynamic DP-SGD algorithm stabilizes updates and, as a result, significantly improves model accuracy in the strong privacy protection region when compared to the vanilla DP-SGD. We also conduct theoretical analysis to better understand the privacy-utility trade-off with dynamic DP-SGD, as well as to learn why Dynamic DP-SGD can outperform vanilla DP-SGD.
PDF Abstract