Quantifying and Localizing Usable Information Leakage from Neural Network Gradients

28 May 2021 · Fan Mo, Anastasia Borovykh, Mohammad Malekzadeh, Soteris Demetriou, Deniz Gündüz, Hamed Haddadi ·

In collaborative learning, clients keep their data private and communicate only the computed gradients of the deep neural network being trained on their local data. Several recent attacks show that one can still extract private information from the shared network's gradients compromising clients' privacy. In this paper, to quantify the private information leakage from gradients we adopt usable information theory. We focus on two types of private information: original information in data reconstruction attacks and latent information in attribute inference attacks. Furthermore, a sensitivity analysis over the gradients is performed to explore the underlying cause of information leakage and validate the results of the proposed framework. Finally, we conduct numerical evaluations on six benchmark datasets and four well-known deep models. We measure the impact of training hyperparameters, e.g., batches and epochs, as well as potential defense mechanisms, e.g., dropout and differential privacy. Our proposed framework enables clients to localize and quantify the private information leakage in a layer-wise manner, and enables a better understanding of the sources of information leakage in collaborative learning, which can be used by future studies to benchmark new attacks and defense mechanisms.

PDF Abstract