Secure Summation via Subset Sums: A New Primitive for Privacy-Preserving Distributed Machine Learning

27 Jun 2019 · Valentin Hartmann, Robert West ·

For population studies or for the training of complex machine learning models, it is often required to gather data from different actors. In these applications, summation is an important primitive: for computing means, counts or mini-batch gradients. In many cases, the data is privacy-sensitive and therefore cannot be collected on a central server. Hence the summation needs to be performed in a distributed and privacy-preserving way. Existing solutions for distributed summation with computational privacy guarantees make trust or connection assumptions - e.g., the existence of a trusted server or peer-to-peer connections between clients - that might not be fulfilled in real world settings. Motivated by these challenges, we propose Secure Summation via Subset Sums (S5), a method for distributed summation that works in the presence of a malicious server and only two honest clients, and without the need for peer-to-peer connections between clients. S5 adds zero-sum noise to clients' messages and shuffles them before sending them to the aggregating server. Our main contribution is a proof that this scheme yields a computational privacy guarantee based on the multidimensional subset sum problem. Our analysis of this problem may be of independent interest for other privacy and cryptography applications.

PDF Abstract