6 Dec 2023 • Haichao Sha, Ruixuan Liu, Yixuan Liu, Hong Chen
We prove that pre-projection enhances the convergence of DP-SGD by reducing the dependence of the clipping error and bias to a fraction of the top gradient eigenspace, and, in theory, limiting cross-client variance to improve convergence under heterogeneous federation.
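The pre-projection idea can be sketched as follows. This is an illustrative implementation only, not the authors' code: in particular, estimating the top-k eigenspace from the current batch's per-sample gradients via SVD is an assumption made here for self-containedness (function name and all parameters are hypothetical).

```python
import numpy as np

def pre_projection_dpsgd_step(per_sample_grads, k, clip_norm, noise_mult, rng):
    """One DP-SGD step with pre-projection onto a top-k gradient eigenspace (sketch).

    per_sample_grads: (n, d) array of per-example gradients.
    Returns a (d,) noisy averaged update direction.
    """
    n = per_sample_grads.shape[0]
    # Assumption: approximate the top-k gradient eigenspace with the top-k
    # right singular vectors of the batch gradient matrix.
    _, _, vt = np.linalg.svd(per_sample_grads, full_matrices=False)
    basis = vt[:k]                              # (k, d) orthonormal rows
    # Project each gradient into the k-dimensional subspace BEFORE clipping,
    # so the clipping error depends only on this fraction of the eigenspace.
    projected = per_sample_grads @ basis.T      # (n, k)
    # Standard per-sample clipping to clip_norm.
    norms = np.linalg.norm(projected, axis=1, keepdims=True)
    factors = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = projected * factors
    # Average and add Gaussian noise calibrated to the clipping bound,
    # then map the update back to the original d-dimensional space.
    noisy = clipped.mean(axis=0) + rng.normal(
        0.0, noise_mult * clip_norm / n, size=k)
    return noisy @ basis                        # (d,)
```

A usage sketch: with `g = rng.normal(size=(8, 10))`, calling `pre_projection_dpsgd_step(g, k=3, clip_norm=1.0, noise_mult=1.0, rng=rng)` returns a length-10 update whose noise was added in only 3 dimensions. In a federated variant, clients would share the projection basis so that cross-client variance is confined to the same subspace.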