Byzantine-Robust Variance-Reduced Federated Learning over Distributed Non-i.i.d. Data

17 Sep 2020 · Jie Peng, Zhaoxian Wu, Qing Ling, Tianyi Chen

We consider the federated learning problem where data on workers are not independent and identically distributed (i.i.d.). During the learning process, an unknown number of Byzantine workers may send malicious messages to the central node, leading to significant learning error. Most Byzantine-robust methods address this issue by using robust aggregation rules to aggregate the received messages, but they rely on the assumption that all the regular workers have i.i.d. data, which is not the case in many federated learning applications. In light of the significance of reducing stochastic gradient noise for mitigating the effect of Byzantine attacks, we use a resampling strategy to reduce the impact of both inner variation (which describes the sample heterogeneity on every regular worker) and outer variation (which describes the sample heterogeneity among the regular workers), along with a stochastic average gradient algorithm to gradually eliminate the inner variation. The variance-reduced messages are then aggregated with a robust geometric median operator. We prove that the proposed method reaches a neighborhood of the optimal solution at a linear convergence rate and that the learning error is determined by the number of Byzantine workers. Numerical experiments corroborate the theoretical results and show that the proposed method outperforms state-of-the-art methods in the non-i.i.d. setting.
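
To illustrate the aggregation step mentioned in the abstract, the following is a minimal sketch of a geometric median operator compared against a plain mean. It is not the authors' implementation: the solver (Weiszfeld's iteration), the function names `geometric_median` and `mean`, the parameters `iters` and `eps`, and the toy worker messages are all assumptions made for illustration. The resampling and stochastic average gradient steps are not shown.

```python
import math

def geometric_median(points, iters=200, eps=1e-9):
    """Approximate the geometric median of equal-length vectors.

    Uses Weiszfeld's iteration, an assumed solver choice; the abstract
    only states that a geometric median operator is used.
    """
    dim = len(points[0])
    # Start from the coordinate-wise mean.
    z = [sum(p[k] for p in points) / len(points) for k in range(dim)]
    for _ in range(iters):
        # Weight each point by the inverse of its distance to the estimate;
        # eps guards against division by zero when z hits a data point.
        weights = [1.0 / max(math.dist(p, z), eps) for p in points]
        total = sum(weights)
        z_new = [
            sum(w * p[k] for w, p in zip(weights, points)) / total
            for k in range(dim)
        ]
        if math.dist(z_new, z) < eps:
            return z_new
        z = z_new
    return z

def mean(points):
    """Coordinate-wise mean, shown only as a non-robust baseline."""
    dim = len(points[0])
    return [sum(p[k] for p in points) / len(points) for k in range(dim)]

# Hypothetical messages: four regular workers near (1, 1), one Byzantine outlier.
regular = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.05]]
byzantine = [[100.0, -100.0]]
messages = regular + byzantine

robust = geometric_median(messages)
naive = mean(messages)
print("geometric median:", robust)
print("mean:", naive)
```

In this toy setting the single outlier drags the mean far from the regular workers' messages, while the geometric median stays close to them. This is the sense in which the operator is robust as long as Byzantine workers are a minority.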
