Towards Understanding Quality Challenges of the Federated Learning for Neural Networks: A First Look from the Lens of Robustness

5 Jan 2022  ·  Amin Eslami Abyane, Derui Zhu, Roberto Souza, Lei Ma, Hadi Hemmati

Federated learning (FL) is a distributed learning paradigm that preserves users' data privacy while leveraging the datasets of all participants. In FL, multiple models are trained independently on the clients and aggregated centrally to update a global model in an iterative process. Although this approach is excellent at preserving privacy, FL still suffers from quality issues such as attacks and Byzantine faults. Recent attempts have been made to address such quality challenges through robust aggregation techniques for FL. However, the effectiveness of state-of-the-art (SOTA) robust FL techniques remains unclear and lacks a comprehensive study. Therefore, to better understand the current quality status and challenges of these SOTA FL techniques in the presence of attacks and faults, we perform a large-scale empirical study that investigates SOTA FL's quality from multiple angles: attacks, simulated faults (via mutation operators), and aggregation (defense) methods. In particular, we study FL's performance on image classification tasks, using DNNs as our model type. We conduct our study on two generic image datasets and one real-world federated medical image dataset. We also investigate how the proportion of affected clients and the dataset distribution affect the robustness of FL. After a large-scale analysis covering 496 configurations, we find that most mutators applied to individual clients have a negligible effect on the final model for the generic datasets, and only one of them is effective in the medical dataset. Furthermore, we show that model poisoning attacks are more effective than data poisoning attacks. Moreover, choosing the most robust FL aggregator depends on the attacks and datasets. Finally, we illustrate that a simple ensemble of aggregators achieves a more robust solution than any single aggregator and is the best choice in 75% of the cases.
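To make the central aggregation step and the ensemble-of-aggregators idea concrete, here is a minimal Python/NumPy sketch of one FL round. It is not the paper's implementation: the aggregators shown (federated averaging, coordinate-wise median, trimmed mean) are standard examples, and the ensemble scheme and all helper names are illustrative assumptions.

```python
# Minimal sketch of FL aggregation over flat client parameter vectors.
# Assumption: this is illustrative code, not the paper's exact method.
import numpy as np

def fedavg(client_updates):
    """Plain federated averaging: coordinate-wise mean of client updates."""
    return np.mean(np.stack(client_updates), axis=0)

def coordinate_median(client_updates):
    """Coordinate-wise median, a common Byzantine-robust aggregator."""
    return np.median(np.stack(client_updates), axis=0)

def trimmed_mean(client_updates, trim_ratio=0.1):
    """Trimmed mean: drop the largest/smallest values per coordinate, then average."""
    sorted_updates = np.sort(np.stack(client_updates), axis=0)
    k = int(len(client_updates) * trim_ratio)
    return sorted_updates[k:len(client_updates) - k].mean(axis=0)

def ensemble_aggregate(client_updates, aggregators):
    """One illustrative ensemble: run each aggregator and take the coordinate-wise
    median of their outputs (a hypothetical scheme, not necessarily the paper's)."""
    candidates = [agg(client_updates) for agg in aggregators]
    return np.median(np.stack(candidates), axis=0)

# Usage: simulate one round with nine benign clients and one faulty/poisoned one.
rng = np.random.default_rng(0)
benign = [rng.normal(0.0, 0.1, size=10) for _ in range(9)]
faulty = [np.full(10, 100.0)]  # e.g., an extreme model-poisoning update
updates = benign + faulty

global_update = ensemble_aggregate(updates, [fedavg, coordinate_median, trimmed_mean])
```

In this toy round, plain `fedavg` is pulled far off by the single poisoned update, while the median-based aggregators, and hence the ensemble, stay close to the benign clients' mean, which is the kind of robustness difference the study measures at scale.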

