Demystifying Hyperparameter Optimization in Federated Learning

29 Sep 2021  ·  Syed Zawad, Jun Yi, Minjia Zhang, Cheng Li, Feng Yan, Yuxiong He

Federated Learning (FL) is a new machine learning paradigm that enables training models collaboratively across clients without sharing private data. In FL, data is non-uniformly distributed among clients (i.e., data heterogeneity) and cannot be balanced or monitored as in conventional ML. Such data heterogeneity and privacy requirements bring unique challenges to hyperparameter optimization: training dynamics change across clients even within the same training round, and they are difficult to measure due to privacy constraints. State-of-the-art FL frameworks focus on developing better aggregation algorithms and policies to mitigate these challenges. However, almost all existing FL systems adopt a "global" tuning method that uses a single set of learning hyperparameters across all clients, regardless of their underlying data distributions. Our study shows that this widely adopted global tuning method is not suitable for FL because it is oblivious to data heterogeneity. We demonstrate that the data quantity and distribution of each client have a significant impact on the choice of hyperparameters, making customized tuning for each client necessary. Based on these observations, we propose a first-of-its-kind heterogeneity-aware hyperparameter optimization methodology, FedTune, which adopts a proxy-data-based hyperparameter customization approach to address the privacy and tuning-cost challenges. Together with a Bayesian-strengthened tuner, the proposed customized tuning approach is effective, lightweight, and privacy preserving. Extensive evaluation demonstrates that FedTune achieves up to 7%, 4%, 4%, and 6% better accuracy than the widely adopted globally tuned method on the popular FL benchmarks FEMNIST, CIFAR-100, CIFAR-10, and Fashion-MNIST, respectively.
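
The core idea in the abstract is that each client tunes its own learning hyperparameters (e.g., local learning rate) against a small proxy dataset that reflects its data characteristics, rather than inheriting one globally tuned setting. The sketch below is a minimal illustration of such per-client, proxy-data-based tuning; it is not the paper's implementation. The synthetic proxy data, the logistic-regression model, the candidate learning-rate grid, and the use of simple search in place of the paper's Bayesian-strengthened tuner are all assumptions made for illustration.

```python
# Minimal sketch of per-client, proxy-data-based hyperparameter tuning.
# Illustrative only: synthetic data and plain search stand in for the
# paper's proxy datasets and Bayesian-strengthened tuner.
import numpy as np

rng = np.random.default_rng(0)

def make_proxy_data(n, skew):
    """Synthetic binary-classification proxy set; `skew` mimics heterogeneity."""
    X = rng.normal(0, 1, size=(n, 5))
    w_true = np.array([1.0, -2.0, 0.5, 0.0, skew])
    y = (X @ w_true + rng.normal(0, 0.5, n) > 0).astype(float)
    return X, y

def proxy_loss(X, y, lr, epochs=20):
    """Train logistic regression with SGD at rate `lr`; return log loss on the proxy set."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            p = 1.0 / (1.0 + np.exp(-X[i] @ w))
            w -= lr * (p - y[i]) * X[i]
    p = 1.0 / (1.0 + np.exp(-X @ w))
    eps = 1e-9
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

# Heterogeneous clients: different proxy sizes and skews stand in for non-IID data.
clients = {f"client_{k}": make_proxy_data(n=50 * (k + 1), skew=2.0 * k) for k in range(4)}
candidate_lrs = [0.001, 0.01, 0.05, 0.1, 0.5]

# Customized tuning: each candidate is evaluated on that client's proxy data only,
# so no private client data is needed on the server.
per_client_lr = {}
for name, (X, y) in clients.items():
    losses = {lr: proxy_loss(X, y, lr) for lr in candidate_lrs}
    per_client_lr[name] = min(losses, key=losses.get)

print("customized learning rates:", per_client_lr)
```

A globally tuned baseline would instead pick the single learning rate minimizing the average proxy loss over all clients; the gap between that choice and the per-client choices is what motivates heterogeneity-aware tuning.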
