Sparse Mixture-of-Experts are Domain Generalizable Learners

8 Jun 2022  ·  Bo Li, Yifei Shen, Jingkang Yang, Yezhen Wang, Jiawei Ren, Tong Che, Jun Zhang, Ziwei Liu

Human visual perception can easily generalize to out-of-distribution visual data, which is far beyond the capability of modern machine learning models. Domain generalization (DG) aims to close this gap, with existing DG methods mainly focusing on the loss function design. In this paper, we propose to explore an orthogonal direction, i.e., the design of the backbone architecture. It is motivated by an empirical finding that transformer-based models trained with empirical risk minimization (ERM) outperform CNN-based models employing state-of-the-art (SOTA) DG algorithms on multiple DG datasets. We develop a formal framework to characterize a network's robustness to distribution shifts by studying its architecture's alignment with the correlations in the dataset. This analysis guides us to propose a novel DG model built upon vision transformers, namely Generalizable Mixture-of-Experts (GMoE). Extensive experiments on DomainBed demonstrate that GMoE trained with ERM outperforms SOTA DG baselines by a large margin. Moreover, GMoE is complementary to existing DG methods, and its performance is substantially improved when trained with DG algorithms.
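To make the architectural idea concrete, below is a minimal NumPy sketch of a sparse top-k mixture-of-experts feed-forward layer, the building block that MoE vision transformers such as GMoE substitute for the standard transformer FFN. This is an illustrative toy, not the paper's implementation: the class name, dimensions, and top-k routing scheme shown here are generic assumptions about how sparse MoE layers are commonly built.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class SparseMoELayer:
    """Illustrative sparse top-k mixture-of-experts FFN (not the paper's code)."""

    def __init__(self, d_model=8, d_hidden=16, n_experts=4, top_k=2):
        self.top_k = top_k
        # linear gate that scores each expert per token
        self.gate = rng.standard_normal((d_model, n_experts)) * 0.1
        # each expert is a two-layer MLP, mirroring a transformer FFN block
        self.w1 = rng.standard_normal((n_experts, d_model, d_hidden)) * 0.1
        self.w2 = rng.standard_normal((n_experts, d_hidden, d_model)) * 0.1

    def __call__(self, x):
        # x: (n_tokens, d_model); compute routing probabilities per token
        scores = softmax(x @ self.gate, axis=-1)        # (n_tokens, n_experts)
        topk = np.argsort(-scores, axis=-1)[:, : self.top_k]
        out = np.zeros_like(x)
        for t in range(x.shape[0]):
            sel = topk[t]
            # only the top-k experts run for each token -- the "sparse" part
            w = scores[t, sel] / scores[t, sel].sum()   # renormalize gate weights
            for weight, e in zip(w, sel):
                h = np.maximum(x[t] @ self.w1[e], 0.0)  # ReLU hidden layer
                out[t] += weight * (h @ self.w2[e])
        return out

tokens = rng.standard_normal((5, 8))
moe = SparseMoELayer()
y = moe(tokens)
print(y.shape)  # (5, 8)
```

Because each token activates only `top_k` of the `n_experts` MLPs, compute stays roughly constant as experts are added; the routing lets different experts specialize on different visual attributes, which is the property the paper's analysis connects to robustness under distribution shift.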


Results from the Paper


Ranked #17 on Domain Generalization on DomainNet (using extra training data)

Task                   Dataset         Model          Metric            Value   Global Rank
Domain Generalization  DomainNet       Hybrid-SF-MoE  Average Accuracy  52.0    #17
Domain Generalization  DomainNet       GMoE-S/16      Average Accuracy  48.7    #22
Domain Generalization  Office-Home     GMoE-S/16      Average Accuracy  74.2    #19
Domain Generalization  PACS            GMoE-S/16      Average Accuracy  88.1    #31
Domain Generalization  TerraIncognita  GMoE-S/16      Average Accuracy  48.5    #28
Domain Generalization  VLCS            GMoE-S/16      Average Accuracy  80.2    #19
