Learning Diverse Representations for Fast Adaptation to Distribution Shift

12 Jun 2020 · Daniel Pace, Alessandra Russo, Murray Shanahan

The i.i.d. assumption is a useful idealization that underpins many successful approaches to supervised machine learning. However, its violation can lead to models that learn to exploit spurious correlations in the training data, rendering them vulnerable to adversarial interventions, undermining their reliability, and limiting their practical application. To mitigate this problem, we present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task. We propose a notion of diversity based on minimizing the conditional total correlation of final layer representations across models given the label, which we approximate using a variational estimator and minimize using adversarial training. To demonstrate our framework's ability to facilitate rapid adaptation to distribution shift, we train a number of simple classifiers from scratch on the frozen outputs of our models using a small amount of data from the shifted distribution. Under this evaluation protocol, our framework significantly outperforms a baseline trained using the empirical risk minimization principle.
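The diversity objective centres on the conditional total correlation of the models' final-layer representations given the label. For reference, the standard definition of this quantity, for K models with representations Z_1, …, Z_K and label Y, is:

```latex
% Conditional total correlation: the expected KL divergence between the
% joint distribution of the K representations given the label and the
% product of their per-model conditionals.
\mathrm{TC}(Z_1, \dots, Z_K \mid Y)
  = \mathbb{E}_{p(y)} \left[
      D_{\mathrm{KL}}\!\left(
        p(z_1, \dots, z_K \mid y)
        \,\middle\|\, \prod_{k=1}^{K} p(z_k \mid y)
      \right)
    \right]
```

Driving this term toward zero encourages the representations to be conditionally independent given the label, i.e. to encode non-redundant views of the input. The abstract states that the term is approximated with a variational estimator and minimized with adversarial training; one common way to realise such an adversarial estimate is the density-ratio trick, sketched below under assumed PyTorch tooling. The discriminator architecture, the `within_class_shuffle` helper, and all names here are hypothetical illustrations, not the paper's implementation.

```python
import torch
import torch.nn as nn

class TCDiscriminator(nn.Module):
    """Hypothetical discriminator: classifies whether a concatenation
    [z_1, ..., z_K] was produced jointly (all models on the same input)
    or assembled from a within-class shuffle, which approximates the
    product of the per-model conditionals p(z_k | y)."""
    def __init__(self, total_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(total_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, z_cat: torch.Tensor) -> torch.Tensor:
        # Logit of "joint" vs "shuffled"; under the density-ratio trick
        # this logit estimates log p_joint / p_product.
        return self.net(z_cat)

def within_class_shuffle(zs, labels):
    """Independently permute each model's representations within each class,
    breaking cross-model dependence while preserving each p(z_k | y)."""
    shuffled = []
    for z in zs:
        z_perm = z.clone()
        for y in labels.unique():
            idx = (labels == y).nonzero(as_tuple=True)[0]
            perm = torch.randperm(len(idx), device=z.device)
            z_perm[idx] = z[idx[perm]]
        shuffled.append(z_perm)
    return shuffled

# Inside a training loop (sketch):
#   zs    = [model_k(x) for model_k in models]        # final-layer features
#   joint = torch.cat(zs, dim=1)
#   indep = torch.cat(within_class_shuffle(zs, y), dim=1)
# The discriminator is trained to separate `joint` from `indep`; the task
# models receive a penalty proportional to the discriminator's logit on
# `joint`, pushing their representations toward conditional independence
# given the label, i.e. toward low conditional total correlation.
```

In such a setup the K classifiers and the discriminator play an adversarial game: the discriminator sharpens its estimate of the dependence between representations, while the classifiers are pressured to remove that dependence beyond what the label already explains.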
