Systematic generalisation with group invariant predictions

We consider situations where dominant, simpler correlations with the target variable in a training set can cause an SGD-trained neural network to rely less on complex features whose correlation with the target is more persistent. When the non-persistent, simpler correlations correspond to non-semantic background factors, a neural network trained on this data can fail dramatically under systematic distributional shift, where the correlating background features are recombined with different objects. We perform an empirical study showing that group invariance methods applied across inferred partitionings of the training set can lead to significant improvements in such test-time situations. We suggest a new invariance penalty and show, with experiments on three synthetic datasets, that it can perform better than alternatives. We find that even without assuming access to any systematic-shift validation sets, one can still find improvements over an ERM-trained reference model.
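
To make the idea of a group invariance penalty concrete, the following is a minimal sketch of one generic form: the variance of per-group risks added to the average group risk. It is not the penalty proposed in the paper, which the abstract does not specify. The function names, the binary cross-entropy loss, the weight `lam`, and the toy group assignments are all assumptions made for illustration; in the setting described above the group labels would come from an inferred partitioning of the training set.

```python
import math
from collections import defaultdict


def per_example_loss(p, y):
    """Binary cross-entropy for one predicted probability p and a label y in {0, 1}."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def group_risks(probs, labels, groups):
    """Mean loss within each group (group ids are hypothetical partition labels)."""
    losses = defaultdict(list)
    for p, y, g in zip(probs, labels, groups):
        losses[g].append(per_example_loss(p, y))
    return {g: sum(v) / len(v) for g, v in losses.items()}


def penalised_objective(probs, labels, groups, lam=1.0):
    """Return (average group risk + lam * variance of group risks, variance)."""
    risks = list(group_risks(probs, labels, groups).values())
    mean_risk = sum(risks) / len(risks)
    variance = sum((r - mean_risk) ** 2 for r in risks) / len(risks)
    return mean_risk + lam * variance, variance


# Toy usage: six predictions split into two assumed groups.
probs = [0.9, 0.8, 0.2, 0.6, 0.4, 0.7]
labels = [1, 1, 0, 1, 0, 1]
groups = [0, 0, 0, 1, 1, 1]
objective, variance = penalised_objective(probs, labels, groups, lam=10.0)
print(f"objective={objective:.4f} variance={variance:.6f}")
```

The penalty term is zero when every group has the same risk and grows as the model fits some groups better than others, which discourages reliance on correlations that hold in only part of the training set. With `lam=0` the objective reduces to the plain average of group risks.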
