When Covariate-shifted Data Augmentation Increases Test Error And How to Fix It

25 Sep 2019  ·  Sang Michael Xie*, Aditi Raghunathan*, Fanny Yang, John C. Duchi, Percy Liang ·

Empirically, data augmentation sometimes improves and sometimes hurts test error, even when only adding points with labels from the true conditional distribution that the hypothesis class is expressive enough to fit. In this paper, we provide precise conditions under which data augmentation hurts test accuracy for minimum norm estimators in linear regression. To mitigate the failure modes of augmentation, we introduce X-regularization, which uses unlabeled data to regularize the parameters towards the non-augmented estimate. We prove that our new estimator never hurts test error and exhibits significant improvements over adversarial data augmentation on CIFAR-10.

PDF Abstract
No code implementations yet. Submit your code now

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here