Automatic Data Augmentation via Invariance-Constrained Learning

29 Sep 2022 · Ignacio Hounie, Luiz F. O. Chamon, Alejandro Ribeiro

Underlying data structures, such as symmetries or invariances to transformations, are often exploited to improve the solution of learning tasks. However, embedding these properties in models or learning algorithms can be challenging and computationally intensive. Data augmentation, on the other hand, induces these symmetries during training by applying multiple transformations to the input data. Despite its ubiquity, its effectiveness depends on the choices of which transformations to apply, when to do so, and how often. In fact, there is both empirical and theoretical evidence that the indiscriminate use of data augmentation can introduce biases that outweigh its benefits. This work tackles these issues by automatically adapting the data augmentation while solving the learning task. To do so, it formulates data augmentation as an invariance-constrained learning problem and leverages Markov chain Monte Carlo (MCMC) sampling to solve it. The result is a practical algorithm that not only does away with a priori searches for augmentation distributions, but also dynamically controls if and when data augmentation is applied. Our experiments illustrate the performance of this method, which achieves state-of-the-art results in automatic data augmentation benchmarks for CIFAR datasets. Furthermore, this approach can be used to gather insights on the actual symmetries underlying a learning task.
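
To make the idea in the abstract more concrete, the following is a minimal sketch of invariance-constrained training on a toy problem. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: the 2-D logistic model, the set of rotation angles, a Metropolis-Hastings sampler whose target is proportional to the loss, sampling one transformation per batch, the constraint level `eps`, and all function names (`rotate`, `mh_sample`, `dual_step`, `train`). The model minimizes the loss on clean data subject to the loss on sampled transformed data staying below `eps`, and a dual variable `lam` decides how strongly the augmented data is weighted at each step.

```python
import math
import random

def rotate(point, degrees):
    """Rotate a 2-D point counter-clockwise by the given angle."""
    t = math.radians(degrees)
    x, y = point
    return (x * math.cos(t) - y * math.sin(t), x * math.sin(t) + y * math.cos(t))

def loss_and_grad(w, points, labels):
    """Mean logistic loss of a linear model and its gradient w.r.t. w."""
    loss, grad = 0.0, [0.0, 0.0]
    for (x0, x1), y in zip(points, labels):
        z = max(-30.0, min(30.0, w[0] * x0 + w[1] * x1))  # clamp for stability
        p = 1.0 / (1.0 + math.exp(-z))
        loss -= y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12)
        grad[0] += (p - y) * x0
        grad[1] += (p - y) * x1
    n = len(points)
    return loss / n, [g / n for g in grad]

def mh_sample(loss_of, n_transforms, n_steps, rng):
    """Metropolis-Hastings over transformation indices.

    Assumed target: probability proportional to the loss, with a uniform
    (symmetric) proposal, so high-loss transformations are sampled more often.
    """
    k = rng.randrange(n_transforms)
    cur = loss_of(k)
    for _ in range(n_steps):
        cand = rng.randrange(n_transforms)
        new = loss_of(cand)
        if rng.random() < min(1.0, new / max(cur, 1e-12)):
            k, cur = cand, new
    return k

def dual_step(lam, constraint_loss, eps, lr):
    """Projected dual ascent: lam grows only while the constraint is violated."""
    return max(0.0, lam + lr * (constraint_loss - eps))

def train(points, labels, angles, eps=0.5, steps=200, lr_w=0.5, lr_lam=0.1, seed=0):
    rng = random.Random(seed)
    w, lam = [0.0, 0.0], 0.0
    for _ in range(steps):
        # Gradient of the objective (loss on clean data).
        _, g_clean = loss_and_grad(w, points, labels)
        # Sample one transformation for the whole batch (a simplification).
        aug = lambda k: [rotate(p, angles[k]) for p in points]
        k = mh_sample(lambda k: loss_and_grad(w, aug(k), labels)[0], len(angles), 5, rng)
        l_aug, g_aug = loss_and_grad(w, aug(k), labels)
        # Primal descent on the Lagrangian, then dual ascent on lam.
        w = [wi - lr_w * (gc + lam * ga) for wi, gc, ga in zip(w, g_clean, g_aug)]
        lam = dual_step(lam, l_aug, eps, lr_lam)
    return w, lam

# Toy data: the label depends only on the sign of the first coordinate.
data_rng = random.Random(1)
points = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(100)]
labels = [1 if x0 > 0 else 0 for x0, _ in points]
w, lam = train(points, labels, angles=[-10, 0, 10])
```

In this sketch, when the loss on transformed data is already below `eps`, `lam` shrinks toward zero and the augmented gradient contributes little or nothing. This is one simple way to picture "dynamically controlling if and when data augmentation is applied"; the paper's actual sampler and constraint may differ.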


Results from the Paper


 Ranked #1 on Image Classification on SVHN (Percentage correct metric)

Task                  Dataset    Model              Metric Name         Metric Value  Global Rank
Image Classification  CIFAR-10   Wide-ResNet-28-10  Percentage correct  97.85         #61
Image Classification  CIFAR-10   Wide-ResNet-40-2   Percentage correct  97.05         #88
Image Classification  CIFAR-100  Wide-ResNet-40-2   Percentage correct  81.19         #118
Image Classification  CIFAR-100  Wide-ResNet-28-10  Percentage correct  84.89         #74
Image Classification  SVHN       Wide-ResNet-28-10  Percentage correct  98.15         #1
