The Importance of Importance Sampling for Deep Budgeted Training

1 Jan 2021  ·  Eric Arazo, Diego Ortego, Paul Albert, Noel O'Connor, Kevin McGuinness ·

Long iterative training processes for Deep Neural Networks (DNNs) are commonly required to achieve state-of-the-art performance in many computer vision tasks. Core-set selection approaches demonstrate that retaining informative samples is important to avoid large drops in accuracy. Moreover, the importance of each sample depends on the DNN state, and work on importance sampling aims to dynamically estimate it to speed up convergence. Importance sampling might also benefit budgeted training regimes, i.e. when the number of training iterations is limited. This work explores this paradigm and how a budget constraint interacts with importance sampling approaches, data augmentation techniques, and learning rate schedules. We show that under budget restrictions, importance sampling approaches do not provide a consistent improvement over uniform sampling. We suggest that, given a specific budget, the best course of action is to disregard sample importance and introduce adequate data augmentation. For example, when training on CIFAR-10/100 with 30% of the full training budget, a uniform sampling strategy with certain data augmentation surpasses the performance of 100% budget models trained with standard data augmentation, and performs on par with several state-of-the-art importance sampling strategies. We conclude from our work that DNNs under budget restrictions benefit greatly from variety in the samples, and that finding the right samples to train on is not the most effective strategy when balancing high performance with low computational requirements. The code will be released after the review process.
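To make the uniform-vs-importance comparison concrete, the following is a minimal sketch (not the paper's implementation) of one common family of importance sampling the abstract alludes to: sampling minibatch indices with probability proportional to each example's most recently observed loss, refreshed as training progresses. The class name, smoothing term, and update rule are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)


class LossBasedSampler:
    """Toy importance sampler: draws minibatch indices with probability
    proportional to each sample's most recent loss (plus a smoothing
    constant so unseen samples keep a nonzero chance of being picked).
    Setting smoothing very large recovers near-uniform sampling, the
    baseline the paper compares against."""

    def __init__(self, n_samples, smoothing=1.0):
        self.scores = np.ones(n_samples)  # start with equal importance
        self.smoothing = smoothing

    def sample(self, batch_size):
        # sampling probabilities from smoothed per-sample scores
        p = self.scores + self.smoothing
        p = p / p.sum()
        return rng.choice(len(self.scores), size=batch_size,
                          replace=False, p=p)

    def update(self, indices, losses):
        # refresh importance scores with the latest per-sample losses
        self.scores[indices] = losses


# usage: high-loss samples become more likely in subsequent batches
sampler = LossBasedSampler(n_samples=10)
batch = sampler.sample(4)
sampler.update(batch, np.full(len(batch), 5.0))  # pretend these were hard
```

Under a budget, the loop above would simply run for a fixed number of iterations; the paper's finding is that replacing this scoring scheme with uniform sampling plus stronger data augmentation is at least as effective.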

