Exploring the Loss Landscape in Neural Architecture Search

6 May 2020 · Colin White, Sam Nolen, Yash Savani

Neural architecture search (NAS) has seen a steep rise in interest over the last few years. Many NAS algorithms search through a space of architectures by iteratively choosing an architecture, evaluating its performance by training it, and using all prior evaluations to come up with the next choice. The evaluation step is noisy: the final accuracy varies based on the random initialization of the weights. Prior work has focused on devising new search algorithms to handle this noise, rather than on quantifying or understanding the level of noise in architecture evaluations. In this work, we show that (1) the simplest hill-climbing algorithm is a powerful baseline for NAS, and (2) when the noise in popular NAS benchmark datasets is reduced to a minimum, hill-climbing outperforms many popular state-of-the-art algorithms. We further back up this observation by showing that the number of local minima is substantially reduced as the noise decreases, and by giving a theoretical characterization of the performance of local search in NAS. Based on our findings, we suggest that NAS research (1) use local search as a baseline, and (2) denoise the training pipeline when possible.
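
To make the hill-climbing baseline concrete, here is a minimal sketch of local search over a toy architecture space. Everything in it is an illustrative assumption rather than the authors' implementation: architectures are encoded as tuples of operation indices, a neighbor differs in exactly one position, and `toy_loss` (a Hamming distance to a hypothetical `TARGET`) stands in for the validation loss obtained by training an architecture. The search moves to the first neighbor that improves the loss; the abstract does not say which variant of local search the paper uses.

```python
from typing import Callable, List, Tuple

Arch = Tuple[int, ...]  # assumed encoding: one operation index per position


def neighbors(arch: Arch, num_ops: int) -> List[Arch]:
    """Return all architectures that differ from `arch` in exactly one position."""
    result = []
    for i, op in enumerate(arch):
        for new_op in range(num_ops):
            if new_op != op:
                result.append(arch[:i] + (new_op,) + arch[i + 1:])
    return result


def local_search(
    evaluate: Callable[[Arch], float],
    start: Arch,
    num_ops: int,
    max_evals: int = 1000,
) -> Tuple[Arch, float]:
    """First-improvement hill climbing: move to a better neighbor until none exists."""
    best, best_loss = start, evaluate(start)
    evals = 1
    improved = True
    while improved and evals < max_evals:
        improved = False
        for cand in neighbors(best, num_ops):
            if evals >= max_evals:
                break
            loss = evaluate(cand)
            evals += 1
            if loss < best_loss:
                # Accept the first improving neighbor and restart from it.
                best, best_loss = cand, loss
                improved = True
                break
    return best, best_loss


# Hypothetical noise-free objective: number of positions that differ from TARGET.
TARGET: Arch = (2, 1, 0, 2)


def toy_loss(arch: Arch) -> float:
    return float(sum(a != t for a, t in zip(arch, TARGET)))
```

Calling `local_search(toy_loss, (0, 0, 0, 0), num_ops=3)` on this noise-free objective stops at `TARGET`, the only local minimum. Adding random noise to `toy_loss` can make the search halt at a point that merely looks like a minimum, which is the effect the abstract attributes to noisy evaluations.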

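| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Neural Architecture Search | NAS-Bench-201, ImageNet-16-120 | Local search | Accuracy (Test) | 46.38 | #3 |
| Neural Architecture Search | NAS-Bench-201, ImageNet-16-120 | Local search | Search time (s) | 151200 | #29 |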