Fast MNAS: Uncertainty-aware Neural Architecture Search with Lifelong Learning

1 Jan 2021 · Jihao Liu, Yangting Sun, Ming Zhang, Boxiao Liu, Yu Liu

Sampling-based neural architecture search (NAS) typically converges to better architectures than gradient-based approaches but consumes enormous computational resources, due to the rollout bottleneck: every sampled architecture must be exhaustively trained on a proxy task. This work provides a general pipeline that accelerates both the rollout process and the RL optimization in sampling-based NAS. It is motivated by the observation that both architecture knowledge and parameter knowledge can be transferred between different experiments and even different tasks. We first introduce an uncertainty-aware critic (value function) into PPO to exploit the architecture knowledge from previous experiments, which stabilizes training and reduces search time by a factor of 4. We then propose a lifelong knowledge pool together with a block similarity function to exploit lifelong parameter knowledge, reducing search time by a further factor of 2; to our knowledge, this is the first block-level weight sharing in RL-based NAS, and the block similarity function guarantees a 100% hit ratio while preserving strict fairness. Finally, we show that a simple off-policy correction factor enables a replay buffer in the RL optimization and halves search time again. Experiments on the MNAS search space show that the proposed FNAS accelerates the standard RL-based NAS process by $\sim$10x (e.g., $\sim$256 TPUv2 (2x2 topology) days, roughly 20,000 GPU hours, $\rightarrow$ $\sim$2,000 GPU hours for MNAS) while achieving better performance on various vision tasks.
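The abstract only names the techniques, so the sketches below illustrate one plausible reading of each. First, the uncertainty-aware critic: a minimal PyTorch sketch in which the critic predicts a Gaussian (mean and log-variance) over returns and is trained with a Gaussian negative log-likelihood instead of the usual MSE value loss, so value estimates carried over from previous experiments are down-weighted when the network is unsure about them. All names, network sizes, and the exact loss form are assumptions, not the paper's implementation.

```python
# Illustrative sketch (not the paper's code): an uncertainty-aware
# critic for PPO that predicts a Gaussian over returns.
import torch
import torch.nn as nn

class UncertaintyAwareCritic(nn.Module):
    def __init__(self, state_dim: int, hidden: int = 128):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.mean_head = nn.Linear(hidden, 1)     # predicted return
        self.log_var_head = nn.Linear(hidden, 1)  # predicted uncertainty

    def forward(self, state):
        h = self.body(state)
        return self.mean_head(h), self.log_var_head(h)

def value_loss(critic, states, returns):
    """Gaussian NLL: uncertain predictions contribute less to the gradient."""
    mean, log_var = critic(states)
    inv_var = torch.exp(-log_var)
    return (0.5 * inv_var * (returns - mean) ** 2 + 0.5 * log_var).mean()

# Usage
critic = UncertaintyAwareCritic(state_dim=16)
loss = value_loss(critic, torch.randn(32, 16), torch.randn(32, 1))
loss.backward()
```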
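Second, the lifelong knowledge pool with a block similarity function. One plausible reading: trained block weights from earlier experiments are stored keyed by block configuration, and each newly sampled block is initialized from the most similar stored entry, so every query hits something once the pool is non-empty (a 100% hit ratio). The config fields, similarity scoring, and pool API below are illustrative assumptions.

```python
# Illustrative sketch (not the paper's code): a lifelong pool of
# trained block weights, queried via a block similarity function.
from dataclasses import dataclass

@dataclass(frozen=True)
class BlockConfig:
    op: str        # e.g. "mbconv3" or "mbconv6"
    kernel: int    # e.g. 3 or 5
    channels: int
    stride: int

def block_similarity(a, b):
    """Higher is more similar; matching op/kernel/stride dominate."""
    score = 3.0 * (a.op == b.op) + 2.0 * (a.kernel == b.kernel) \
            + 1.0 * (a.stride == b.stride)
    score -= abs(a.channels - b.channels) / max(a.channels, b.channels)
    return score

class KnowledgePool:
    """Block weights accumulated across experiments and tasks."""
    def __init__(self):
        self._pool = {}  # BlockConfig -> weight dict

    def put(self, cfg, weights):
        self._pool[cfg] = weights

    def get(self, cfg):
        # With a non-empty pool every query returns the closest match,
        # so the hit ratio is 100% by construction.
        if not self._pool:
            return None
        best = max(self._pool, key=lambda c: block_similarity(cfg, c))
        return self._pool[best]
```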
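Third, the off-policy correction factor. A replay buffer lets rollouts generated by older policy versions be reused, and the standard remedy is to re-weight each replayed sample by the importance ratio between the current policy and the policy that produced it. The clipped importance-weighted surrogate below is a textbook construction and only an assumption about what the paper's "simply designed" factor looks like.

```python
# Illustrative sketch (not the paper's code): importance-weighted
# policy loss for replayed (off-policy) samples.
import torch

def replay_policy_loss(new_log_probs, behavior_log_probs, advantages,
                       max_ratio=10.0):
    # pi_new(a|s) / pi_behavior(a|s), clipped to control variance
    # when the behavior policy is stale.
    ratio = torch.exp(new_log_probs - behavior_log_probs).clamp(max=max_ratio)
    return -(ratio * advantages).mean()
```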
