EnTranNAS: Towards Closing the Gap between the Architectures in Search and Evaluation

1 Jan 2021  ·  Yibo Yang, Shan You, Hongyang Li, Fei Wang, Chen Qian, Zhouchen Lin ·

Most gradient-based neural architecture search methods construct a super-net for search and derive a target-net as its sub-graph for evaluation. There is a significant gap between the architectures in search and evaluation. As a result, current methods suffer from an inaccurate, inefficient, and inflexible search process. In this paper, we aim to close the gap and solve these problems. We introduce EnTranNAS, which is composed of Engine-cells and Transit-cells. The Engine-cell is differentiable for architecture search, while the Transit-cell only transits the current sub-graph by architecture derivation. Consequently, the gap between the architectures in search and evaluation is significantly reduced. Our method also saves substantial memory and computation, which speeds up the search process. A feature sharing strategy is introduced for more efficient parameter training in the search phase. Furthermore, we develop a new architecture derivation method to replace the traditional one based on a hand-crafted rule. Our method enables differentiable sparsification, so the derived architecture stays equivalent to the one in search. It also supports topology search, where a node can be connected to prior nodes with any number of connections, making the searched architectures more flexible. For experiments on CIFAR-10, our search on the standard space requires only 0.06 GPU-days. We further achieve an error rate of 2.22% with 0.07 GPU-days for the search on an extended space. We can directly perform our search on ImageNet with learnable topology and achieve a top-1 error rate of 23.2%. Code will be released.
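The gap the abstract refers to can be illustrated with a toy sketch. In DARTS-style search, a super-net edge computes a softmax-weighted mixture of all candidate operations, while the derived target-net keeps only the strongest one. The sketch below is not the paper's code: the operations, parameter values, and function names (`engine_cell_edge`, `transit_cell_edge`) are hypothetical stand-ins, chosen only to show how the mixed output and the derived output differ, and how they coincide as the architecture weights sharpen — the regime EnTranNAS's Transit-cells operate in.

```python
import numpy as np

def softmax(a):
    # Numerically stable softmax over architecture parameters.
    e = np.exp(a - a.max())
    return e / e.sum()

# Hypothetical candidate operations on one edge
# (stand-ins for conv, skip-connection, and the "zero" op).
ops = [
    lambda x: 2.0 * x,  # a conv-like transform
    lambda x: x,        # identity / skip connection
    lambda x: 0.0 * x,  # zero op (removes the edge)
]

def engine_cell_edge(x, alpha):
    """Differentiable mixed edge: softmax-weighted sum of all candidates."""
    w = softmax(alpha)
    return sum(wi * op(x) for wi, op in zip(w, ops))

def transit_cell_edge(x, alpha):
    """Derived edge: only the strongest candidate operation is kept."""
    return ops[int(np.argmax(alpha))](x)

x = 1.0
alpha = np.array([3.0, 0.5, -1.0])  # made-up architecture parameters
mixed = engine_cell_edge(x, alpha)    # a mixture: close to 2.0, not exact
derived = transit_cell_edge(x, alpha) # exactly 2.0: the derived sub-graph
```

With soft weights the two outputs disagree (this is the search/evaluation gap); as `alpha` becomes nearly one-hot, `engine_cell_edge` converges to `transit_cell_edge`, which is the equivalence the paper's differentiable sparsification aims to preserve exactly.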

