We propose a new method for learning the structure of convolutional neural networks (CNNs) that is more efficient than recent state-of-the-art methods based on reinforcement learning and evolutionary algorithms.
In our experiments, we search for the best convolutional layer (or "cell") on the CIFAR-10 dataset and then apply this cell to the ImageNet dataset by stacking together more copies of it, each with its own parameters, to design a convolutional architecture named the "NASNet architecture".
#6 best model for Image Classification on ImageNet
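The transfer recipe above is simple to sketch. Below is a minimal PyTorch illustration in which `Cell` is an invented placeholder (two depthwise-separable branches), not the actual searched NASNet cell: only the cell's structure is reused across copies, while each copy trains its own weights.

```python
import torch
import torch.nn as nn

class Cell(nn.Module):
    """Placeholder for a searched cell: a fixed structure, trainable weights.
    The real NASNet cell is found by search; this two-branch block is only
    an assumption for illustration."""
    def __init__(self, channels):
        super().__init__()
        def sep_conv(k):
            return nn.Sequential(
                nn.Conv2d(channels, channels, k, padding=k // 2, groups=channels),
                nn.Conv2d(channels, channels, 1),
                nn.BatchNorm2d(channels),
                nn.ReLU(),
            )
        self.branch_a = sep_conv(3)
        self.branch_b = sep_conv(5)

    def forward(self, x):
        return self.branch_a(x) + self.branch_b(x)

# Stack more copies of the same cell structure; each copy gets its own
# parameters, which is how the CIFAR-10 cell is scaled up to ImageNet.
net = nn.Sequential(*[Cell(64) for _ in range(6)])
out = net(torch.randn(2, 64, 32, 32))  # torch.Size([2, 64, 32, 32])
```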
In this paper, we propose a novel framework enabling Bayesian optimization to guide the network morphism for efficient neural architecture search.
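As a rough sketch of that loop (all helpers below are hypothetical stand-ins; the actual framework, Auto-Keras, uses a Gaussian-process surrogate with an edit-distance kernel over architectures): morphisms generate function-preserving candidate edits, and an acquisition score decides which candidate to train next.

```python
import random

def morph(arch):
    # Function-preserving edit; the real operators deepen, widen,
    # or add skip connections without changing the network's function.
    return arch + [random.choice(["deepen", "widen", "add_skip"])]

def acquisition(arch, history):
    # Stand-in for a surrogate-based acquisition function (e.g. UCB):
    # here it simply favors edit sequences not yet evaluated.
    seen = {tuple(a) for a, _ in history}
    return random.random() + (0.5 if tuple(arch) not in seen else 0.0)

def evaluate(arch):
    # Stand-in for "train briefly, return validation accuracy".
    return random.random()

history = [([], evaluate([]))]
for _ in range(10):
    parent = max(history, key=lambda h: h[1])[0]    # best architecture so far
    candidates = [morph(parent) for _ in range(8)]  # morphed variants
    chosen = max(candidates, key=lambda a: acquisition(a, history))
    history.append((chosen, evaluate(chosen)))

print(max(history, key=lambda h: h[1]))
```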
This paper addresses the scalability challenge of architecture search by formulating the task in a differentiable manner.
#15 best model for Language Modelling on Penn Treebank (Word Level)
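The differentiable formulation rests on a continuous relaxation: each edge computes a softmax-weighted mixture of candidate operations, so the architecture parameters receive gradients alongside the network weights. A minimal sketch with an assumed toy candidate set:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """DARTS-style continuous relaxation: an edge outputs a softmax-weighted
    sum over all candidate operations, so the architecture parameters
    `alpha` can be optimized by plain gradient descent."""
    def __init__(self, channels):
        super().__init__()
        # A toy candidate set; the real search space is larger.
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 5, padding=2),
            nn.MaxPool2d(3, stride=1, padding=1),
            nn.Identity(),
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

edge = MixedOp(16)
out = edge(torch.randn(1, 16, 8, 8))
out.sum().backward()  # gradients reach both the conv weights and alpha
# After search, each edge keeps only the operation with the largest alpha.
```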
The controller is trained with policy gradient to select a subgraph that maximizes the expected reward on the validation set.
#6 best model for Architecture Search on CIFAR-10 Image Classification
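Concretely, this is a REINFORCE update: sample a subgraph from the controller's distribution, measure the validation reward, and increase the log-probability of choices that beat a moving-average baseline. The sketch below replaces the paper's LSTM controller with a bare logits table and the shared-weights training with a toy reward, both assumptions for brevity:

```python
import torch
import torch.nn as nn

NUM_EDGES, NUM_OPS = 4, 5
logits = nn.Parameter(torch.zeros(NUM_EDGES, NUM_OPS))  # a tiny "controller"
optimizer = torch.optim.Adam([logits], lr=0.05)
baseline = 0.0

def validation_reward(subgraph):
    # Stand-in for training the sampled subgraph with shared weights
    # and measuring its accuracy on the validation set.
    return float(sum(subgraph)) / (NUM_EDGES * (NUM_OPS - 1))

for step in range(100):
    dist = torch.distributions.Categorical(logits=logits)
    sample = dist.sample()                    # one op per edge: a subgraph
    reward = validation_reward(sample.tolist())
    baseline = 0.9 * baseline + 0.1 * reward  # moving-average baseline
    loss = -(reward - baseline) * dist.log_prob(sample).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```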
In this paper, we explore a more diverse set of connectivity patterns through the lens of randomly wired neural networks.
#10 best model for Image Classification on ImageNet
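A random wiring can be sketched in a few lines: sample an undirected random graph (here Erdős–Rényi; the paper's strongest generator is Watts–Strogatz), orient each edge from the lower- to the higher-indexed node to obtain a DAG, and map nodes to operations. The generator below is a simplified assumption, not the paper's exact procedure:

```python
import random

def random_dag(num_nodes=8, p=0.4, seed=0):
    """Erdos-Renyi sampling, then orient edges low -> high to get a DAG."""
    rng = random.Random(seed)
    edges = [(i, j) for i in range(num_nodes)
                    for j in range(i + 1, num_nodes)
                    if rng.random() < p]
    # Nodes with no incoming edge act as inputs; no outgoing, as outputs.
    has_in = {j for _, j in edges}
    has_out = {i for i, _ in edges}
    inputs = [n for n in range(num_nodes) if n not in has_in]
    outputs = [n for n in range(num_nodes) if n not in has_out]
    return edges, inputs, outputs

edges, inputs, outputs = random_dag()
print(edges)            # each edge becomes e.g. a ReLU-conv-BN block
print(inputs, outputs)  # wired to the network's input and output
```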
We address the high memory consumption issue of differentiable NAS and reduce the computational cost (GPU hours and GPU memory) to the same level as regular training while still allowing a large candidate set.
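The key memory saver in this line of work (e.g. ProxylessNAS-style path binarization) is to activate a single sampled path per forward pass instead of summing every candidate, so activation memory matches regular training. The sketch below shows only the forward path; the actual methods use gradient estimators to update the architecture logits through the sampling step:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SampledEdge(nn.Module):
    """Unlike a DARTS mixture, which stores every candidate's activations,
    only one sampled path runs per forward pass. Updating `alpha` through
    the discrete sample requires a REINFORCE- or straight-through-style
    estimator, omitted here."""
    def __init__(self, channels, num_ops=4):
        super().__init__()
        self.ops = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=1) for _ in range(num_ops))
        self.alpha = nn.Parameter(torch.zeros(num_ops))

    def forward(self, x):
        probs = F.softmax(self.alpha, dim=0)
        idx = torch.multinomial(probs, 1).item()  # activate a single path
        return self.ops[idx](x)

edge = SampledEdge(16)
y = edge(torch.randn(1, 16, 8, 8))  # only one op's activations are stored
```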
Our observations are consistent across multiple network architectures, datasets, and tasks, and they imply that: 1) training a large, over-parameterized model is often not necessary to obtain an efficient final model; 2) the learned "important" weights of the large model are typically not useful for the small pruned model; and 3) the pruned architecture itself, rather than a set of inherited "important" weights, is what matters most for the final model's efficiency. This suggests that, in some cases, pruning can be useful as an architecture search paradigm.
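Point 3) is the actionable one: treat pruning as a way to discover per-layer widths, then train the slim architecture from scratch. The sketch below uses a global threshold on per-filter L1 norms, an assumed criterion in the spirit of automatic channel pruning, not the paper's exact procedure:

```python
import torch
import torch.nn as nn

def discovered_widths(model, keep_ratio=0.5):
    """Keep the filters whose L1 norm clears a global cutoff; return only
    the resulting per-layer channel counts, i.e. the 'searched' widths."""
    scores, layers = [], {}
    for name, m in model.named_modules():
        if isinstance(m, nn.Conv2d):
            s = m.weight.detach().abs().sum(dim=(1, 2, 3))  # per-filter L1
            layers[name] = s
            scores.append(s)
    threshold = torch.cat(scores).quantile(1 - keep_ratio)  # global cutoff
    return {name: max(1, int((s > threshold).sum())) for name, s in layers.items()}

big = nn.Sequential(nn.Conv2d(3, 64, 3), nn.ReLU(), nn.Conv2d(64, 128, 3))
print(discovered_widths(big))  # e.g. {'0': 30, '2': 66}
# Build a fresh model with these widths and train it from a random
# initialization, rather than inheriting the surviving weights.
```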
Designing architectures for deep neural networks requires expert knowledge and substantial computation time.