Differentiable Neural Architecture Search

Introduced by Wu et al. in FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search

DNAS, or Differentiable Neural Architecture Search, uses gradient-based methods to optimize ConvNet architectures, which avoids enumerating and training individual architectures separately as earlier methods did. DNAS explores a layer-wise search space in which a different block can be chosen for each layer of the network. It represents the search space as a super net whose operators execute stochastically, and it relaxes the problem of finding the optimal architecture to that of finding a distribution that yields the optimal architecture. With the Gumbel Softmax technique, the architecture distribution can be trained directly using gradient-based optimization such as SGD.
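The stochastic block choice for a single layer can be sketched as follows. This is a minimal illustration of Gumbel Softmax sampling in plain Python, not the FBNet implementation: the function name `gumbel_softmax`, the example logits, and the temperature value are all assumptions made for the example.

```python
import math
import random


def gumbel_softmax(logits, temperature, rng):
    """Return relaxed one-hot weights over the candidate blocks of one layer.

    logits:      one architecture parameter per candidate block (hypothetical values)
    temperature: > 0; smaller values push the weights toward a one-hot choice
    rng:         a random.Random instance, passed in for reproducibility
    """
    noisy = []
    for theta in logits:
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        # Gumbel(0, 1) noise: g = -log(-log(u)), with u ~ Uniform(0, 1).
        g = -math.log(-math.log(u))
        noisy.append((theta + g) / temperature)
    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Three candidate blocks for one layer; the logits are illustrative only.
theta = [0.0, 1.0, -0.5]
weights = gumbel_softmax(theta, temperature=5.0, rng=random.Random(0))
```

In a super net, weights like these would scale the outputs of the candidate blocks in a layer, so the architecture parameters receive gradients through the softmax. A high temperature keeps the weights spread out, while a low temperature makes each sample close to a discrete block choice.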

The loss used to train the stochastic super net combines a cross-entropy loss, which drives accuracy, with a latency loss, which penalizes the network's latency on a target device. To estimate the latency of an architecture, the latency of each operator in the search space is measured, and a lookup table model computes the overall latency by summing the latencies of the individual operators. This model makes it possible to estimate the latency of architectures across an enormous search space. More importantly, it makes the latency differentiable with respect to the layer-wise block choices.
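The lookup table model can be sketched in a few lines. The example below assumes a hypothetical table of per-block latencies in milliseconds (the numbers are invented, not measurements) and soft block weights such as those produced by a Gumbel Softmax sample. It covers only the latency estimate, not how that estimate is combined with the cross-entropy term.

```python
def expected_latency(block_weights, latency_table):
    """Sum per-layer latencies, weighting each candidate block by its weight.

    block_weights[l][i]: weight of candidate block i in layer l
    latency_table[l][i]: measured latency of that block (hypothetical values here)
    """
    total = 0.0
    for weights, latencies in zip(block_weights, latency_table):
        total += sum(w * lat for w, lat in zip(weights, latencies))
    return total


def latency_gradient(latency_table):
    """Gradient of the expected latency with respect to each block weight.

    The estimate is linear in the weights, so the partial derivative for
    block i in layer l is simply that block's table entry.
    """
    return [list(row) for row in latency_table]


# Two layers, three candidate blocks each; latencies are made up for the example.
latency_table = [[1.0, 2.5, 0.0], [0.8, 1.9, 0.0]]

# A discrete architecture: block 1 in layer 0, block 0 in layer 1.
one_hot = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
discrete_latency = expected_latency(one_hot, latency_table)

# A relaxed (soft) architecture, as seen during super net training.
soft = [[0.5, 0.5, 0.0], [0.25, 0.75, 0.0]]
soft_latency = expected_latency(soft, latency_table)
```

Because the estimate is a weighted sum of table entries, its derivative with respect to each block weight is a constant from the table, which is what allows a latency penalty to be optimized together with the cross-entropy loss.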

Source: FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search

Tasks

Task	Papers	Share
Image Classification	2	16.67%
Reinforcement Learning (RL)	2	16.67%
Time Series Classification	1	8.33%
Evolutionary Algorithms	1	8.33%
Object Detection	1	8.33%
Model Compression	1	8.33%
Quantization	1	8.33%
Super-Resolution	1	8.33%
Benchmarking	1	8.33%

Components

Component	Type
Gumbel Softmax	Distributions

Categories

Neural Architecture Search