Neural Architecture Transfer

Neural architecture search (NAS) has emerged as a promising avenue for automatically designing task-specific neural networks. Existing NAS approaches require one complete search for each deployment specification of hardware or objective, a computationally impractical endeavor given the potentially large number of application scenarios. In this paper, we propose Neural Architecture Transfer (NAT) to overcome this limitation. NAT is designed to efficiently generate task-specific custom models that are competitive under multiple conflicting objectives. To realize this goal, we learn task-specific supernets from which specialized subnets can be sampled without any additional training. The key to our approach is an integrated online transfer learning and many-objective evolutionary search procedure: a pre-trained supernet is iteratively adapted while simultaneously searching for task-specific subnets. We demonstrate the efficacy of NAT on 11 benchmark image classification tasks, ranging from large-scale multi-class to small-scale fine-grained datasets. In all cases, including ImageNet, NATNets improve upon the state-of-the-art under mobile settings ($\leq$ 600M Multiply-Adds). Surprisingly, small-scale fine-grained datasets benefit the most from NAT. At the same time, the architecture search and transfer are orders of magnitude more efficient than existing NAS methods. Overall, the experimental evaluation indicates that, across diverse image classification tasks and computational objectives, NAT is an appreciably more effective alternative to conventional transfer learning, which fine-tunes the weights of an existing network architecture learned on standard datasets. Code is available at https://github.com/human-analysis/neural-architecture-transfer
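
The core of NAT, as the abstract describes, alternates between adapting supernet weights to the target task and running a many-objective evolutionary search over subnets sampled from that supernet. The Python sketch below is a minimal, self-contained illustration of that interplay under toy assumptions: the `accuracy` and `madds` evaluators are made-up proxies, the supernet weight update is a placeholder comment, and none of the names correspond to the actual NAT API in the linked repository.

```python
# Toy sketch of NAT's alternating loop: adapt a supernet while evolving
# subnets under two conflicting objectives (maximize accuracy, minimize
# Multiply-Adds). All evaluators here are illustrative placeholders.
import random

N_LAYERS, N_CHOICES = 10, 4           # each gene picks one op per layer

def random_subnet():
    return [random.randrange(N_CHOICES) for _ in range(N_LAYERS)]

def madds(subnet):                    # toy cost model: bigger ops cost more
    return 50 + 25 * sum(subnet)

def accuracy(subnet, epoch):          # toy proxy; improves as the supernet
    return 70 + sum(subnet) / N_LAYERS + 0.1 * epoch + random.random()  # adapts

def dominates(a, b):                  # a = (acc, madds); acc up, madds down
    return a != b and a[0] >= b[0] and a[1] <= b[1]

def pareto_front(pop, scores):        # keep subnets no other subnet dominates
    return [p for p, s in zip(pop, scores)
            if not any(dominates(t, s) for t in scores)]

def mutate(parent):                   # change one layer's op choice
    child = parent[:]
    child[random.randrange(N_LAYERS)] = random.randrange(N_CHOICES)
    return child

population = [random_subnet() for _ in range(20)]
for epoch in range(5):
    # supernet.train_one_epoch(task_loader)   # placeholder: adapt weights
    scores = [(accuracy(p, epoch), madds(p)) for p in population]
    front = pareto_front(population, scores)  # many-objective selection
    offspring = [mutate(random.choice(front)) for _ in range(20 - len(front))]
    population = front + offspring            # next generation
best = max(population, key=lambda p: accuracy(p, 5) - 0.01 * madds(p))
print("example subnet:", best, "MAdds:", madds(best))
```

In the real system, candidate subnets inherit their weights directly from the adapted supernet, which is why no per-architecture retraining is needed before evaluation or deployment.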

Results from the Paper


Ranks in parentheses are each model's global rank on the corresponding leaderboard metric.

Image Classification on CIFAR-10

| Model  | Percentage correct | Params      |
|--------|--------------------|-------------|
| NAT-M1 | 97.4 (#82)         | 4.3M (#191) |
| NAT-M2 | 97.9 (#60)         | 4.6M (#193) |
| NAT-M3 | 98.2 (#46)         | 6.2M (#196) |
| NAT-M4 | 98.4 (#38)         | 6.9M (#197) |

Neural Architecture Search on CIFAR-10

| Model  | Top-1 Error Rate | Search Time (GPU days) | Params     | FLOPS      |
|--------|------------------|------------------------|------------|------------|
| NAT-M1 | 2.6% (#28)       | 1.0 (#18)              | 4.3M (#35) | 232M (#31) |
| NAT-M2 | 2.1% (#8)        | 1.0 (#18)              | 4.6M (#36) | 291M (#32) |
| NAT-M3 | 1.8% (#3)        | 1.0 (#18)              | 6.2M (#38) | 392M (#34) |
| NAT-M4 | 1.6% (#1)        | 1.0 (#18)              | 6.9M (#39) | 468M (#35) |

Image Classification on CIFAR-100

| Model  | Percentage correct | Params      |
|--------|--------------------|-------------|
| NAT-M1 | 86.0 (#57)         | 3.8M (#182) |
| NAT-M2 | 87.5 (#44)         | 6.4M (#183) |
| NAT-M3 | 87.7 (#42)         | 7.8M (#184) |
| NAT-M4 | 88.3 (#39)         | 9.0M (#185) |

Neural Architecture Search on CIFAR-100

| Model  | Percentage Error | Params     | FLOPS      |
|--------|------------------|------------|------------|
| NAT-M1 | 14.0 (#6)        | 3.8M (#8)  | 261M (#8)  |
| NAT-M2 | 12.5 (#4)        | 6.4M (#9)  | 398M (#9)  |
| NAT-M3 | 12.3 (#3)        | 7.8M (#10) | 492M (#11) |
| NAT-M4 | 11.7 (#1)        | 9.0M (#11) | 796M (#12) |

Neural Architecture Search on CIFAR-10 Image Classification

| Model  | Percentage error | Params     | FLOPS      |
|--------|------------------|------------|------------|
| NAT-M1 | 2.6 (#15)        | 4.3M (#10) | 232M (#15) |
| NAT-M2 | 2.1 (#7)         | 4.6M (#11) | 291M (#16) |
| NAT-M3 | 1.8 (#2)         | 6.2M (#14) | 392M (#17) |
| NAT-M4 | 1.6 (#1)         | 6.9M (#15) | 468M (#18) |

Image Classification on CINIC-10

| Model  | Accuracy  | Params     | FLOPS      |
|--------|-----------|------------|------------|
| NAT-M3 | 94.3 (#4) | 8.1M (#10) | 501M (#10) |

Neural Architecture Search on CINIC-10

| Model  | Accuracy (%) | Params    | FLOPS     |
|--------|--------------|-----------|-----------|
| NAT-M1 | 93.4 (#4)    | 4.6M (#1) | 317M (#1) |
| NAT-M2 | 94.1 (#3)    | 6.2M (#2) | 411M (#2) |
| NAT-M4 | 94.8 (#1)    | —         | —         |