Discrete Model Compression With Resource Constraint for Deep Neural Networks

In this paper, we address the problem of compressing and accelerating Convolutional Neural Networks (CNNs). Specifically, we propose a novel structural pruning method to obtain a compact CNN with strong discriminative power. To find such networks, we propose an efficient discrete optimization method that directly optimizes a channel-wise differentiable discrete gate under a resource constraint while freezing all the other model parameters. Although directly optimizing discrete variables is a non-smooth, non-convex, NP-hard problem, our optimization method circumvents these difficulties by using the straight-through estimator. Our method thus ensures that the sub-network discovered during training reflects the true pruned sub-network. We further extend the discrete gate to a stochastic version in order to thoroughly explore the space of potential sub-networks. Unlike many previous methods that require per-layer hyper-parameters, our method requires only a single hyper-parameter to control the FLOPs budget. Moreover, the discrete setting makes our method globally discrimination-aware. Experimental results on CIFAR-10 and ImageNet show that our method is competitive with state-of-the-art approaches.
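The abstract's core mechanism, a hard per-channel gate trained with the straight-through estimator (STE) under a single FLOPs-budget hyper-parameter, can be illustrated with a short sketch. The following is a minimal PyTorch sketch, not the authors' implementation: the names `STEBinaryGate`, `ChannelGate`, and `flops_penalty`, as well as the exact penalty form, are illustrative assumptions.

```python
import torch
import torch.nn as nn


class STEBinaryGate(torch.autograd.Function):
    """Hard 0/1 gating in the forward pass; straight-through (identity)
    gradient in the backward pass, so the discrete gate stays trainable."""

    @staticmethod
    def forward(ctx, logits):
        # Hard per-channel keep/prune decision.
        return (logits > 0).float()

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through estimator: pass the gradient unchanged,
        # as if the binarization were the identity function.
        return grad_output


class ChannelGate(nn.Module):
    """One learnable gate per channel, applied to a conv layer's output
    (NCHW). Only the gate logits are trained; backbone weights are frozen."""

    def __init__(self, num_channels):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_channels))

    def forward(self, x):
        gate = STEBinaryGate.apply(self.logits)  # shape: (C,)
        return x * gate.view(1, -1, 1, 1)        # zero out pruned channels


def flops_penalty(gates, flops_per_channel, budget, strength=1.0):
    """Approximate the pruned network's FLOPs from the open gates and
    penalize only the excess over the budget (one hyper-parameter)."""
    current = sum((g * f).sum() for g, f in zip(gates, flops_per_channel))
    return strength * torch.relu(current / budget - 1.0)
```

A stochastic variant, as hypothesized here, could sample the gate (e.g., Bernoulli on a sigmoid of the logits) in the forward pass while keeping the same straight-through backward pass, which encourages broader exploration of candidate sub-networks during training.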
