Accelerating Neural Network Optimization Through an Automated Control Theory Lens

This paper studies optimizers for accelerating time-consuming deep network training through an automatic control theory lens, viewing the parameter update of a network as a feedback control process. It makes two contributions. First, we theoretically analyze the intrinsic connections between deep network training and automatic feedback control systems; specifically, we show that the optimization process can be viewed as a Type I second-order system in control theory. Second, based on the mathematical model of this equivalent system, we design a proportional-integral-derivative (PID)-type Controller with decoupled weight decay to improve the training of deep neural networks. We conduct experiments both from a control-theory perspective, via phase-locus verification, and from a network-training perspective on several model families, including CNNs, Transformers, and MLPs, across benchmark datasets. The results demonstrate the effectiveness of our Controller optimizer in both optimization speed and final performance compared to SGD, the PID Optimizer, Adam, AdamW, and AdamP.
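For context, a Type I system in classical control has exactly one integrator (one pole at the origin) in its open-loop transfer function; a standard second-order instance is $G(s) = K / \big(s(\tau s + 1)\big)$, though the paper's exact equivalent model may differ. Since no implementation has been released yet, the sketch below illustrates the general idea of a PID-type optimizer with AdamW-style decoupled weight decay, not the authors' algorithm: the class name `PIDControllerOptimizer` and the gains `kp`, `ki`, `kd` are illustrative assumptions.

```python
import torch
from torch.optim import Optimizer


class PIDControllerOptimizer(Optimizer):
    """Hypothetical sketch: treat the gradient as the control error signal.

    P term reacts to the current gradient, I term accumulates past gradients
    (akin to momentum), D term damps oscillation via the gradient's change.
    Weight decay is decoupled, i.e. applied to the weights directly.
    """

    def __init__(self, params, lr=1e-3, kp=1.0, ki=0.1, kd=0.5, weight_decay=0.0):
        defaults = dict(lr=lr, kp=kp, ki=ki, kd=kd, weight_decay=weight_decay)
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self):
        for group in self.param_groups:
            lr, kp, ki, kd = group["lr"], group["kp"], group["ki"], group["kd"]
            wd = group["weight_decay"]
            for p in group["params"]:
                if p.grad is None:
                    continue
                g = p.grad
                state = self.state[p]
                if not state:
                    state["integral"] = torch.zeros_like(p)
                    state["prev_grad"] = torch.zeros_like(p)
                # I term: running sum of gradients (the accumulated "error").
                # In practice a decaying sum is common to keep this bounded.
                state["integral"].add_(g)
                # D term: change in gradient between consecutive steps.
                d_term = g - state["prev_grad"]
                state["prev_grad"].copy_(g)
                # Decoupled weight decay (AdamW-style): shrink weights directly
                # instead of folding the penalty into the gradient.
                if wd != 0:
                    p.mul_(1 - lr * wd)
                # Combined P + I + D update on the raw gradient signal.
                p.add_(kp * g + ki * state["integral"] + kd * d_term, alpha=-lr)
```

In this reading, SGD with momentum corresponds to a PI controller (proportional plus integral action only); the added derivative term is what a PID-type design contributes, penalizing rapid gradient changes to reduce overshoot.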
