Efficient Bi-level Optimization for Non-smooth Optimization

29 Sep 2021 · Wanli Shi, Heng Huang, Bin Gu

Bi-level optimization plays a key role in many machine learning applications. However, existing state-of-the-art bi-level optimization methods are limited to smooth or specific classes of non-smooth lower-level problems. Worse still, most of them rely on approximate hypergradients to update the upper-level variable, which is the inherent source of their inefficiency. To the best of our knowledge, a general and efficient optimization algorithm for bi-level problems whose lower-level objective is non-smooth, or even non-Lipschitz continuous, remains an open question. To address this challenge, we propose a new bi-level optimization algorithm based on smoothing and penalty techniques. Specifically, we first construct a sequence of smoothed lower-level objectives with an exponentially decaying smoothing parameter. Then, we transform the smoothed bi-level problem into an unconstrained penalty problem by replacing the smoothed sub-problem with its first-order necessary condition. Finally, we update the upper- and lower-level variables alternately with doubly stochastic gradients of the unconstrained penalty problem. Importantly, our theoretical analysis shows that the method converges to a stationary point of the original non-smooth bi-level problem when the lower-level problem is convex, and satisfies a necessary condition of the original problem when the lower-level problem is nonconvex. We compare our method with existing state-of-the-art bi-level optimization methods on three tasks, and the experimental results demonstrate that our method outperforms the others in both accuracy and efficiency.
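
The scheme sketched in the abstract can be illustrated on a toy problem. The sketch below is not the authors' implementation: the toy objectives (a quadratic upper level F(x, y) = 0.5‖y − c‖², a lower level f(x, y) = 0.5‖y − x‖² + ‖y‖₁ whose L1 term makes it non-smooth), the pseudo-Huber smoothing, the penalty weight, the step sizes, and the decay rate are all illustrative assumptions, and exact gradients stand in for the doubly stochastic (mini-batch) ones used in the paper.

```python
import numpy as np

# Toy bi-level problem (assumed for illustration):
#   upper level  F(x, y) = 0.5 * ||y - c||^2
#   lower level  f(x, y) = 0.5 * ||y - x||^2 + ||y||_1   (non-smooth L1 term)
# Step 1: smooth the L1 term as sum_i sqrt(y_i^2 + mu^2), mu decaying exponentially.
# Step 2: replace the lower-level problem by the penalty 0.5*lam*||grad_y f_mu(x, y)||^2.
# Step 3: alternate gradient steps on x and y for the penalized objective.

rng = np.random.default_rng(0)
d = 5
c = rng.normal(size=d)   # upper-level target (assumed)

def grad_y_f_mu(x, y, mu):
    """Gradient w.r.t. y of the smoothed lower-level objective."""
    return (y - x) + y / np.sqrt(y**2 + mu**2)

def hess_y_f_mu(y, mu):
    """Diagonal Hessian w.r.t. y of the smoothed lower-level objective."""
    return 1.0 + mu**2 / (y**2 + mu**2) ** 1.5

x = np.zeros(d)
y = np.zeros(d)
mu, rho = 1.0, 0.99          # smoothing parameter and decay rate (assumed)
lam = 5.0                    # penalty weight (assumed)
eta_x, eta_y = 0.01, 0.02    # step sizes (assumed)

for k in range(3000):
    g = grad_y_f_mu(x, y, mu)
    # Penalized objective: P(x, y) = F(x, y) + 0.5 * lam * ||g||^2.
    # The paper uses doubly stochastic gradients of P; here they are exact.
    grad_x = -lam * g                              # dg/dx = -I for this toy f
    grad_y = (y - c) + lam * hess_y_f_mu(y, mu) * g
    x -= eta_x * grad_x
    y -= eta_y * grad_y
    mu *= rho                                      # exponentially decaying smoothing

print("upper-level loss:", 0.5 * np.sum((y - c) ** 2))
print("lower-level stationarity:", np.linalg.norm(grad_y_f_mu(x, y, mu)))
```

Note the trade-off the decay rate controls: shrinking mu too quickly makes the smoothed gradients nearly as ill-behaved as the original non-smooth ones, while shrinking it too slowly leaves a large gap to the original lower-level problem.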
