Optimal Algorithms for Convex Nested Stochastic Composite Optimization

19 Nov 2020 · Zhe Zhang, Guanghui Lan

Recently, convex nested stochastic composite optimization (NSCO) has received considerable attention for its applications in reinforcement learning and risk-averse optimization. Current NSCO algorithms have stochastic oracle complexities that are worse, by orders of magnitude, than those for simpler stochastic composite optimization problems without the nested structure (e.g., the sum of smooth and non-smooth functions). Moreover, they require all outer-layer functions to be smooth, a condition that some important applications do not satisfy. These discrepancies prompt us to ask: "Does the nested composition make stochastic optimization more difficult in terms of the order of oracle complexity?" In this paper, we answer the question by developing order-optimal algorithms for the convex NSCO problem constructed from an arbitrary composition of smooth, structured non-smooth, and general non-smooth layer functions. When all outer-layer functions are smooth, we propose a stochastic sequential dual (SSD) method that achieves an oracle complexity of $\mathcal{O}(1/\epsilon^2)$ when the problem is non-strongly convex and $\mathcal{O}(1/\epsilon)$ when it is strongly convex. When some outer-layer function is structured non-smooth or general non-smooth, we propose a nonsmooth stochastic sequential dual (nSSD) method that achieves an oracle complexity of $\mathcal{O}(1/\epsilon^2)$. We provide a lower complexity bound showing that this latter $\mathcal{O}(1/\epsilon^2)$ complexity cannot be improved even in the strongly convex setting. All these complexity results appear to be new in the literature, and they indicate that the convex NSCO problem has the same order of oracle complexity as problems without the nested composition in every case except the strongly convex problem with a non-smooth outer layer.
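To make the role of the nested structure concrete, the following minimal sketch shows one well-known reason a nested objective is harder to handle with stochastic oracles: plugging a single noisy inner-layer sample into the outer function gives a biased estimate of the nested value. The two layer functions `f1` and `f2` and the two-point noise distribution are hypothetical toy choices made for this illustration. They are not taken from the paper, and the sketch is not the SSD or nSSD method.

```python
# Toy two-layer nested objective F(x) = f1(E[f2(x, xi)]).
# f1, f2, and NOISE are assumptions for illustration only, not from the paper.

def f1(y):
    """Outer layer: a smooth convex function."""
    return y * y

def f2(x, xi):
    """Inner layer: a stochastic oracle whose expectation over xi equals x."""
    return x + xi

# xi takes each of these values with probability 1/2, so E[xi] = 0.
NOISE = (-1.0, 1.0)

def nested_value(x):
    """Exact nested objective f1(E[f2(x, xi)])."""
    inner_mean = sum(f2(x, xi) for xi in NOISE) / len(NOISE)
    return f1(inner_mean)

def naive_plugin_value(x):
    """Exact expectation of the naive estimator f1(f2(x, xi))."""
    return sum(f1(f2(x, xi)) for xi in NOISE) / len(NOISE)

x = 2.0
true_val = nested_value(x)          # f1 applied to the mean of the inner layer
plugin_val = naive_plugin_value(x)  # mean of f1 applied to noisy inner samples
bias = plugin_val - true_val        # equals Var(xi) for this quadratic f1

print(f"nested value: {true_val}, naive plug-in mean: {plugin_val}, bias: {bias}")
```

Because `f1` is nonlinear, the expectation cannot be moved inside it, so the naive estimator is off by the variance of the inner noise in this example. Without the nested structure this gap does not arise, which is one way to see why the abstract's question about oracle complexity is not trivial.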

Tasks

Stochastic Optimization
