Topological properties of the set of functions generated by neural networks of fixed size

22 Jun 2018  ·  Philipp Petersen, Mones Raslan, Felix Voigtlaender

We analyze the topological properties of the set of functions that can be implemented by neural networks of a fixed size. Surprisingly, this set has many undesirable properties: it is highly non-convex, except possibly for a few exotic activation functions. Moreover, the set is not closed with respect to the $L^p$-norms, $0<p<\infty$, for all practically used activation functions, and it is also not closed with respect to the $L^\infty$-norm for all practically used activation functions except the ReLU and the parametric ReLU. Finally, the map that sends a family of weights to the function computed by the associated network is not inverse stable for any practically used activation function. In other words, if $f_1, f_2$ are two functions realized by neural networks that are very close in the sense that $\|f_1 - f_2\|_{L^\infty} \leq \varepsilon$, it is usually not possible to find weights $w_1, w_2$ close together such that each $f_i$ is realized by a neural network with weights $w_i$. These observations identify several potential causes of problems in the optimization of neural networks, such as lack of guaranteed convergence, explosion of parameters, and very slow convergence.
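The following is a rough numerical illustration (a hypothetical sketch assuming NumPy, not the paper's actual construction) of the kind of phenomenon behind non-closedness and parameter explosion: a sequence of width-two sigmoid networks, given by divided differences, converges uniformly to the derivative of the sigmoid while the output weights of the networks blow up.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_prime(x):
    s = sigmoid(x)
    return s * (1.0 - s)

def two_neuron_net(x, h):
    # A width-2 network: inner weights 1, biases (h, 0), outer weights (1/h, -1/h).
    # As h -> 0 it approximates sigmoid'(x), but the outer weights 1/h diverge.
    return (sigmoid(x + h) - sigmoid(x)) / h

x = np.linspace(-5.0, 5.0, 1001)
for h in [1.0, 0.1, 0.01, 0.001]:
    err = np.max(np.abs(two_neuron_net(x, h) - sigmoid_prime(x)))
    print(f"h = {h:6.3f}   sup-error to sigmoid' ~ {err:.2e}   outer weight 1/h = {1/h:.0e}")
```

The sup-norm error decreases roughly linearly in $h$, so the approximating networks converge in $L^\infty$ on the plotted interval, yet the weight $1/h$ grows without bound; this mirrors the parameter explosion discussed in the abstract.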


Categories


General Topology · Functional Analysis · MSC 54H99, 68T05, 52A30