1 code implementation • NeurIPS 2021 • Quynh Nguyen, Pierre Brechet, Marco Mondelli
More specifically, we show that: (i) under generic assumptions on the features of intermediate layers, it suffices that the last two hidden layers have on the order of $\sqrt{N}$ neurons, where $N$ is the number of training samples, and (ii) if subsets of features at each layer are linearly separable, then no over-parameterization is needed to show the connectivity.