How to Start Training: The Effect of Initialization and Architecture

NeurIPS 2018. Boris Hanin, David Rolnick

We identify and study two common failure modes for early training in deep ReLU nets. For each, we give a rigorous proof of when it occurs and how to avoid it, for fully connected and residual architectures...
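
The abstract is cut off on this page. The full paper attributes the first failure mode (exploding or vanishing mean activation length) to the weight variance at initialization, and reports that it is avoided by drawing weights from a symmetric distribution with variance 2/fan-in. The following is a minimal sketch of that effect, not the authors' code. It assumes a fully connected ReLU net with equal layer widths, Gaussian weights, and no biases; the function name `relu_forward_sq_length` and its parameters are hypothetical, chosen only for illustration.

```python
import math
import random


def relu_forward_sq_length(width, depth, var_numerator, seed=0):
    """Push one random input through a random ReLU net.

    Weights are i.i.d. Gaussian with variance var_numerator / fan_in,
    where fan_in == width for every layer (an assumption of this sketch).
    Returns (mean squared input entry, mean squared output activation).
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(width)]
    in_len = sum(v * v for v in x) / width
    std = math.sqrt(var_numerator / width)
    for _ in range(depth):
        # One fully connected layer followed by ReLU; weights are drawn on the fly.
        x = [max(0.0, sum(rng.gauss(0.0, std) * v for v in x))
             for _ in range(width)]
    out_len = sum(v * v for v in x) / width
    return in_len, out_len


if __name__ == "__main__":
    width, depth = 64, 10
    for label, numerator in [("variance 2/fan-in", 2.0), ("variance 1/fan-in", 1.0)]:
        in_len, out_len = relu_forward_sq_length(width, depth, numerator)
        print(f"{label}: output/input squared length ratio = {out_len / in_len:.3e}")
```

Because ReLU is positively homogeneous and both runs share a seed, the 1/fan-in run is the 2/fan-in run with every layer scaled by 1/sqrt(2). Its squared activation length is therefore smaller by a factor of 2 per layer, which is the vanishing behavior the first failure mode describes. With variance 2/fan-in the squared length is preserved only in expectation, so a single run at finite width still fluctuates around the input value.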
