How to Start Training: The Effect of Initialization and Architecture

NeurIPS 2018 Boris HaninDavid Rolnick

We identify and study two common failure modes for early training in deep ReLU nets. For each we give a rigorous proof of when it occurs and how to avoid it, for fully connected and residual architectures... (read more)

PDF Abstract


No code implementations yet. Submit your code now


Results from the Paper

  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.