Positively Scale-Invariant Flatness of ReLU Neural Networks

6 Mar 2019 · Mingyang Yi, Qi Meng, Wei Chen, Zhi-Ming Ma, Tie-Yan Liu

It was empirically confirmed by Keskar et al. \cite{SharpMinima} that flatter minima generalize better. However, for the popular ReLU network, a sharp minimum can also generalize well \cite{SharpMinimacan}...
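
The argument hinges on the positive scale-invariance of ReLU networks: multiplying the incoming weights of a hidden unit by any c > 0 and dividing its outgoing weights by c leaves the network function unchanged, while the weights themselves (and hence any weight-space flatness measure) change. The sketch below is a minimal illustration of this property, not code from the paper:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def f(w1, w2, x):
    """Two-layer ReLU network: f(x) = W2 @ relu(W1 @ x)."""
    return w2 @ relu(w1 @ x)

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))  # incoming weights of the hidden layer
W2 = rng.standard_normal((2, 4))  # outgoing weights
x = rng.standard_normal(3)

# Positive scale-invariance: for any c > 0, relu(c * z) = c * relu(z),
# so (W2 / c) @ relu(c * W1 @ x) == W2 @ relu(W1 @ x).
c = 10.0
print(np.allclose(f(W1, W2, x), f(c * W1, W2 / c, x)))  # True
```

Because such rescalings can make the loss surface around a minimum arbitrarily sharp without changing the function it computes, flatness measured directly in weight space is not a well-defined predictor of generalization for ReLU networks, which is the motivation for the scale-invariant flatness measure studied here.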

Methods used in the Paper

METHOD   TYPE
ReLU     Activation Functions