no code implementations • 6 Feb 2019 • Akshay Rangamani, Nam H. Nguyen, Abhishek Kumar, Dzung Phan, Sang H. Chin, Trac. D. Tran
It has been empirically observed that the flatness of minima obtained from training deep networks seems to correlate with better generalization.