Noisy Softmax: Improving the Generalization Ability of DCNN via Postponing the Early Softmax Saturation

Over the past few years, softmax and SGD have become, respectively, a commonly used component and the default training strategy in CNN frameworks. However, when optimizing CNNs with SGD, the saturation behavior behind softmax always gives the illusion of training well and is therefore overlooked...
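
To make the saturation issue concrete: for softmax cross-entropy, the gradient with respect to the logits is p - y, so once the predicted probability of the true class gets close to 1 a sample contributes almost no gradient and effectively drops out of later SGD updates, even if its features are still poorly learned. The NumPy sketch below only illustrates this standard property of softmax; it is not the paper's Noisy Softmax method, and the example logits are arbitrary.

import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D logit vector.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def xent_grad_wrt_logits(z, target):
    # Gradient of cross-entropy w.r.t. the logits: p - y, with y one-hot.
    p = softmax(z)
    y = np.zeros_like(p)
    y[target] = 1.0
    return p - y

# A moderately confident prediction still yields a useful gradient ...
moderate = xent_grad_wrt_logits(np.array([2.0, 1.0, 0.5]), target=0)
# ... while a saturated, over-confident one yields a near-zero gradient,
# so this sample no longer drives the SGD update.
saturated = xent_grad_wrt_logits(np.array([20.0, 1.0, 0.5]), target=0)

print(np.linalg.norm(moderate))   # roughly 0.46
print(np.linalg.norm(saturated))  # roughly 1e-8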

Published at CVPR 2017.
No code implementations yet.

Methods used in the Paper


METHOD    TYPE
Softmax   Output Functions
SGD       Stochastic Optimization
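
The two methods listed above are the standard building blocks of the baseline that the paper modifies: a softmax output layer (applied inside the cross-entropy loss) optimized with SGD. The PyTorch snippet below is a generic sketch of that combination with an arbitrary toy model and hyperparameters; it does not implement the paper's Noisy Softmax.

import torch
import torch.nn as nn

# Toy classifier with a linear layer feeding the softmax output
# (the softmax is applied implicitly inside CrossEntropyLoss).
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# One SGD step on a random batch (stand-in for real image data).
images = torch.randn(32, 1, 28, 28)
labels = torch.randint(0, 10, (32,))

optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()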