Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs.
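
The normalization the abstract refers to is simple to sketch. As an illustrative example (not code released with the paper), the following Python/NumPy function normalizes each feature of a mini-batch to zero mean and unit variance and then applies the learned scale (gamma) and shift (beta) parameters; the function name batch_norm_forward and the eps stability constant are assumptions made for this sketch.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Normalize a mini-batch of layer inputs, then scale and shift."""
    # Per-feature mean and variance computed over the mini-batch.
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    # Normalize to zero mean and unit variance (eps avoids division by zero).
    x_hat = (x - mu) / np.sqrt(var + eps)
    # Learned scale (gamma) and shift (beta) restore representational power.
    return gamma * x_hat + beta

# Example: a mini-batch of 4 inputs with 3 features each.
x = np.random.randn(4, 3)
gamma, beta = np.ones(3), np.zeros(3)
y = batch_norm_forward(x, gamma, beta)
```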

Methods used in the Paper