no code implementations • 1 Jan 2021 • Mingwei Wei, David J. Schwab
The strength of this effect is proportional to the squared learning rate and the inverse batch size, and the effect is strongest during the early phase of training, when the model's predictions are poor.
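A minimal sketch (not the paper's code) of where this scaling comes from: the variance of a single mini-batch SGD update grows as the squared learning rate over the batch size. The toy quadratic loss, dataset, and all names below are assumptions made purely for illustration.

```python
# Illustrative only: empirical SGD update variance vs. the lr^2 / batch_size prediction.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=1.0, size=100_000)  # per-example "targets"
theta = 0.0                                          # single scalar parameter

def update_variance(lr, batch_size, n_steps=5_000):
    """Empirical variance of the SGD update -lr * mean(theta - batch)."""
    updates = []
    for _ in range(n_steps):
        batch = rng.choice(data, size=batch_size, replace=False)
        grad = np.mean(theta - batch)    # gradient of 0.5*(theta - x)^2, batch-averaged
        updates.append(-lr * grad)
    return np.var(updates)

for lr in (0.1, 0.2):
    for bs in (16, 64):
        v = update_variance(lr, bs)
        print(f"lr={lr:.1f} batch={bs:3d}  empirical var={v:.2e}  "
              f"lr^2/B prediction={lr**2 / bs:.2e}")
```

With unit per-example gradient variance, the printed empirical variance tracks lr²/B, which is the scaling the abstract refers to.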
no code implementations • 1 Oct 2019 • Mingwei Wei, David J. Schwab
Stochastic gradient descent (SGD) forms the core optimization method for deep neural networks.
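For context, a generic mini-batch SGD loop looks like the sketch below (an illustration of the standard update rule w ← w − lr·grad, not the specific method analyzed in the paper; the data and hyperparameters are arbitrary assumptions).

```python
# Illustrative only: mini-batch SGD on a toy linear-regression loss.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1_000, 5))
true_w = rng.normal(size=5)
y = X @ true_w + 0.1 * rng.normal(size=1_000)

w = np.zeros(5)
lr, batch_size = 0.05, 32
for step in range(2_000):
    idx = rng.integers(0, len(X), size=batch_size)   # sample a random mini-batch
    Xb, yb = X[idx], y[idx]
    grad = 2.0 * Xb.T @ (Xb @ w - yb) / batch_size   # gradient of the mean squared error
    w -= lr * grad                                   # SGD update
print("||w - true_w|| =", np.linalg.norm(w - true_w))
```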
no code implementations • ICLR 2019 • Mingwei Wei, James Stokes, David J. Schwab
Batch Normalization (BatchNorm) is an extremely useful component of modern neural network architectures, enabling optimization using higher learning rates and achieving faster convergence.
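As a reminder of the transform being discussed, here is a minimal sketch of the BatchNorm forward pass at training time (illustrative only; `gamma`, `beta`, and `eps` follow the usual conventions, and the running statistics used at inference are omitted).

```python
# Illustrative only: per-feature batch normalization followed by a learnable scale and shift.
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Normalize activations over the batch dimension, then scale and shift."""
    mean = x.mean(axis=0)                     # per-feature batch mean
    var = x.var(axis=0)                       # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)   # zero mean, unit variance per feature
    return gamma * x_hat + beta               # learnable scale and shift

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(64, 8))          # a batch of activations
out = batch_norm_forward(x, gamma=np.ones(8), beta=np.zeros(8))
print(out.mean(axis=0).round(3), out.std(axis=0).round(3))  # ~0 mean, ~1 std
```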