# The Impact of the Mini-batch Size on the Variance of Gradients in Stochastic Gradient Descent

27 Apr 2020 · Xin Qian, Diego Klabjan

The mini-batch stochastic gradient descent (SGD) algorithm is widely used in training machine learning models, in particular deep learning models. We study SGD dynamics under linear regression and two-layer linear networks, with an easy extension to deeper linear networks, by focusing on the variance of the gradients, which is the first study of this nature...
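To make the quantity under study concrete, here is a minimal sketch that estimates the variance of the mini-batch gradient for linear regression at several batch sizes. It is an illustration only, not the authors' experimental setup: the synthetic data, the squared-error loss scaling, the sampling without replacement, and the helper names `minibatch_gradient` and `gradient_variance` are all assumptions. It also measures the variance at a single fixed parameter vector, not along the SGD trajectory.

```python
import numpy as np


def minibatch_gradient(w, X, y, idx):
    """Gradient of 0.5 * mean((X w - y)^2) over the rows selected by idx."""
    Xb, yb = X[idx], y[idx]
    return Xb.T @ (Xb @ w - yb) / len(idx)


def gradient_variance(w, X, y, batch_size, n_draws, rng):
    """Monte Carlo estimate of the trace of the mini-batch gradient covariance at fixed w."""
    n = X.shape[0]
    grads = np.stack([
        minibatch_gradient(w, X, y, rng.choice(n, size=batch_size, replace=False))
        for _ in range(n_draws)
    ])
    # Sum of per-coordinate variances equals the trace of the covariance matrix.
    return grads.var(axis=0).sum()


# Synthetic linear regression problem (assumed, for illustration only).
rng = np.random.default_rng(0)
n, d = 512, 5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)
w = np.zeros(d)  # fixed point at which the gradient variance is measured

variances = {b: gradient_variance(w, X, y, b, 2000, rng) for b in (4, 32, 256)}
for b, v in variances.items():
    print(f"batch size {b:4d}: estimated gradient variance {v:.4f}")
```

At a fixed parameter vector the estimated variance shrinks as the batch size grows, which is the standard sampling result. How the variance behaves over the course of training is the harder question the paper addresses.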

