# The Impact of the Mini-batch Size on the Variance of Gradients in Stochastic Gradient Descent

27 Apr 2020 · Xin Qian, Diego Klabjan

The mini-batch stochastic gradient descent (SGD) algorithm is widely used in training machine learning models, in particular deep learning models. We study SGD dynamics under linear regression and two-layer linear networks, with an easy extension to deeper linear networks, by focusing on the variance of the gradients, which is the first study of this nature...
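The central quantity here is the variance of the mini-batch gradient as a function of the mini-batch size. Below is a minimal sketch (not the authors' code) that empirically estimates this quantity for ordinary squared-loss linear regression on synthetic data; all names, sizes, and constants are illustrative assumptions.

```python
# Minimal sketch: estimate how the variance of the mini-batch gradient
# scales with batch size for squared-loss linear regression.
# Synthetic data and all parameter values are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear-regression data: y = X w_true + noise
n, d = 10_000, 20
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

w = np.zeros(d)  # fixed point at which the gradient is measured

def minibatch_gradient(batch_size):
    """Gradient of the mean squared error over a random mini-batch at w."""
    idx = rng.choice(n, size=batch_size, replace=False)
    Xb, yb = X[idx], y[idx]
    return 2.0 / batch_size * Xb.T @ (Xb @ w - yb)

# Estimate the total (trace of the) gradient covariance for several batch sizes.
for b in (1, 4, 16, 64, 256):
    grads = np.stack([minibatch_gradient(b) for _ in range(2_000)])
    total_var = grads.var(axis=0).sum()  # sum of per-coordinate variances
    print(f"batch size {b:4d}: total gradient variance ~ {total_var:.4f}")

# When sampling without replacement, the variance shrinks roughly like
# (1/b) * (n - b) / (n - 1), i.e. close to 1/b whenever b << n.
```

Running the loop shows the estimated variance dropping roughly in proportion to 1/b, the standard scaling against which results of this kind are typically framed.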
