Bias-Variance Tradeoff in a Sliding Window Implementation of the Stochastic Gradient Algorithm

25 Oct 2019 · Yakup Ceki Papo

This paper provides a framework for analyzing stochastic gradient algorithms in a mean squared error (MSE) sense, using the asymptotic normality result for the stochastic gradient descent (SGD) iterates. We carry out this analysis by applying the asymptotic normality result to the finite-iteration case. Specifically, we consider problems where the gradient estimators are biased but have reduced variance, and we compare the iterates generated by these estimators to the iterates generated by the SGD algorithm. We use the work of Fabian to characterize the mean and the variance of the distribution of the iterates in terms of the bias and the covariance matrix of the gradient estimators. We introduce the sliding window SGD (SW-SGD) algorithm, together with a proof of its convergence, which incurs a lower MSE than the SGD algorithm on quadratic and convex problems. Lastly, we present numerical results that show the effectiveness of this framework and the superiority of the SW-SGD algorithm over the SGD algorithm.
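
The abstract does not spell out the SW-SGD update, so the following is only a minimal sketch of the kind of comparison it describes, under stated assumptions: the sliding window is taken to average the most recent noisy gradients (each evaluated at the iterate where it was drawn, which makes the estimator biased), the objective is the toy quadratic f(x) = 0.5 * ||x||^2 with minimizer 0, and the step size is constant. The names `run`, `mse_decomposition`, `window`, `step_size`, and `noise_std` are hypothetical and not taken from the paper. The sketch estimates the MSE of the final iterate over repeated runs and splits it into a squared-bias term and a variance term.

```python
from collections import deque

import numpy as np


def run(window, steps=300, dim=3, step_size=0.05, noise_std=1.0, seed=0):
    """Run one trajectory on f(x) = 0.5 * ||x||^2 with noisy gradients.

    window=1 is plain SGD. window>1 averages the last `window` noisy
    gradients (an assumed form of sliding window, not the paper's exact rule).
    """
    rng = np.random.default_rng(seed)
    x = np.ones(dim)
    recent = deque(maxlen=window)  # keeps only the newest `window` gradients
    for _ in range(steps):
        # Noisy gradient of the quadratic at the current iterate.
        recent.append(x + noise_std * rng.standard_normal(dim))
        # Averaging stale gradients trades bias for lower per-step variance.
        x = x - step_size * np.mean(list(recent), axis=0)
    return x


def mse_decomposition(window, trials=100):
    """Return (mse, squared_bias, variance) of the final iterate around 0."""
    finals = np.array([run(window, seed=s) for s in range(trials)])
    squared_bias = float(np.sum(finals.mean(axis=0) ** 2))
    variance = float(np.sum(finals.var(axis=0)))  # trace of empirical covariance
    mse = float(np.mean(np.sum(finals ** 2, axis=1)))
    return mse, squared_bias, variance


if __name__ == "__main__":
    for w in (1, 8):
        mse, b2, var = mse_decomposition(w)
        print(f"window={w}: mse={mse:.5f} bias^2={b2:.5f} variance={var:.5f}")
```

The empirical MSE equals the squared bias plus the trace of the empirical covariance, which is the decomposition the bias-variance comparison rests on. Whether the windowed run shows a lower MSE in this toy setting depends on the assumed step size, window length, and noise level, so the sketch should not be read as reproducing the paper's results.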
