12 May 2020 • Pierre-Yves Massé, Yann Ollivier
This setting is more data-agnostic and departs from standard SGD theory, especially in the range of admissible learning rates.
8 Nov 2015 • Pierre-Yves Massé, Yann Ollivier
The practical performance of online stochastic gradient descent algorithms is highly dependent on the chosen step size, which must be tediously hand-tuned in many applications.
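The step-size sensitivity described above can be illustrated with a minimal sketch (not the authors' method): plain gradient descent on a 1-D quadratic, where a moderate step size converges while an overly large one diverges.

```python
# Minimal sketch, assuming the toy objective f(w) = 0.5 * w**2
# with gradient f'(w) = w, to show how performance hinges on the step size.
def sgd(step_size, steps=50, w0=1.0):
    w = w0
    for _ in range(steps):
        w -= step_size * w  # gradient of 0.5 * w**2 is w
    return w

# A moderate step size drives w toward the minimum at 0,
# while a step size above 2 makes the iterates blow up.
near_zero = abs(sgd(0.1))   # small residual
diverged = abs(sgd(2.5))    # grows without bound
```

In practice the usable range of step sizes depends on the curvature of the objective, which is exactly what makes hand-tuning tedious.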