no code implementations • 6 Nov 2023 • Yuan Gao, Rustem Islamov, Sebastian Stich
Error Compensation (EC) is an extremely popular mechanism to mitigate the aforementioned issues during the training of models enhanced by contractive compression operators.
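The snippet above does not spell out the mechanism, so the following is only a minimal sketch of one common form of error compensation (often called error feedback), assuming a top-$k$ sparsifier as the contractive compressor and a toy quadratic objective. The names `top_k`, `ec_gd`, and the objective are hypothetical illustrations and are not taken from the paper.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def top_k(v: Vector, k: int) -> Vector:
    """Keep the k largest-magnitude entries of v and zero out the rest.

    Top-k is a standard example of a contractive compressor.
    """
    keep = set(sorted(range(len(v)), key=lambda i: abs(v[i]), reverse=True)[:k])
    return [vi if i in keep else 0.0 for i, vi in enumerate(v)]


def ec_gd(
    grad: Callable[[Vector], Vector],
    x0: Vector,
    step: float,
    k: int,
    iters: int,
) -> Tuple[Vector, Vector]:
    """Gradient descent with compressed updates and an error memory (assumed variant)."""
    x = list(x0)
    e = [0.0] * len(x)  # accumulated compression error
    for _ in range(iters):
        g = grad(x)
        # Add the error left over from earlier rounds to the scaled gradient.
        p = [ei + step * gi for ei, gi in zip(e, g)]
        c = top_k(p, k)  # only this compressed vector would be transmitted
        x = [xi - ci for xi, ci in zip(x, c)]
        # Remember whatever the compressor dropped, to be re-sent later.
        e = [pi - ci for pi, ci in zip(p, c)]
    return x, e


# Toy problem (assumption): f(x) = 0.5 * ||x - b||^2, whose gradient is x - b.
b = [3.0, -0.5, 1.2]
x_ec, err = ec_gd(lambda x: [xi - bi for xi, bi in zip(x, b)],
                  [0.0, 0.0, 0.0], step=0.1, k=1, iters=2000)
```

In this sketch only one coordinate is applied per step, yet the dropped coordinates are not lost: they stay in `e` and are applied in later iterations, which is the intuition behind compensating for compression error.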
no code implementations • 30 May 2022 • Amirkeivan Mohtashami, Martin Jaggi, Sebastian Stich
However, we show through a novel set of experiments that stochastic noise is not sufficient to explain good non-convex training, and that the effect of a large learning rate is itself essential for obtaining the best performance. We demonstrate the same effects in the noiseless case, i.e., for full-batch GD.
no code implementations • 18 Feb 2022 • Konstantin Mishchenko, Grigory Malinovsky, Sebastian Stich, Peter Richtárik
The canonical approach to solving such problems is via the proximal gradient descent (ProxGD) algorithm, which is based on the evaluation of the gradient of $f$ and the prox operator of $\psi$ in each iteration.
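As a minimal sketch of the ProxGD iteration $x_{k+1} = \mathrm{prox}_{\gamma\psi}\big(x_k - \gamma \nabla f(x_k)\big)$, the example below assumes a toy instance that is not from the paper: $f(x) = \tfrac{1}{2}\|x - b\|^2$ and $\psi(x) = \lambda \|x\|_1$, whose prox operator is soft-thresholding. The names `prox_gd`, `soft_threshold`, `b`, and `lam` are hypothetical.

```python
from typing import Callable, List

Vector = List[float]


def soft_threshold(v: Vector, tau: float) -> Vector:
    """Prox operator of tau * ||x||_1: shrink each entry toward zero by tau."""
    return [max(abs(vi) - tau, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]


def prox_gd(
    grad_f: Callable[[Vector], Vector],
    prox_psi: Callable[[Vector, float], Vector],
    x0: Vector,
    step: float,
    iters: int,
) -> Vector:
    """Each iteration uses one gradient of f and one prox of psi."""
    x = list(x0)
    for _ in range(iters):
        g = grad_f(x)
        y = [xi - step * gi for xi, gi in zip(x, g)]  # gradient step on f
        x = prox_psi(y, step)                         # prox step on psi
    return x


# Toy instance (assumption): f(x) = 0.5 * ||x - b||^2, psi(x) = lam * ||x||_1.
b = [3.0, -0.5, 1.2]
lam = 1.0
x_hat = prox_gd(
    grad_f=lambda x: [xi - bi for xi, bi in zip(x, b)],
    prox_psi=lambda v, step: soft_threshold(v, step * lam),
    x0=[0.0, 0.0, 0.0],
    step=0.5,
    iters=200,
)
```

For this particular instance the minimizer has the closed form `soft_threshold(b, lam)`, which gives a simple check that the iterates approach the right point.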
no code implementations • 1 Dec 2013 • Hemant Tyagi, Sebastian Stich, Bernd Gärtner
We consider a stochastic continuum armed bandit problem where the arms are indexed by the $\ell_2$ ball $B_{d}(1+\nu)$ of radius $1+\nu$ in $\mathbb{R}^d$.