no code implementations • 7 Feb 2024 • Petr Ostroukhov, Aigerim Zhumabayeva, Chulu Xiang, Alexander Gasnikov, Martin Takáč, Dmitry Kamzolov
To substantiate the efficacy of our method, we experimentally show how the introduction of an adaptive step size and an adaptive batch size gradually improves the performance of regular SGD.
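Only the abstract's opening sentence appears in this listing, so the paper's exact rules are not shown here. As a rough, assumed illustration of the general idea (not the authors' method), the sketch below runs SGD with a Polyak-style adaptive step size and a batch size that grows during training on a synthetic least-squares problem; every hyperparameter and schedule is an assumption for demonstration.

```python
# Illustrative sketch only: SGD with a Polyak-style adaptive step size and a
# periodically doubled batch size on least squares. The step-size and
# batch-size rules are assumptions, not the method from the paper above.
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 20
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d)           # consistent system, so the optimal loss is 0

def loss_and_grad(x, idx):
    r = A[idx] @ x - b[idx]
    return 0.5 * np.mean(r ** 2), A[idx].T @ r / len(idx)

x = np.zeros(d)
batch = 8                            # starting mini-batch size (assumption)
for t in range(1, 501):
    idx = rng.choice(n, size=batch, replace=False)
    f, g = loss_and_grad(x, idx)
    # Polyak-style adaptive step: gamma = (f - f*) / ||g||^2, with f* = 0 here.
    gamma = f / (np.dot(g, g) + 1e-12)
    x -= gamma * g
    if t % 100 == 0 and batch < n:   # simple adaptive batch rule: double periodically
        batch = min(2 * batch, n)

print("final full loss:", 0.5 * np.mean((A @ x - b) ** 2))
```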
1 code implementation • 28 Dec 2023 • Farshed Abdukhakimov, Chulu Xiang, Dmitry Kamzolov, Robert Gower, Martin Takáč
Adaptive optimization methods are among the most popular approaches for training Deep Neural Networks (DNNs).
1 code implementation • 3 Oct 2023 • Farshed Abdukhakimov, Chulu Xiang, Dmitry Kamzolov, Martin Takáč
Stochastic Gradient Descent (SGD) is one of many iterative optimization methods widely used to solve machine learning problems.
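As background for the snippet above, a generic mini-batch SGD iteration draws a random subset of the data and steps along the negative stochastic gradient. The sketch below shows that pattern on a synthetic logistic-regression problem; the step size, batch size, and data are illustrative assumptions, not details from the listed paper.

```python
# Generic mini-batch SGD sketch on logistic regression; hyperparameters are
# illustrative choices, not taken from the paper above.
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 5
X = rng.normal(size=(n, d))
y = (X @ rng.normal(size=d) + 0.1 * rng.normal(size=n) > 0).astype(float)  # labels in {0, 1}

def grad(w, idx):
    p = 1.0 / (1.0 + np.exp(-X[idx] @ w))       # sigmoid predictions
    return X[idx].T @ (p - y[idx]) / len(idx)   # gradient of mean log-loss

w = np.zeros(d)
lr, batch = 0.5, 32                              # illustrative hyperparameters
for _ in range(1000):
    idx = rng.integers(0, n, size=batch)         # sample a mini-batch
    w -= lr * grad(w, idx)                       # SGD update: w <- w - lr * g

print("training accuracy:", np.mean(((X @ w) > 0) == (y > 0.5)))
```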