no code implementations • 23 Dec 2023 • Lu Xia, Stefano Massei
Adaptive first-order optimizers are fundamental tools in deep learning, although they may suffer from poor generalization due to nonuniform gradient scaling.
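As a point of reference, here is a minimal NumPy sketch of one Adam-style update, showing the per-coordinate (nonuniform) gradient scaling the abstract refers to: each parameter is rescaled by its own running second-moment estimate. The function name and hyperparameter defaults are illustrative, not taken from the paper.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam-style update (illustrative sketch, not the paper's method).
    Each coordinate is divided by its own second-moment estimate, which is
    the nonuniform gradient scaling discussed above."""
    m = beta1 * m + (1 - beta1) * grad        # running first-moment estimate
    v = beta2 * v + (1 - beta2) * grad**2     # running second-moment estimate
    m_hat = m / (1 - beta1**t)                # bias corrections
    v_hat = v / (1 - beta2**t)
    # Per-coordinate effective step size lr / (sqrt(v_hat) + eps):
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# Usage: coordinates with large gradients get proportionally smaller steps.
theta, m, v = np.zeros(3), np.zeros(3), np.zeros(3)
theta, m, v = adam_step(theta, np.array([0.1, -2.0, 0.01]), m, v, t=1)
```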
no code implementations • 23 Jan 2023 • Lu Xia, Michiel E. Hochstenbach, Stefano Massei
When training neural networks with low-precision computation, rounding errors often cause stagnation or are detrimental to the convergence of the optimizers. In this paper, we study the influence of rounding errors on the convergence of the gradient descent method for problems satisfying the Polyak-Łojasiewicz inequality.
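For context, the Polyak-Łojasiewicz (PL) inequality referenced here has the standard form below, with PL constant $\mu > 0$ and minimum value $f^*$; the contraction bound is the textbook rate for exact gradient descent on an $L$-smooth objective, not a result quoted from the paper.

```latex
% PL inequality: the squared gradient norm dominates the optimality gap,
\[
  \tfrac{1}{2}\,\|\nabla f(x)\|^{2} \;\ge\; \mu \bigl( f(x) - f^{*} \bigr)
  \quad \text{for all } x .
\]
% For an L-smooth f, exact gradient descent with step size 1/L then contracts:
\[
  f(x_{k+1}) - f^{*} \;\le\; \Bigl( 1 - \tfrac{\mu}{L} \Bigr)
  \bigl( f(x_{k}) - f^{*} \bigr),
\]
% and rounding errors in low-precision arithmetic perturb this linear rate.
```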
no code implementations • 24 Feb 2022 • Lu Xia, Stefano Massei, Michiel E. Hochstenbach, Barry Koren
When implementing the gradient descent method in low precision, stochastic rounding schemes help prevent the stagnation of convergence caused by the vanishing gradient effect.
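A minimal sketch of stochastic rounding (the grid spacing `ulp` and the function name are placeholders, not the paper's notation): each value is rounded to a neighboring grid point with probability proportional to proximity, which makes the rounding unbiased and lets updates smaller than the grid spacing survive in expectation.

```python
import numpy as np

def stochastic_round(x, ulp, rng=None):
    """Round each entry of x to the grid of spacing `ulp` (a stand-in for the
    unit in the last place of a low-precision format). An entry rounds up with
    probability equal to its relative distance from the lower grid point, so
    the rounding is unbiased: E[SR(x)] = x."""
    rng = rng or np.random.default_rng()
    low = np.floor(x / ulp) * ulp            # grid point just below x
    p_up = (x - low) / ulp                   # probability of rounding up
    return low + ulp * (rng.random(x.shape) < p_up)

# Why this helps: with round-to-nearest, an update smaller than ulp/2 vanishes
# and the iterate stagnates; stochastic rounding still moves in expectation.
theta, grad, lr, ulp = np.array([1.0]), np.array([0.3]), 1e-3, 1e-2
print(np.round((theta - lr * grad) / ulp) * ulp)   # -> 1.0, update lost
print(stochastic_round(theta - lr * grad, ulp))    # 1.0 or 0.99, unbiased
```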
no code implementations • 24 Jan 2020 • Daniel Kressner, Jonas Latz, Stefano Massei, Elisabeth Ullmann
Many techniques for data science and uncertainty quantification demand efficient tools to handle Gaussian random fields, which are defined in terms of their mean functions and covariance operators.
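For orientation, a plain dense sketch of drawing one realization of a Gaussian random field from its mean function and covariance kernel via a Cholesky factorization; this is the textbook baseline whose cubic cost motivates more efficient tools, and all names and the kernel choice here are illustrative, not the paper's method.

```python
import numpy as np

def sample_grf(points, mean_fn, cov_fn, rng=None, jitter=1e-6):
    """Sample a Gaussian random field at `points` by forming the dense
    covariance matrix and factorizing it (O(n^3) baseline sketch)."""
    rng = rng or np.random.default_rng()
    mu = mean_fn(points)
    C = cov_fn(points[:, None], points[None, :])          # covariance matrix
    L = np.linalg.cholesky(C + jitter * np.eye(len(points)))  # small jitter for stability
    return mu + L @ rng.standard_normal(len(points))

# Example: zero mean, squared-exponential covariance with length scale 0.1.
x = np.linspace(0.0, 1.0, 200)
z = sample_grf(x,
               mean_fn=lambda p: np.zeros_like(p),
               cov_fn=lambda s, t: np.exp(-0.5 * ((s - t) / 0.1) ** 2))
```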