no code implementations • 23 Aug 2024 • Lukas Gonon, Arnulf Jentzen, Benno Kuckuck, Siyu Liang, Adrian Riekert, Philippe von Wurstemberger
While approximation methods for PDEs based on ANNs were first proposed in the 1990s, they have only gained wide popularity in the last decade with the rise of deep learning.
no code implementations • 29 Jul 2024 • Steffen Dereich, Arnulf Jentzen
In practically relevant training problems, the employed optimization scheme is usually not the plain vanilla SGD method but instead a suitably accelerated and adaptive SGD optimization method.
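As a point of reference, here is a minimal sketch contrasting a plain vanilla SGD update with the Adam update (one example of such an accelerated and adaptive method) on a toy least-squares problem; the toy objective, step sizes, and hyperparameters are illustrative choices and are not taken from the article.

```python
import numpy as np

def grad(theta, x, y):
    """Gradient of the least-squares objective 0.5*||x @ theta - y||^2."""
    return x.T @ (x @ theta - y)

rng = np.random.default_rng(0)
x = rng.normal(size=(256, 10))
theta_true = rng.normal(size=10)
y = x @ theta_true

theta_sgd = np.zeros(10)          # plain vanilla SGD iterate
theta_adam = np.zeros(10)         # Adam iterate
m, v = np.zeros(10), np.zeros(10) # Adam first/second moment estimates
lr, beta1, beta2, eps = 1e-3, 0.9, 0.999, 1e-8

for t in range(1, 2001):
    batch = rng.choice(256, size=32, replace=False)

    # Plain vanilla SGD: theta <- theta - lr * g
    theta_sgd -= lr * grad(theta_sgd, x[batch], y[batch])

    # Adam: exponentially weighted moments with bias correction
    g = grad(theta_adam, x[batch], y[batch])
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g**2
    m_hat = m / (1 - beta1**t)
    v_hat = v / (1 - beta2**t)
    theta_adam -= lr * m_hat / (np.sqrt(v_hat) + eps)

print("SGD error :", np.linalg.norm(theta_sgd - theta_true))
print("Adam error:", np.linalg.norm(theta_adam - theta_true))
```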
no code implementations • 11 Jul 2024 • Steffen Dereich, Robin Graeber, Arnulf Jentzen
Deep learning algorithms - typically consisting of a class of deep neural networks trained by a stochastic gradient descent (SGD) optimization method - are nowadays the key ingredients in many artificial intelligence (AI) systems and have revolutionized our ways of working and living in modern societies.
1 code implementation • 20 Jun 2024 • Steffen Dereich, Arnulf Jentzen, Adrian Riekert
In this work we propose and study a learning-rate-adaptive approach for SGD optimization methods in which the learning rate is adjusted based on empirical estimates for the values of the objective function of the considered optimization problem (the function that one intends to minimize).
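A minimal sketch of this general idea, assuming a toy least-squares objective and an illustrative monitoring interval and decay factor (not the specific scheme proposed and analyzed in the article), could look as follows: the learning rate is reduced whenever an empirical estimate of the objective value fails to improve.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(512, 20))
theta_true = rng.normal(size=20)
y = x @ theta_true + 0.1 * rng.normal(size=512)

def objective(theta, xb, yb):
    return 0.5 * np.mean((xb @ theta - yb) ** 2)

def gradient(theta, xb, yb):
    return xb.T @ (xb @ theta - yb) / len(yb)

theta = np.zeros(20)
lr = 0.5                      # deliberately large initial learning rate
check_every, decay = 50, 0.5  # illustrative monitoring interval and decay factor
best_estimate = np.inf

for step in range(1, 2001):
    batch = rng.choice(512, size=64, replace=False)
    theta -= lr * gradient(theta, x[batch], y[batch])

    if step % check_every == 0:
        # Empirical estimate of the objective value on a fresh monitoring batch
        monitor = rng.choice(512, size=128, replace=False)
        estimate = objective(theta, x[monitor], y[monitor])
        if estimate >= best_estimate:
            lr *= decay       # no observed improvement: reduce the learning rate
        best_estimate = min(best_estimate, estimate)

print("final learning rate:", lr)
print("final objective estimate:", objective(theta, x, y))
```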
no code implementations • 16 Jun 2024 • Julia Ackermann, Arnulf Jentzen, Benno Kuckuck, Joshua Lee Padgett
It is a challenging topic in applied mathematics to solve high-dimensional nonlinear partial differential equations (PDEs).
no code implementations • 7 Feb 2024 • Arnulf Jentzen, Adrian Riekert
In this work we solve this research problem in the situation of shallow ANNs with the rectified linear unit (ReLU) and related activations and the standard mean square error loss by disproving that, in the training of such ANNs, SGD methods (such as the plain vanilla SGD, the momentum SGD, the AdaGrad, the RMSprop, and the Adam optimizers) can find a global minimizer with high probability.
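For concreteness, the sketch below sets up the kind of training problem this result refers to: a shallow ReLU ANN trained with the standard mean square error loss by plain SGD. The target function, network width, and step size are arbitrary illustrative choices.

```python
import torch

torch.manual_seed(0)

# Supervised learning problem: fit a simple 1d target function
target = lambda x: torch.sin(3.0 * x)
x_train = torch.rand(1024, 1) * 2.0 - 1.0     # inputs uniform on [-1, 1]
y_train = target(x_train)

# Shallow ANN: one hidden layer with ReLU activation
model = torch.nn.Sequential(
    torch.nn.Linear(1, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 1),
)
loss_fn = torch.nn.MSELoss()                  # standard mean square error loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

for step in range(5000):
    idx = torch.randint(0, 1024, (64,))       # mini-batch for the SGD step
    optimizer.zero_grad()
    loss = loss_fn(model(x_train[idx]), y_train[idx])
    loss.backward()
    optimizer.step()

print("final training risk:", loss_fn(model(x_train), y_train).item())
```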
1 code implementation • 31 Oct 2023 • Arnulf Jentzen, Benno Kuckuck, Philippe von Wurstemberger
This book aims to provide an introduction to the topic of deep learning algorithms.
no code implementations • 24 Sep 2023 • Julia Ackermann, Arnulf Jentzen, Thomas Kruse, Benno Kuckuck, Joshua Lee Padgett
Recently, several deep learning (DL) methods for approximating high-dimensional partial differential equations (PDEs) have been proposed.
no code implementations • 28 Feb 2023 • Steffen Dereich, Arnulf Jentzen, Sebastian Kassing
Many mathematical convergence results for gradient descent (GD) based algorithms employ the assumption that the GD process is (almost surely) bounded and, also in concrete numerical simulations, divergence of the GD process may slow down, or even completely rule out, convergence of the error function.
1 code implementation • 7 Feb 2023 • Arnulf Jentzen, Adrian Riekert, Philippe von Wurstemberger
In the tested numerical examples the ADANN methodology significantly outperforms existing traditional approximation algorithms as well as existing deep operator learning methodologies from the literature.
no code implementations • 19 Jan 2023 • Lukas Gonon, Robin Graeber, Arnulf Jentzen
In particular, it is a key contribution of this work to reveal that for all $a, b\in\mathbb{R}$ with $b-a\geq 7$ the functions $[a, b]^d\ni x=(x_1,\dots, x_d)\mapsto\prod_{i=1}^d x_i\in\mathbb{R}$ for $d\in\mathbb{N}$ as well as the functions $[a, b]^d\ni x=(x_1,\dots, x_d)\mapsto\sin(\prod_{i=1}^d x_i)\in\mathbb{R}$ for $d\in\mathbb{N}$ can neither be approximated without the curse of dimensionality by means of shallow ANNs nor by means of insufficiently deep ANNs with ReLU activation, but can be approximated without the curse of dimensionality by sufficiently deep ANNs with ReLU activation.
no code implementations • 3 Aug 2022 • Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek
Dynamical systems theory has recently been applied in optimization to prove that gradient descent algorithms bypass so-called strict saddle points of the loss function.
no code implementations • 13 Jul 2022 • Simon Eberle, Arnulf Jentzen, Adrian Riekert, Georg Weiss
The training of artificial neural networks (ANNs) is nowadays a highly relevant algorithmic procedure with many applications in science and industry.
no code implementations • 27 Jun 2022 • Arnulf Jentzen, Timo Kröger
Furthermore, we prove that this upper bound only holds for sums of powers of the Lipschitz norm with the exponents $ 1/2 $ and $ 1 $ but does not hold for the Lipschitz norm alone.
no code implementations • 7 May 2022 • Victor Boussange, Sebastian Becker, Arnulf Jentzen, Benno Kuckuck, Loïc Pellissier
We evaluate the performance of the two methods on five different PDEs arising in physics and biology.
no code implementations • 17 Dec 2021 • Arnulf Jentzen, Adrian Riekert
In this article we study fully-connected feedforward deep ReLU ANNs with an arbitrarily large number of hidden layers and we prove convergence of the risk of the GD optimization method with random initializations in the training of such ANNs under the assumptions that (i) the unnormalized probability density function of the probability distribution of the input data of the considered supervised learning problem is piecewise polynomial, (ii) the target function (describing the relationship between the input data and the output data) is piecewise polynomial, and (iii) the risk function of the considered supervised learning problem admits at least one regular global minimum.
no code implementations • 13 Dec 2021 • Martin Hutzenthaler, Arnulf Jentzen, Katharina Pohl, Adrian Riekert, Luca Scarpa
In many numerical simulations stochastic gradient descent (SGD) type optimization methods perform very effectively in the training of deep neural networks (DNNs), but to this day it remains an open problem of research to provide a mathematical convergence analysis which rigorously explains the success of SGD type optimization methods in the training of DNNs.
no code implementations • 18 Aug 2021 • Simon Eberle, Arnulf Jentzen, Adrian Riekert, Georg S. Weiss
In the second main result of this article we prove, in the training of such ANNs and under the assumption that the target function and the density function of the probability distribution of the input data are piecewise polynomial, that every non-divergent GF trajectory converges with an appropriate rate of convergence to a critical point and that the risk of the non-divergent GF trajectory converges with rate 1 to the risk of that critical point.
no code implementations • 10 Aug 2021 • Arnulf Jentzen, Adrian Riekert
Despite the great success of GD type optimization methods in numerical simulations for the training of ANNs with ReLU activation, it remains - even in the simplest situation of the plain vanilla GD optimization method with random initializations and ANNs with one hidden layer - an open problem to prove (or disprove) the conjecture that the risk of the GD optimization method converges in the training of such ANNs to zero as the width of the ANNs, the number of independent random initializations, and the number of GD steps increase to infinity.
no code implementations • 9 Jul 2021 • Arnulf Jentzen, Adrian Riekert
Finally, in the special situation where there is only one neuron on the hidden layer (1-dimensional hidden layer), we strengthen the above result for affine linear target functions by proving that the risk of every (not necessarily bounded) GF trajectory converges to zero if the initial risk is sufficiently small.
no code implementations • 1 Apr 2021 • Arnulf Jentzen, Adrian Riekert
In this article we study the stochastic gradient descent (SGD) optimization method in the training of fully-connected feedforward artificial neural networks with ReLU activation.
no code implementations • 19 Mar 2021 • Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek
In this paper, we analyze the landscape of the true loss of neural networks with one hidden layer and ReLU, leaky ReLU, or quadratic activation.
no code implementations • 23 Feb 2021 • Arnulf Jentzen, Timo Kröger
In recent years, artificial neural networks have developed into a powerful tool for dealing with a multitude of problems for which classical solution approaches reach their limits.
no code implementations • 19 Feb 2021 • Patrick Cheridito, Arnulf Jentzen, Adrian Riekert, Florian Rossmannek
This Lyapunov function is the central tool in our convergence proof of the gradient descent method.
no code implementations • 22 Dec 2020 • Christian Beck, Martin Hutzenthaler, Arnulf Jentzen, Benno Kuckuck
It is one of the most challenging problems in applied mathematics to approximately solve high-dimensional partial differential equations (PDEs).
no code implementations • 15 Dec 2020 • Arnulf Jentzen, Adrian Riekert
Although deep learning based approximation algorithms have been applied very successfully to numerous problems, at the moment the reasons for their performance are not entirely understood from a mathematical point of view.
no code implementations • 2 Dec 2020 • Christian Beck, Sebastian Becker, Patrick Cheridito, Arnulf Jentzen, Ariel Neufeld
In this article we introduce and study a deep learning based approximation algorithm for solutions of stochastic partial differential equations (SPDEs).
no code implementations • 3 Jul 2020 • Aritz Bercher, Lukas Gonon, Arnulf Jentzen, Diyora Salimova
In applications one is often not only interested in the size of the error with respect to the objective function but also in the size of the error with respect to a test function which is possibly different from the objective function.
no code implementations • 12 Jun 2020 • Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek
Deep neural networks have successfully been trained in various application areas with stochastic gradient descent.
no code implementations • 3 Jun 2020 • Fabian Hornung, Arnulf Jentzen, Diyora Salimova
Each of these results establishes that DNNs overcome the curse of dimensionality in approximating suitable PDE solutions at a fixed time point $T>0$ and on a compact cube $[a, b]^d$ in space but none of these results provides an answer to the question whether the entire PDE solution on $[0, T]\times [a, b]^d$ can be approximated by DNNs without the curse of dimensionality.
no code implementations • 3 Mar 2020 • Arnulf Jentzen, Timo Welti
In spite of the accomplishments of deep learning based algorithms in numerous applications and very broad corresponding research interest, at the moment there is still no rigorous understanding of the reasons why such algorithms produce useful results in certain situations.
1 code implementation • 23 Dec 2019 • Sebastian Becker, Patrick Cheridito, Arnulf Jentzen
In this paper we introduce a deep learning method for pricing and hedging American-style options.
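The method of the paper is not reproduced here; as a rough illustration of how neural networks can enter American/Bermudan option pricing, the sketch below instead uses a neural-network variant of least-squares (regression) Monte Carlo: continuation values are regressed backwards in time with a small network, and the option is exercised when the immediate payoff exceeds the estimated continuation value. All market and discretization parameters are arbitrary.

```python
import torch

torch.manual_seed(0)

# Bermudan put on one asset under Black-Scholes dynamics (illustrative parameters)
s0, strike, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0
n_steps, n_paths = 10, 20000
dt = T / n_steps
disc = torch.exp(torch.tensor(-r * dt))

# Simulate geometric Brownian motion paths at the exercise dates dt, 2*dt, ..., T
z = torch.randn(n_paths, n_steps)
log_s = torch.log(torch.tensor(s0)) + torch.cumsum(
    (r - 0.5 * sigma**2) * dt + sigma * (dt ** 0.5) * z, dim=1
)
s = torch.exp(log_s)                               # shape (n_paths, n_steps)
payoff = lambda x: torch.clamp(strike - x, min=0.0)

def fit_continuation(x, y):
    """Regress discounted continuation values y on the current asset price x."""
    net = torch.nn.Sequential(
        torch.nn.Linear(1, 32), torch.nn.ReLU(), torch.nn.Linear(32, 1)
    )
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    xn = (x.unsqueeze(1) - strike) / strike        # crude input normalization
    for _ in range(200):
        opt.zero_grad()
        loss = torch.mean((net(xn).squeeze(1) - y) ** 2)
        loss.backward()
        opt.step()
    return lambda q: net((q.unsqueeze(1) - strike) / strike).squeeze(1)

# Backward induction: value at maturity is the terminal payoff
value = payoff(s[:, -1])
for t in range(n_steps - 2, -1, -1):
    value = disc * value                           # discount one step back
    exercise = payoff(s[:, t])
    itm = exercise > 0                             # regress on in-the-money paths only
    cont = fit_continuation(s[itm, t], value[itm])
    with torch.no_grad():
        stop = itm & (exercise > cont(s[:, t]))
    value = torch.where(stop, exercise, value)

price = (disc * value).mean().item()               # discount back to time 0
print("estimated Bermudan put price:", price)
```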
no code implementations • 9 Dec 2019 • Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek
In this paper, we develop a framework for showing that neural networks can overcome the curse of dimensionality in different high-dimensional approximation problems.
no code implementations • 20 Nov 2019 • Lukas Gonon, Philipp Grohs, Arnulf Jentzen, David Kofler, David Šiška
These mathematical results from the scientific literature prove in part that algorithms based on ANNs are capable of overcoming the curse of dimensionality in the numerical approximation of high-dimensional PDEs.
no code implementations • 30 Sep 2019 • Christian Beck, Arnulf Jentzen, Benno Kuckuck
In this work we estimate for a certain deep learning algorithm each of these three errors and combine these three error estimates to obtain an overall error analysis for the deep learning algorithm under consideration.
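In this line of work the three error sources are typically the approximation error, the generalization error, and the optimization error; a standard schematic decomposition of this kind (not necessarily the exact statement established in the article) reads

$$
\mathcal{R}(\Theta) - \inf_{f} \mathcal{R}(f)
\;\le\;
\underbrace{\inf_{\theta} \mathcal{R}(\theta) - \inf_{f} \mathcal{R}(f)}_{\text{approximation error}}
\;+\;
\underbrace{2\,\sup_{\theta} \bigl|\mathcal{R}(\theta) - \widehat{\mathcal{R}}(\theta)\bigr|}_{\text{generalization error}}
\;+\;
\underbrace{\widehat{\mathcal{R}}(\Theta) - \inf_{\theta} \widehat{\mathcal{R}}(\theta)}_{\text{optimization error}},
$$

where $\mathcal{R}$ denotes the true risk, $\widehat{\mathcal{R}}$ the empirical risk, $\theta$ ranges over the network parameters, and $\Theta$ is the parameter vector produced by the training algorithm.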
1 code implementation • 28 Aug 2019 • Philipp Grohs, Arnulf Jentzen, Diyora Salimova
One key argument in most of these results is, first, to use a Monte Carlo approximation scheme which can approximate the solution of the PDE under consideration at a fixed space-time point without the curse of dimensionality and, thereafter, to prove that DNNs are flexible enough to mimic the behaviour of the used approximation scheme.
no code implementations • 11 Aug 2019 • Philipp Grohs, Fabian Hornung, Arnulf Jentzen, Philipp Zimmermann
It is the subject of the main result of this article to provide space-time error estimates for DNN approximations of Euler approximations of certain perturbed differential equations.
no code implementations • 5 Aug 2019 • Sebastian Becker, Patrick Cheridito, Arnulf Jentzen, Timo Welti
We present numerical results for a large number of example problems, which include the pricing of many high-dimensional American and Bermudan options, such as Bermudan max-call options in up to 5000 dimensions.
no code implementations • 8 Jul 2019 • Christian Beck, Sebastian Becker, Patrick Cheridito, Arnulf Jentzen, Ariel Neufeld
In this paper we introduce a numerical method for nonlinear parabolic PDEs that combines operator splitting with deep learning.
no code implementations • 13 May 2019 • Julius Berner, Dennis Elbrächter, Philipp Grohs, Arnulf Jentzen
Although for neural networks with locally Lipschitz continuous activation functions the classical derivative exists almost everywhere, the standard chain rule is in general not applicable.
no code implementations • 2 Apr 2019 • Benjamin Fehrman, Benjamin Gess, Arnulf Jentzen
We prove the local convergence to minima and estimates on the rate of convergence for the stochastic gradient descent method in the case of objective functions that are not necessarily globally convex or contracting.
no code implementations • 19 Sep 2018 • Arnulf Jentzen, Diyora Salimova, Timo Welti
These numerical simulations indicate that DNNs seem to possess the fundamental flexibility to overcome the curse of dimensionality in the sense that the number of real parameters used to describe the DNN grows at most polynomially in both the reciprocal of the prescribed approximation accuracy $ \varepsilon > 0 $ and the dimension $ d \in \mathbb{N}$ of the function which the DNN aims to approximate in such computational problems.
no code implementations • 9 Sep 2018 • Julius Berner, Philipp Grohs, Arnulf Jentzen
It can be concluded that ERM over deep neural network hypothesis classes overcomes the curse of dimensionality for the numerical solution of linear Kolmogorov equations with affine coefficients.
no code implementations • 7 Sep 2018 • Philipp Grohs, Fabian Hornung, Arnulf Jentzen, Philippe von Wurstemberger
Such numerical simulations suggest that ANNs have the capacity to very efficiently approximate high-dimensional functions and, especially, indicate that ANNs seem to admit the fundamental power to overcome the curse of dimensionality when approximating the high-dimensional functions appearing in the above-mentioned computational problems.
no code implementations • 1 Jun 2018 • Christian Beck, Sebastian Becker, Philipp Grohs, Nor Jaafari, Arnulf Jentzen
Stochastic differential equations (SDEs) and the Kolmogorov partial differential equations (PDEs) associated to them have been widely used in models from engineering, finance, and the natural sciences.
no code implementations • 22 Mar 2018 • Arnulf Jentzen, Philippe von Wurstemberger
The stochastic gradient descent (SGD) optimization algorithm plays a central role in a series of machine learning applications.
no code implementations • 29 Jan 2018 • Arnulf Jentzen, Benno Kuckuck, Ariel Neufeld, Philippe von Wurstemberger
Stochastic gradient descent (SGD) optimization algorithms are key ingredients in a series of machine learning applications.
no code implementations • 18 Sep 2017 • Christian Beck, Weinan E, Arnulf Jentzen
The PDEs in such applications are high-dimensional as the dimension corresponds to the number of financial assets in a portfolio.
6 code implementations • 9 Jul 2017 • Jiequn Han, Arnulf Jentzen, Weinan E
Developing algorithms for solving high-dimensional partial differential equations (PDEs) has been an exceedingly difficult task for a long time, due to the notoriously difficult problem known as the "curse of dimensionality".
5 code implementations • 15 Jun 2017 • Weinan E, Jiequn Han, Arnulf Jentzen
We propose a new algorithm for solving parabolic partial differential equations (PDEs) and backward stochastic differential equations (BSDEs) in high dimension, by making an analogy between the BSDE and reinforcement learning with the gradient of the solution playing the role of the policy function, and the loss function given by the error between the prescribed terminal condition and the solution of the BSDE.
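A heavily simplified sketch of this deep BSDE idea is given below for a linear heat-equation-type terminal value problem (chosen so that the exact value $u(0,\xi)$ is known and can serve as a sanity check): a trainable scalar plays the role of $u(0,\xi)$, one small network per time step approximates the gradient of the solution (the "policy"), and the loss is the mismatch with the prescribed terminal condition. The equation, the network sizes, and all numerical parameters are illustrative assumptions rather than the configurations used in the papers.

```python
import torch

torch.manual_seed(0)

# Toy setting: u_t + 0.5*Laplace(u) = 0 on [0, T] x R^d with u(T, x) = g(x) = |x|^2,
# so the BSDE driver f vanishes and the exact value u(0, xi) = |xi|^2 + d*T
# serves as a sanity check for the sketch.
d, T, n_steps, batch = 10, 1.0, 20, 256
dt = T / n_steps
xi = torch.zeros(d)                       # point at which u(0, .) is approximated
g = lambda x: (x ** 2).sum(dim=1)         # prescribed terminal condition

# Trainable initial value Y_0 ~ u(0, xi) and one small network per time step
# approximating Z_n ~ (grad_x u)(t_n, X_n), which plays the role of the policy.
y0 = torch.nn.Parameter(torch.zeros(1))
z_nets = torch.nn.ModuleList([
    torch.nn.Sequential(torch.nn.Linear(d, 32), torch.nn.ReLU(), torch.nn.Linear(32, d))
    for _ in range(n_steps)
])
opt = torch.optim.Adam([y0, *z_nets.parameters()], lr=1e-2)

for it in range(2000):
    x = xi.expand(batch, d)
    y = y0.expand(batch)
    for n in range(n_steps):
        z = z_nets[n](x)
        dw = torch.randn(batch, d) * dt ** 0.5
        # Driver f(t, X, Y, Z) is zero here; a nonlinear PDE would subtract f*dt.
        y = y + (z * dw).sum(dim=1)
        x = x + dw                         # forward dynamics X_{n+1} = X_n + dW_n
    loss = torch.mean((y - g(x)) ** 2)     # mismatch with the terminal condition
    opt.zero_grad()
    loss.backward()
    opt.step()

print("approximate u(0, xi):", y0.item(), " exact:", (xi ** 2).sum().item() + d * T)
```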