1 code implementation • 28 Mar 2024 • Xuan Zhang, Jacob Helwig, Yuchao Lin, Yaochen Xie, Cong Fu, Stephan Wojtowytsch, Shuiwang Ji
While the U-Net architecture with skip connections is commonly used by prior studies to enable multi-scale processing, our analysis shows that the need for features to evolve across layers results in temporally misaligned features in skip connections, which limits the model's performance.
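A minimal sketch, assuming a generic PyTorch U-Net block rather than the architecture proposed in the paper, of the skip-connection pattern the analysis refers to: early encoder features are concatenated with further-processed decoder features, which is where a temporal mismatch between the two branches can arise.

```python
# Minimal sketch (assumption: PyTorch, generic U-Net block; not the paper's architecture)
# illustrating the skip connection the analysis refers to.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyUNet(nn.Module):
    def __init__(self, channels=16):
        super().__init__()
        self.enc = nn.Conv2d(1, channels, 3, padding=1)               # encoder stage
        self.bottleneck = nn.Conv2d(channels, channels, 3, padding=1)  # coarse stage
        self.dec = nn.Conv2d(2 * channels, 1, 3, padding=1)            # decoder sees skip + upsampled

    def forward(self, x):
        e = F.relu(self.enc(x))                            # high-resolution encoder features
        b = F.relu(self.bottleneck(F.avg_pool2d(e, 2)))    # features processed at coarse scale
        up = F.interpolate(b, scale_factor=2, mode="nearest")
        # Skip connection: early encoder features are concatenated with decoder features
        # that have been processed further through the network; for time-dependent PDEs
        # the two branches can correspond to different effective time points.
        return self.dec(torch.cat([e, up], dim=1))

u = TinyUNet()
out = u(torch.randn(1, 1, 32, 32))
print(out.shape)  # torch.Size([1, 1, 32, 32])
```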
no code implementations • 26 Oct 2023 • Jonathan W. Siegel, Stephan Wojtowytsch
In the case of stochastic gradient descent, the summability of $\mathbb E[f(x_n) - \inf f]$ is used to prove that $f(x_n)\to \inf f$ almost surely, an improvement on the almost sure convergence along a subsequence that follows from the $O(1/n)$ decay estimate.
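A sketch of the standard reasoning behind this step, assuming only $f(x_n) \geq \inf f$ and the stated summability:

```latex
% Sketch, assuming f(x_n) >= inf f and the summability stated above.
\[
\sum_{n=1}^{\infty} \mathbb{E}\bigl[f(x_n) - \inf f\bigr] < \infty
\;\stackrel{\text{Tonelli}}{\Longrightarrow}\;
\mathbb{E}\Bigl[\sum_{n=1}^{\infty} \bigl(f(x_n) - \inf f\bigr)\Bigr] < \infty
\;\Longrightarrow\;
\sum_{n=1}^{\infty} \bigl(f(x_n) - \inf f\bigr) < \infty \ \text{a.s.}
\;\Longrightarrow\;
f(x_n) \to \inf f \ \text{a.s.}
\]
```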
1 code implementation • 9 Jun 2023 • Jacob Helwig, Xuan Zhang, Cong Fu, Jerry Kurtin, Stephan Wojtowytsch, Shuiwang Ji
We consider solving partial differential equations (PDEs) with Fourier neural operators (FNOs), which operate in the frequency domain.
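As a minimal illustration of what operating in the frequency domain means for an FNO layer, here is a sketch of a 1-D spectral convolution in NumPy; the function name, single-channel setting, and random weights are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a 1-D spectral convolution, the core operation in an FNO layer
# (assumption: NumPy, single channel, illustrative weights; not the authors' code).
import numpy as np

def spectral_conv_1d(u, weights, n_modes):
    """Multiply the lowest n_modes Fourier coefficients of u by learned weights."""
    u_hat = np.fft.rfft(u)                          # go to the frequency domain
    out_hat = np.zeros_like(u_hat)
    out_hat[:n_modes] = weights * u_hat[:n_modes]   # act only on the retained low modes
    return np.fft.irfft(out_hat, n=len(u))          # back to the spatial domain

x = np.linspace(0, 2 * np.pi, 128, endpoint=False)
u = np.sin(3 * x)
w = np.random.randn(8) + 1j * np.random.randn(8)    # one complex weight per retained mode
v = spectral_conv_1d(u, w, n_modes=8)
print(v.shape)  # (128,)
```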
no code implementations • 10 Feb 2023 • Kanan Gupta, Jonathan Siegel, Stephan Wojtowytsch
We present a generalization of Nesterov's accelerated gradient descent algorithm.
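For reference, a sketch of the classical Nesterov iteration being generalized, in its standard convex-optimization form with a fixed step size; the quadratic test objective and step size below are illustrative assumptions only.

```python
# Hedged sketch of classical Nesterov accelerated gradient descent
# (fixed step size; the objective below is an illustrative example).
import numpy as np

def nesterov_agd(grad, x0, step, n_iter):
    x_prev = x0.copy()
    x = x0.copy()
    for k in range(1, n_iter + 1):
        momentum = (k - 1) / (k + 2)
        y = x + momentum * (x - x_prev)    # extrapolation (look-ahead) point
        x_prev, x = x, y - step * grad(y)  # gradient step taken at the look-ahead point
    return x

# Example: minimize f(x) = 0.5 * ||A x - b||^2.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
grad = lambda x: A.T @ (A @ x - b)
x_star = nesterov_agd(grad, np.zeros(2), step=0.05, n_iter=200)
print(x_star)  # close to the least-squares solution of A x = b
```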
no code implementations • 2 Sep 2022 • Stephan Wojtowytsch
In this note, we study how neural networks with a single hidden layer and ReLU activation interpolate data drawn from a radially symmetric distribution with target labels 1 at the origin and 0 outside the unit ball, if no labels are known inside the unit ball.
no code implementations • 25 Mar 2022 • Josiah Park, Stephan Wojtowytsch
We prove for both real and complex networks with non-polynomial activation that the closure of the class of neural networks coincides with the closure of the space of polynomials.
no code implementations • 4 Jun 2021 • Stephan Wojtowytsch
The representation of functions by artificial neural networks depends on a large number of parameters in a non-linear fashion.
no code implementations • 4 May 2021 • Stephan Wojtowytsch
Stochastic gradient descent (SGD) is one of the most popular algorithms in modern machine learning.
no code implementations • 10 Dec 2020 • Weinan E, Stephan Wojtowytsch
A recent numerical study observed that neural network classifiers enjoy a large degree of symmetry in the penultimate layer.
no code implementations • 2 Dec 2020 • Weinan E, Stephan Wojtowytsch
We use explicit representation formulas to show that solutions to certain partial differential equations lie in Barron spaces or multilayer spaces if the PDE data lie in such function spaces.
no code implementations • 28 Sep 2020 • Weinan E, Stephan Wojtowytsch
We consider binary and multi-class classification problems using hypothesis classes of neural networks.
no code implementations • 22 Sep 2020 • Weinan E, Chao Ma, Stephan Wojtowytsch, Lei Wu
The purpose of this article is to review the achievements made in the last few years towards the understanding of the reasons behind the success and subtleties of neural network-based machine learning.
no code implementations • 30 Jul 2020 • Weinan E, Stephan Wojtowytsch
The key to this work is a new way of representing functions as certain expectations, motivated by multi-layer neural networks.
no code implementations • 10 Jun 2020 • Weinan E, Stephan Wojtowytsch
We study the natural function space for infinitely wide two-layer neural networks with ReLU activation (Barron space) and establish different representation formulae.
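One common formulation of such a representation formula and the associated norm, stated here as a sketch for ReLU activation $\sigma$ (one of several equivalent versions in the literature):

```latex
% One standard formulation (ReLU activation sigma); the infimum is over all
% probability measures mu for which the representation below holds.
\[
f(x) = \mathbb{E}_{(a,w,b) \sim \mu}\bigl[a\,\sigma(w^{\mathsf T} x + b)\bigr],
\qquad
\|f\|_{\mathcal{B}} = \inf_{\mu} \mathbb{E}_{(a,w,b) \sim \mu}\bigl[|a|\,(\|w\|_{1} + |b|)\bigr].
\]
```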
no code implementations • 27 May 2020 • Stephan Wojtowytsch
The condition does not depend on the initialization of parameters and concerns only the weak convergence of the realization of the neural network, not its parameter distribution.
no code implementations • 21 May 2020 • Weinan E, Stephan Wojtowytsch
We establish a scale separation of Kolmogorov width type between subspaces of a given Banach space under the condition that a sequence of linear maps converges much faster on one of the subspaces.
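For context, the classical Kolmogorov $n$-width this terminology refers to (the paper's scale separation concerns two fixed subspaces, so this definition is background rather than the statement itself):

```latex
% Classical Kolmogorov n-width of a set K in a Banach space X: the best possible
% worst-case error when approximating K by n-dimensional linear subspaces V.
\[
d_n(K, X) = \inf_{\substack{V \subseteq X \\ \dim V \le n}} \; \sup_{f \in K} \; \inf_{g \in V} \|f - g\|_{X}.
\]
```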
no code implementations • 21 May 2020 • Stephan Wojtowytsch, Weinan E
Thus gradient descent training for fitting reasonably smooth but truly high-dimensional data may be subject to the curse of dimensionality.