Search Results for author: Jonathan W. Siegel

Found 14 papers, 1 paper with code

Equivariant Frames and the Impossibility of Continuous Canonicalization

no code implementations • 25 Feb 2024 • Nadav Dym, Hannah Lawrence, Jonathan W. Siegel

Canonicalization provides an architecture-agnostic method for enforcing equivariance, with generalizations such as frame-averaging recently gaining prominence as a lightweight and flexible alternative to equivariant architectures.
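
For orientation, canonicalization enforces a symmetry by mapping each input to a fixed representative of its group orbit before applying an unconstrained backbone. The following Python sketch illustrates the idea for the simple sign-flip group $\{\pm 1\}$ acting on $\mathbb{R}^d$; it is a toy illustration of the general mechanism, not the construction studied in the paper, and the function names are chosen purely for exposition.

import numpy as np

def canonicalize(x):
    # Pick a representative of the orbit {x, -x}: flip x so that its
    # first nonzero coordinate is positive.
    nz = np.flatnonzero(x)
    s = np.sign(x[nz[0]]) if nz.size > 0 else 1.0
    return s * x, s

def equivariant_model(x, backbone):
    # Wrap an arbitrary map so that the result is sign-equivariant:
    # equivariant_model(-x, backbone) == -equivariant_model(x, backbone).
    x_can, s = canonicalize(x)
    return s * backbone(x_can)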

A qualitative difference between gradient flows of convex functions in finite- and infinite-dimensional Hilbert spaces

no code implementations • 26 Oct 2023 • Jonathan W. Siegel, Stephan Wojtowytsch

In the case of stochastic gradient descent, the summability of $\mathbb E[f(x_n) - \inf f]$ is used to prove that $f(x_n)\to \inf f$ almost surely, an improvement on the convergence almost surely along a subsequence which follows from the $O(1/n)$ decay estimate.
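
Spelled out, the summability step runs as follows (a standard argument, included here for clarity):

$$\sum_{n}\mathbb E[f(x_n)-\inf f]<\infty \;\Longrightarrow\; \mathbb E\Big[\sum_{n}\big(f(x_n)-\inf f\big)\Big]<\infty \;\Longrightarrow\; \sum_{n}\big(f(x_n)-\inf f\big)<\infty \text{ a.s.} \;\Longrightarrow\; f(x_n)\to\inf f \text{ a.s.},$$

where the first implication uses Tonelli's theorem together with the nonnegativity of $f(x_n)-\inf f$.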

Weighted variation spaces and approximation by shallow ReLU networks

no code implementations • 28 Jul 2023 • Ronald DeVore, Robert D. Nowak, Rahul Parhi, Jonathan W. Siegel

A new and more appropriate definition of model classes on domains is given by introducing the concept of weighted variation spaces.

Optimal Approximation of Zonoids and Uniform Approximation by Shallow Neural Networks

no code implementations • 28 Jul 2023 • Jonathan W. Siegel

The second is to determine optimal approximation rates in the uniform norm for shallow ReLU$^k$ neural networks on their variation spaces.
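
For readers unfamiliar with the notation (a standard convention in this line of work, recalled here for convenience), ReLU$^k$ denotes the activation

$$\sigma_k(x)=\max(0,x)^k,$$

so that $k=1$ recovers the ordinary ReLU.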

Sharp Convergence Rates for Matching Pursuit

no code implementations • 15 Jul 2023 • Jason M. Klusowski, Jonathan W. Siegel

We study the fundamental limits of matching pursuit, or the pure greedy algorithm, for approximating a target function by a sparse linear combination of elements from a dictionary.
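
As a concrete reference point, the pure greedy algorithm admits a very short implementation. The Python sketch below assumes a finite dictionary with unit-norm atoms stored as the columns of a matrix D (an illustrative simplification; the paper works in a general Hilbert-space setting).

import numpy as np

def matching_pursuit(f, D, n_iter):
    # Pure greedy algorithm / matching pursuit.
    # f: target vector; D: dictionary whose columns are unit-norm atoms.
    residual = f.astype(float).copy()
    approx = np.zeros_like(residual)
    for _ in range(n_iter):
        # Select the atom most correlated with the current residual.
        corr = D.T @ residual
        j = np.argmax(np.abs(corr))
        # Move the selected component from the residual to the approximation.
        approx = approx + corr[j] * D[:, j]
        residual = residual - corr[j] * D[:, j]
    return approx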

Sharp Lower Bounds on Interpolation by Deep ReLU Neural Networks at Irregularly Spaced Data

no code implementations • 2 Feb 2023 • Jonathan W. Siegel

Specifically, we consider the question of how efficiently, in terms of the number of parameters, deep ReLU networks can interpolate values at $N$ datapoints in the unit ball which are separated by a distance $\delta$.
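
In symbols, restating the setup for clarity, the question is to bound the minimal number of parameters of a deep ReLU network $f$ satisfying

$$f(x_i)=y_i \text{ for } i=1,\dots,N, \qquad \|x_i\|\leq 1, \qquad \min_{i\neq j}\|x_i-x_j\|\geq\delta,$$

for arbitrary target values $y_i$, in terms of $N$ and the separation $\delta$.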

Task: Memorization

Optimal Approximation Rates for Deep ReLU Neural Networks on Sobolev and Besov Spaces

no code implementations • 25 Nov 2022 • Jonathan W. Siegel

We study the problem of how efficiently, in terms of the number of parameters, deep neural networks with the ReLU activation function can approximate functions in the Sobolev spaces $W^s(L_q(\Omega))$ and Besov spaces $B^s_r(L_q(\Omega))$, with error measured in the $L_p(\Omega)$ norm.

On the Activation Function Dependence of the Spectral Bias of Neural Networks

no code implementations • 9 Aug 2022 • Qingguo Hong, Jonathan W. Siegel, Qinyang Tan, Jinchao Xu

Our empirical studies also show that neural networks with the Hat activation function are trained significantly faster using stochastic gradient descent and Adam.
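
As a point of reference, a common choice of hat function is the piecewise-linear bump supported on $[0,2]$, which can be written as a short PyTorch function as below; the precise definition used in the paper may differ, so treat this as an illustrative assumption.

import torch

def hat(x):
    # Piecewise-linear hat function supported on [0, 2]:
    # rises linearly from 0 to 1 on [0, 1] and falls back to 0 on [1, 2].
    return torch.relu(x) - 2 * torch.relu(x - 1) + torch.relu(x - 2)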

Task: Image Classification

Sharp Lower Bounds on the Approximation Rate of Shallow Neural Networks

no code implementations • 28 Jun 2021 • Jonathan W. Siegel, Jinchao Xu

In this article, we provide a solution to this problem by proving sharp lower bounds on the approximation rates for shallow neural networks, which are obtained by lower bounding the $L^2$-metric entropy of the convex hull of the neural network basis functions.

Characterization of the Variation Spaces Corresponding to Shallow Neural Networks

no code implementations • 28 Jun 2021 • Jonathan W. Siegel, Jinchao Xu

We study the variation space corresponding to a dictionary of functions in $L^2(\Omega)$ for a bounded domain $\Omega\subset \mathbb{R}^d$.
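
Concretely, for a dictionary $\mathbb{D}\subset L^2(\Omega)$ the variation norm is commonly defined (recalled here for orientation; see the paper for the precise setting) by

$$\|f\|_{\mathcal{K}_1(\mathbb{D})}=\inf\bigl\{c>0:\ f\in c\,\overline{\mathrm{conv}}(\mathbb{D}\cup-\mathbb{D})\bigr\},$$

where the closure is taken in $L^2(\Omega)$, and the variation space consists of all $f$ for which this norm is finite.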

Sharp Bounds on the Approximation Rates, Metric Entropy, and $n$-widths of Shallow Neural Networks

no code implementations • 29 Jan 2021 • Jonathan W. Siegel, Jinchao Xu

This result gives sharp lower bounds on the $L^2$-approximation rates, metric entropy, and $n$-widths for variation spaces corresponding to neural networks with a range of important activation functions, including ReLU$^k$ activation functions and sigmoidal activation functions with bounded variation.

High-Order Approximation Rates for Shallow Neural Networks with Cosine and ReLU$^k$ Activation Functions

no code implementations • 14 Dec 2020 • Jonathan W. Siegel, Jinchao Xu

We show that as the smoothness index $s$ of $f$ increases, shallow neural networks with ReLU$^k$ activation function obtain an improved approximation rate up to a best possible rate of $O(n^{-(k+1)}\log(n))$ in $L^2$, independent of the dimension $d$.

Numerical Analysis (MSC 41A25)

Training Sparse Neural Networks using Compressed Sensing

1 code implementation • 21 Aug 2020 • Jonathan W. Siegel, Jianhong Chen, Pengchuan Zhang, Jinchao Xu

The adaptive weighting we introduce corresponds to a novel regularizer based on the logarithm of the absolute value of the weights.
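
A minimal sketch of such a log-of-absolute-value penalty in PyTorch is given below; the smoothing constant eps and the uniform weighting over all parameters are illustrative assumptions rather than the paper's exact formulation.

import torch

def log_penalty(model, eps=1e-3):
    # Sparsity-promoting regularizer of the form sum_i log(|w_i| + eps),
    # summed over all trainable parameters of the model.
    return sum(torch.log(p.abs() + eps).sum() for p in model.parameters())

# Usage (hypothetical): loss = data_loss + lam * log_penalty(model)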

Approximation Rates for Neural Networks with General Activation Functions

no code implementations • 4 Apr 2019 • Jonathan W. Siegel, Jinchao Xu

Our first result concerns the rate of approximation of a two-layer neural network with a polynomially decaying non-sigmoidal activation function.
