Search Results for author: Jonathan W. Siegel

Found 14 papers, 1 paper with code

Equivariant Frames and the Impossibility of Continuous Canonicalization

no code implementations • 25 Feb 2024 • Nadav Dym, Hannah Lawrence, Jonathan W. Siegel

Canonicalization provides an architecture-agnostic method for enforcing equivariance, with generalizations such as frame-averaging recently gaining prominence as a lightweight and flexible alternative to equivariant architectures.
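
For orientation, canonicalization enforces a symmetry by mapping each input to a fixed representative of its group orbit before applying an unconstrained backbone. The following Python sketch illustrates the idea for the simple sign-flip group $\{\pm 1\}$ acting on $\mathbb{R}^d$; it is a toy illustration of the general mechanism, not the construction studied in the paper, and the function names are chosen purely for exposition.

import numpy as np

def canonicalize(x):
    # Pick a representative of the orbit {x, -x}: flip x so that its
    # first nonzero coordinate is positive.
    nz = np.flatnonzero(x)
    s = np.sign(x[nz[0]]) if nz.size > 0 else 1.0
    return s * x, s

def equivariant_model(x, backbone):
    # Wrap an arbitrary map so that the result is sign-equivariant:
    # equivariant_model(-x, backbone) == -equivariant_model(x, backbone).
    x_can, s = canonicalize(x)
    return s * backbone(x_can)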

A qualitative difference between gradient flows of convex functions in finite- and infinite-dimensional Hilbert spaces

no code implementations • 26 Oct 2023 • Jonathan W. Siegel, Stephan Wojtowytsch

In the case of stochastic gradient descent, the summability of $\mathbb E[f(x_n) - \inf f]$ is used to prove that $f(x_n)\to \inf f$ almost surely, an improvement on the convergence almost surely along a subsequence which follows from the $O(1/n)$ decay estimate.
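
Spelled out, the summability step runs as follows (a standard argument, included here for clarity):

$$\sum_{n}\mathbb E[f(x_n)-\inf f]<\infty \;\Longrightarrow\; \mathbb E\Big[\sum_{n}\big(f(x_n)-\inf f\big)\Big]<\infty \;\Longrightarrow\; \sum_{n}\big(f(x_n)-\inf f\big)<\infty \text{ a.s.} \;\Longrightarrow\; f(x_n)\to\inf f \text{ a.s.},$$

where the first implication uses Tonelli's theorem together with the nonnegativity of $f(x_n)-\inf f$.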

Weighted variation spaces and approximation by shallow ReLU networks

no code implementations • 28 Jul 2023 • Ronald DeVore, Robert D. Nowak, Rahul Parhi, Jonathan W. Siegel

A new and more appropriate definition of model classes on domains is given by introducing the concept of weighted variation spaces.

Optimal Approximation of Zonoids and Uniform Approximation by Shallow Neural Networks

no code implementations • 28 Jul 2023 • Jonathan W. Siegel

The second is to determine optimal approximation rates in the uniform norm for shallow ReLU$^k$ neural networks on their variation spaces.
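
For readers unfamiliar with the notation (a standard convention in this line of work, recalled here for convenience), ReLU$^k$ denotes the activation

$$\sigma_k(x)=\max(0,x)^k,$$

so that $k=1$ recovers the ordinary ReLU.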

Sharp Convergence Rates for Matching Pursuit

no code implementations • 15 Jul 2023 • Jason M. Klusowski, Jonathan W. Siegel

We study the fundamental limits of matching pursuit, or the pure greedy algorithm, for approximating a target function by a sparse linear combination of elements from a dictionary.
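
As a concrete reference point, the pure greedy algorithm admits a very short implementation. The Python sketch below assumes a finite dictionary with unit-norm atoms stored as the columns of a matrix D (an illustrative simplification; the paper works in a general Hilbert-space setting).

import numpy as np

def matching_pursuit(f, D, n_iter):
    # Pure greedy algorithm / matching pursuit.
    # f: target vector; D: dictionary whose columns are unit-norm atoms.
    residual = f.astype(float).copy()
    approx = np.zeros_like(residual)
    for _ in range(n_iter):
        # Select the atom most correlated with the current residual.
        corr = D.T @ residual
        j = np.argmax(np.abs(corr))
        # Move the selected component from the residual to the approximation.
        approx = approx + corr[j] * D[:, j]
        residual = residual - corr[j] * D[:, j]
    return approx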

Sharp Lower Bounds on Interpolation by Deep ReLU Neural Networks at Irregularly Spaced Data

no code implementations • 2 Feb 2023 • Jonathan W. Siegel

Specifically, we consider the question of how efficiently, in terms of the number of parameters, deep ReLU networks can interpolate values at $N$ datapoints in the unit ball which are separated by a distance $\delta$.
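
In symbols, restating the setup for clarity, the question is to bound the minimal number of parameters of a deep ReLU network $f$ satisfying

$$f(x_i)=y_i \text{ for } i=1,\dots,N, \qquad \|x_i\|\leq 1, \qquad \min_{i\neq j}\|x_i-x_j\|\geq\delta,$$

for arbitrary target values $y_i$, in terms of $N$ and the separation $\delta$.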

Task: Memorization

Optimal Approximation Rates for Deep ReLU Neural Networks on Sobolev and Besov Spaces

no code implementations • 25 Nov 2022 • Jonathan W. Siegel

We study the problem of how efficiently, in terms of the number of parameters, deep neural networks with the ReLU activation function can approximate functions in the Sobolev spaces $W^s(L_q(\Omega))$ and Besov spaces $B^s_r(L_q(\Omega))$, with error measured in the $L_p(\Omega)$ norm.

On the Activation Function Dependence of the Spectral Bias of Neural Networks

no code implementations • 9 Aug 2022 • Qingguo Hong, Jonathan W. Siegel, Qinyang Tan, Jinchao Xu

Our empirical studies also show that neural networks with the Hat activation function are trained significantly faster using stochastic gradient descent and Adam.
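
As a point of reference, a common choice of hat function is the piecewise-linear bump supported on $[0,2]$, which can be written as a short PyTorch function as below; the precise definition used in the paper may differ, so treat this as an illustrative assumption.

import torch

def hat(x):
    # Piecewise-linear hat function supported on [0, 2]:
    # rises linearly from 0 to 1 on [0, 1] and falls back to 0 on [1, 2].
    return torch.relu(x) - 2 * torch.relu(x - 1) + torch.relu(x - 2)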

Task: Image Classification

Sharp Lower Bounds on the Approximation Rate of Shallow Neural Networks

no code implementations • 28 Jun 2021 • Jonathan W. Siegel, Jinchao Xu

In this article, we provide a solution to this problem by proving sharp lower bounds on the approximation rates for shallow neural networks, which are obtained by lower bounding the $L^2$-metric entropy of the convex hull of the neural network basis functions.

Characterization of the Variation Spaces Corresponding to Shallow Neural Networks

no code implementations • 28 Jun 2021 • Jonathan W. Siegel, Jinchao Xu

We study the variation space corresponding to a dictionary of functions in $L^2(\Omega)$ for a bounded domain $\Omega\subset \mathbb{R}^d$.
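
Concretely, for a dictionary $\mathbb{D}\subset L^2(\Omega)$ the variation norm is commonly defined (recalled here for orientation; see the paper for the precise setting) by

$$\|f\|_{\mathcal{K}_1(\mathbb{D})}=\inf\bigl\{c>0:\ f\in c\,\overline{\mathrm{conv}}(\mathbb{D}\cup-\mathbb{D})\bigr\},$$

where the closure is taken in $L^2(\Omega)$, and the variation space consists of all $f$ for which this norm is finite.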

Sharp Bounds on the Approximation Rates, Metric Entropy, and $n$-widths of Shallow Neural Networks

no code implementations • 29 Jan 2021 • Jonathan W. Siegel, Jinchao Xu

This result gives sharp lower bounds on the $L^2$-approximation rates, metric entropy, and $n$-widths for variation spaces corresponding to neural networks with a range of important activation functions, including ReLU$^k$ activation functions and sigmoidal activation functions with bounded variation.

High-Order Approximation Rates for Shallow Neural Networks with Cosine and ReLU$^k$ Activation Functions

no code implementations • 14 Dec 2020 • Jonathan W. Siegel, Jinchao Xu

We show that as the smoothness index $s$ of $f$ increases, shallow neural networks with ReLU$^k$ activation function obtain an improved approximation rate up to a best possible rate of $O(n^{-(k+1)}\log(n))$ in $L^2$, independent of the dimension $d$.

Numerical Analysis (MSC 41A25)

Training Sparse Neural Networks using Compressed Sensing

1 code implementation • 21 Aug 2020 • Jonathan W. Siegel, Jianhong Chen, Pengchuan Zhang, Jinchao Xu

The adaptive weighting we introduce corresponds to a novel regularizer based on the logarithm of the absolute value of the weights.
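
A minimal sketch of such a log-of-absolute-value penalty in PyTorch is given below; the smoothing constant eps and the uniform weighting over all parameters are illustrative assumptions rather than the paper's exact formulation.

import torch

def log_penalty(model, eps=1e-3):
    # Sparsity-promoting regularizer of the form sum_i log(|w_i| + eps),
    # summed over all trainable parameters of the model.
    return sum(torch.log(p.abs() + eps).sum() for p in model.parameters())

# Usage (hypothetical): loss = data_loss + lam * log_penalty(model)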

Approximation Rates for Neural Networks with General Activation Functions

no code implementations • 4 Apr 2019 • Jonathan W. Siegel, Jinchao Xu

Our first result concerns the rate of approximation of a two-layer neural network with a polynomially decaying non-sigmoidal activation function.
