no code implementations • 2 Oct 2023 • Annan Yu, Arnur Nigmetov, Dmitriy Morozov, Michael W. Mahoney, N. Benjamin Erichson
An example is the structured state space sequence (S4) layer, which parameterizes its state matrix in the diagonal-plus-low-rank (DPLR) form derived from the HiPPO initialization framework.
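As a minimal sketch of the structure mentioned above (assuming the standard HiPPO-LegS matrix from the S4 literature; this is an illustration, not the paper's implementation), the HiPPO matrix becomes a normal matrix after a rank-1 correction, which is what enables the diagonal-plus-low-rank representation:

```python
import numpy as np

def hippo_legs(N):
    """Dense HiPPO-LegS state matrix A (N x N), as used to initialize S4."""
    n = np.arange(N)
    A = -np.sqrt(2 * n[:, None] + 1) * np.sqrt(2 * n[None, :] + 1)
    A = np.tril(A, -1)                # strictly lower-triangular entries
    A += np.diag(-(n + 1.0))          # diagonal entries -(n+1)
    return A

N = 8
A = hippo_legs(N)
P = np.sqrt(np.arange(N) + 0.5)       # rank-1 correction vector

# Adding P P^T makes the matrix normal: S = A + P P^T equals
# -I/2 plus a skew-symmetric matrix.
S = A + np.outer(P, P)
assert np.allclose(S + S.T, -np.eye(N))

# Diagonalizing the normal part gives the DPLR form
# A = V Lambda V^{-1} - P P^T.
Lam, V = np.linalg.eig(S)
A_dplr = V @ np.diag(Lam) @ np.linalg.inv(V) - np.outer(P, P)
assert np.allclose(A_dplr.real, A, atol=1e-6)
```

The point of the decomposition is computational: with the state matrix in DPLR form, the S4 recurrence and its convolution kernel can be evaluated efficiently instead of working with the dense matrix.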
no code implementations • 28 May 2022 • Annan Yu, Yunan Yang, Alex Townsend
Small generalization errors of over-parameterized neural networks (NNs) can be partially explained by the frequency biasing phenomenon, where gradient-based algorithms minimize the low-frequency misfit before reducing the high-frequency residuals.
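The frequency biasing phenomenon can be illustrated with a small, self-contained experiment (a hypothetical random-feature regression, not the paper's setup; the target, feature distribution, and band indices are illustrative assumptions): fit a signal containing one low and one high frequency by gradient descent and inspect the residual energy in each band.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 128, 512                          # samples, random features
x = np.linspace(0, 2 * np.pi, n, endpoint=False)
y = np.sin(x) + np.sin(10 * x)           # low- plus high-frequency target

# Random Fourier features; only the output weights are trained,
# so gradient descent on the MSE is a convex problem.
w = rng.normal(0, 2, size=m)
b = rng.uniform(0, 2 * np.pi, size=m)
Phi = np.cos(np.outer(x, w) + b) / np.sqrt(m)

def band_energy(r, k):
    """Energy of DFT mode k of the residual r."""
    return np.abs(np.fft.rfft(r)[k]) ** 2

theta = np.zeros(m)
lr = 0.1                                 # small enough for stable descent
loss0 = np.mean(y ** 2)                  # loss at initialization
for _ in range(2000):
    r = Phi @ theta - y
    theta -= lr * (2 / n) * (Phi.T @ r)

r = Phi @ theta - y
loss1 = np.mean(r ** 2)
low, high = band_energy(r, 1), band_energy(r, 10)
print(f"loss {loss0:.3f} -> {loss1:.3f}; "
      f"residual energy: low {low:.3f}, high {high:.3f}")
```

Tracking `low` and `high` over the course of training typically shows the low-frequency misfit shrinking first, which is the biasing behavior the abstract refers to.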
no code implementations • 23 Sep 2021 • Annan Yu, Chloé Becquey, Diana Halikias, Matthew Esmaili Mallory, Alex Townsend
Here, we prove that operator NNs of bounded width and arbitrary depth are universal approximators for continuous nonlinear operators.