no code implementations • 13 Apr 2024 • Anastasis Kratsios, Takashi Furuya, J. Antonio Lara B., Matti Lassas, Maarten de Hoop
In this paper, we construct a mixture of neural operators (MoNOs) between function spaces whose complexity is distributed over a network of expert neural operators (NOs), with each NO satisfying parameter scaling restrictions.
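A minimal PyTorch sketch of the mixture idea, for intuition only: the expert and gating modules below (`ExpertNO`, `MixtureOfNOs`) are toy stand-ins acting on grid samples of a function, not the paper's construction.

```python
# Toy mixture-of-experts over "neural operators": a gate softly routes an input
# function (sampled on a grid) to small expert networks and averages outputs.
import torch
import torch.nn as nn

class ExpertNO(nn.Module):
    """Illustrative stand-in for a parameter-restricted expert NO."""
    def __init__(self, grid_size: int, width: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(grid_size, width), nn.GELU(), nn.Linear(width, grid_size)
        )

    def forward(self, u):  # u: (batch, grid_size)
        return self.net(u)

class MixtureOfNOs(nn.Module):
    def __init__(self, grid_size: int, n_experts: int = 4):
        super().__init__()
        self.experts = nn.ModuleList(ExpertNO(grid_size) for _ in range(n_experts))
        self.gate = nn.Linear(grid_size, n_experts)  # learned soft routing

    def forward(self, u):
        w = torch.softmax(self.gate(u), dim=-1)                   # (batch, E)
        outs = torch.stack([e(u) for e in self.experts], dim=-1)  # (batch, grid, E)
        return (outs * w.unsqueeze(1)).sum(-1)

u = torch.randn(8, 64)            # 8 input functions sampled on a 64-point grid
print(MixtureOfNOs(64)(u).shape)  # torch.Size([8, 64])
```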
1 code implementation • 8 Feb 2024 • Anastasis Kratsios, A. Martina Neuman, Gudmund Pammer
Notably, $c_{m}\in \mathcal{O}(\sqrt{m})$ for learning models on discretized Euclidean domains.
no code implementations • 5 Feb 2024 • Haitz Sáez de Ocáriz Borde, Takashi Furuya, Anastasis Kratsios, Marc T. Law
This improves on the optimal bounds for traditional non-distributed deep learning models, namely ReLU MLPs, which require $\mathcal{O}(\varepsilon^{-n/2})$ parameters to achieve the same accuracy.
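A quick numeric check of the quoted rate, purely to show how fast $\varepsilon^{-n/2}$ grows with the dimension $n$ and accuracy $\varepsilon$:

```python
# Parameter counts implied by the O(eps^{-n/2}) rate for a few (n, eps) pairs.
for n in (2, 4, 8):
    for eps in (0.1, 0.01):
        print(f"n={n}, eps={eps}: eps^(-n/2) = {eps ** (-n / 2):,.0f}")
```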
no code implementations • 2 Feb 2024 • Tin Sum Cheng, Aurelien Lucchi, Anastasis Kratsios, David Belius
We derive new bounds for the condition number of kernel matrices, which we then use to enhance existing non-asymptotic test error bounds for kernel ridgeless regression in the over-parameterized regime for a fixed input dimension.
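For intuition, a minimal NumPy sketch of the setting: kernel ridgeless regression interpolates by solving $K\alpha = y$ with no regularization, so the condition number of $K$ directly governs the stability of the fit. The kernel and data below are illustrative.

```python
# Kernel ridgeless (min-norm interpolating) regression with an RBF kernel,
# reporting the condition number of the kernel matrix.
import numpy as np

def rbf_kernel(X, Z, bandwidth=1.0):
    sq = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * bandwidth ** 2))

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))  # n = 50 samples, fixed input dimension d = 3
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(50)

K = rbf_kernel(X, X)
print("condition number of K:", np.linalg.cond(K))
alpha = np.linalg.solve(K, y)     # ridgeless: no ridge term added to K
X_test = rng.standard_normal((5, 3))
y_hat = rbf_kernel(X_test, X) @ alpha  # interpolant evaluated off-sample
```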
no code implementations • 30 Oct 2023 • Blanka Horvath, Anastasis Kratsios, Yannick Limmer, Xuwei Yang
Deep Kalman filters (DKFs) are a class of neural network models that generate Gaussian probability measures from sequential data.
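A minimal sketch of the DKF idea under simplifying assumptions (GRU backbone, diagonal covariance): a recurrent network emits the mean and scale of a Gaussian measure at every time step. The class below is illustrative, not the paper's model.

```python
import torch
import torch.nn as nn

class DeepKalmanFilter(nn.Module):
    def __init__(self, obs_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.rnn = nn.GRU(obs_dim, hidden_dim, batch_first=True)
        self.mean_head = nn.Linear(hidden_dim, obs_dim)
        self.logvar_head = nn.Linear(hidden_dim, obs_dim)

    def forward(self, x):  # x: (batch, time, obs_dim)
        h, _ = self.rnn(x)
        mean = self.mean_head(h)
        std = torch.exp(0.5 * self.logvar_head(h))
        return torch.distributions.Normal(mean, std)  # one Gaussian per step

x = torch.randn(4, 20, 2)          # 4 sequences, 20 steps, 2-dim observations
dist = DeepKalmanFilter(obs_dim=2)(x)
print(dist.mean.shape)             # torch.Size([4, 20, 2])
```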
no code implementations • 23 Oct 2023 • Haitz Sáez de Ocáriz Borde, Anastasis Kratsios
Furthermore, when the latent graph can be represented in the feature space of a sufficiently regular kernel, we show that the combined neural snowflake and MLP encoder does not succumb to the curse of dimensionality, using a number of parameters that is only a low-degree polynomial in the number of nodes.
no code implementations • 2 Oct 2023 • Alexander Kolesov, Petr Mokrov, Igor Udovichenko, Milena Gazdieva, Gudmund Pammer, Anastasis Kratsios, Evgeny Burnaev, Alexander Korotin
Optimal transport (OT) barycenters are a mathematically grounded way of averaging probability distributions while capturing their geometric properties.
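For intuition, a small fixed-support barycenter computed with the POT library's entropic (Sinkhorn) solver; this classical computation illustrates what an OT barycenter is, and is not the estimation method studied in the paper.

```python
# Entropic OT barycenter of two 1-D Gaussians on a shared grid (POT library).
import numpy as np
import ot

grid = np.linspace(0, 1, 100)
M = ot.dist(grid.reshape(-1, 1), grid.reshape(-1, 1))  # squared-Euclidean cost
M /= M.max()

def gaussian(mu, sigma):
    p = np.exp(-((grid - mu) ** 2) / (2 * sigma ** 2))
    return p / p.sum()

A = np.vstack([gaussian(0.25, 0.05), gaussian(0.75, 0.05)]).T  # columns = inputs
bary = ot.bregman.barycenter(A, M, reg=1e-2, weights=np.array([0.5, 0.5]))
print(bary.shape)  # (100,): a distribution averaging the two inputs' geometry
```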
1 code implementation • 8 Sep 2023 • Xuwei Yang, Anastasis Kratsios, Florian Krach, Matheus Grasselli, Aurelien Lucchi
We propose an optimal iterative scheme for federated transfer learning, where a central planner has access to datasets ${\cal D}_1,\dots,{\cal D}_N$ for the same learning model $f_{\theta}$.
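To fix ideas, a generic federated-averaging-style sketch in which a central planner aggregates gradients computed locally on ${\cal D}_1,\dots,{\cal D}_N$ for a shared linear model; this is a baseline illustration only, not the paper's optimal scheme.

```python
# Central planner iteratively averages local gradients from N clients.
import numpy as np

def local_gradient(theta, D):
    X, y = D                               # least-squares model f_theta(x) = x @ theta
    return X.T @ (X @ theta - y) / len(y)

rng = np.random.default_rng(0)
theta_true = np.array([1.0, -2.0])
datasets = []
for _ in range(5):                         # N = 5 client datasets D_1, ..., D_5
    X = rng.standard_normal((100, 2))
    datasets.append((X, X @ theta_true + 0.1 * rng.standard_normal(100)))

theta = np.zeros(2)
for _ in range(200):                       # planner's iterations
    grads = [local_gradient(theta, D) for D in datasets]
    theta -= 0.1 * np.mean(grads, axis=0)  # uniform aggregation step
print(theta)                               # close to [1.0, -2.0]
```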
no code implementations • 18 Aug 2023 • Anastasis Kratsios, Ruiyang Hong, Haitz Sáez de Ocáriz Borde
We find that the network complexity of the HNN implementing the graph representation is independent of the representation fidelity/distortion.
no code implementations • 24 Apr 2023 • Anastasis Kratsios, Chong Liu, Matti Lassas, Maarten V. de Hoop, Ivan Dokmanić
Motivated by the developing mathematics of deep learning, we build universal function approximators of continuous maps between arbitrary Polish metric spaces $\mathcal{X}$ and $\mathcal{Y}$ using elementary functions between Euclidean spaces as building blocks.
no code implementations • 17 Feb 2023 • Anastasis Kratsios, Cody Hyndman
We consider the problem of simultaneously approximating the conditional distribution of market prices and their log returns with a single machine learning model.
1 code implementation • 27 Jan 2023 • J. Antonio Lara Benitez, Takashi Furuya, Florian Faucher, Anastasis Kratsios, Xavier Tricoche, Maarten V. de Hoop
We conclude by proposing a hypernetwork version of the subfamily of NOs as a surrogate model for the aforementioned forward operator.
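A minimal sketch of the hypernetwork pattern itself, assuming nothing about the paper's architecture: one network emits the weights and bias of a small target network, conditioned on an auxiliary input.

```python
import torch
import torch.nn as nn

class HyperNet(nn.Module):
    """Emits the parameters of a one-layer target network from a condition c."""
    def __init__(self, cond_dim: int, in_dim: int, out_dim: int):
        super().__init__()
        self.in_dim, self.out_dim = in_dim, out_dim
        n_params = in_dim * out_dim + out_dim              # weight + bias
        self.emit = nn.Sequential(nn.Linear(cond_dim, 64), nn.ReLU(),
                                  nn.Linear(64, n_params))

    def forward(self, c, x):          # c: (cond_dim,), x: (batch, in_dim)
        p = self.emit(c)
        W = p[: self.in_dim * self.out_dim].view(self.out_dim, self.in_dim)
        b = p[self.in_dim * self.out_dim:]
        return x @ W.T + b            # forward pass of the generated target net

h = HyperNet(cond_dim=3, in_dim=8, out_dim=2)
print(h(torch.randn(3), torch.randn(5, 8)).shape)  # torch.Size([5, 2])
```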
no code implementations • 2 Nov 2022 • Songyan Hou, Parnian Kassraie, Anastasis Kratsios, Andreas Krause, Jonas Rothfuss
Existing generalization bounds fail to explain crucial factors that drive the generalization of modern neural networks.
no code implementations • 24 Oct 2022 • Luca Galimberti, Anastasis Kratsios, Giulia Livieri
Causal operators (COs), such as various solution operators of stochastic differential equations, play a central role in contemporary stochastic analysis; however, there is still no canonical framework for designing Deep Learning (DL) models capable of approximating COs.
1 code implementation • NeurIPS 2023 • Anastasis Kratsios, Valentin Debarnot, Ivan Dokmanić
We derive embedding guarantees for feature maps implemented by small neural networks called \emph{probabilistic transformers}.
no code implementations • 24 Apr 2022 • Anastasis Kratsios, Behnoosh Zamanlooy
Our first main result transcribes this "structured" approximation problem into a universality problem.
no code implementations • 31 Jan 2022 • Beatrice Acciaio, Anastasis Kratsios, Gudmund Pammer
Several problems in stochastic analysis are defined through their geometry, and preserving that geometric structure is essential to generating meaningful predictions.
no code implementations • ICLR 2022 • Anastasis Kratsios, Behnoosh Zamanlooy, Tianlin Liu, Ivan Dokmanić
Many practical problems need the output of a machine learning model to satisfy a set of constraints, $K$.
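One standard way to enforce such constraints, shown here purely for intuition, is to compose the model with a projection onto $K$. Below, $K$ is the probability simplex and the projection is the classical sort-based Euclidean projection; this is one example of a constraint set, not the paper's construction.

```python
# Euclidean projection onto the simplex {x : x >= 0, sum(x) = 1}.
import numpy as np

def project_simplex(v):
    u = np.sort(v)[::-1]                                   # sort descending
    css = np.cumsum(u)
    rho = np.nonzero(u + (1 - css) / np.arange(1, len(v) + 1) > 0)[0][-1]
    tau = (1 - css[rho]) / (rho + 1)
    return np.maximum(v + tau, 0.0)

raw = np.array([0.8, -0.3, 2.1])   # unconstrained model output
x = project_simplex(raw)
print(x, x.sum())                  # nonnegative entries summing to 1
```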
no code implementations • 17 May 2021 • Anastasis Kratsios
The first strategy builds functions in $C(\mathbb{R}^d,\mathcal{P}_1(\mathbb{R}^D))$ which can be efficiently approximated by a PT, uniformly on any given compact subset of $\mathbb{R}^d$.
no code implementations • 13 Jan 2021 • Anastasis Kratsios, Leonie Papon
We show that our GDL models can approximate any continuous target function uniformly on compact sets of a controlled maximum diameter.
no code implementations • 31 Dec 2020 • Philippe Casgrain, Anastasis Kratsios
The need for fast and robust optimization algorithms is of critical importance in all areas of machine learning.
1 code implementation • 29 Oct 2020 • Anastasis Kratsios, Behnoosh Zamanlooy
Most stochastic gradient descent algorithms can optimize neural networks that are sub-differentiable in their parameters; however, this requires the neural network's activation function to exhibit a degree of continuity, which limits the model's uniform approximation capacity to continuous functions.
2 code implementations • 24 Jun 2020 • Anastasis Kratsios, Behnoosh Zamanlooy
The transformed model class, denoted by $\mathscr{F}\text{-tope}$, is shown to be dense in $L^p_{\mu,\text{strict}}(\mathbb{R}^d,\mathbb{R}^D)$, a topological space whose elements are locally $p$-integrable functions and whose topology is much finer than the usual norm topology on $L^p_{\mu}(\mathbb{R}^d,\mathbb{R}^D)$; here $\mu$ is any suitable $\sigma$-finite Borel measure on $\mathbb{R}^d$.
1 code implementation • NeurIPS 2020 • Anastasis Kratsios, Eugene Bilokopytov
Our result is also used to show that the common practice of randomizing all but the last two layers of a DNN produces a universal family of functions with probability one.
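A minimal PyTorch sketch of that practice: keep the randomly initialized layers frozen and train only the final two linear layers.

```python
import torch
import torch.nn as nn

layers = [nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 64), nn.ReLU(),
          nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1)]
model = nn.Sequential(*layers)

# Only the last two linear layers (layers[-3] and layers[-1]) are trainable.
trainable = {id(layers[-1].weight), id(layers[-1].bias),
             id(layers[-3].weight), id(layers[-3].bias)}
for p in model.parameters():
    p.requires_grad_(id(p) in trainable)   # all earlier layers stay random

opt = torch.optim.SGD([p for p in model.parameters() if p.requires_grad], lr=1e-2)
x, y = torch.randn(16, 10), torch.randn(16, 1)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
opt.step()
```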
1 code implementation • 28 Apr 2020 • Calypso Herrera, Florian Krach, Anastasis Kratsios, Pierre Ruyssen, Josef Teichmann
The robust PCA of covariance matrices plays an essential role in isolating key explanatory features.
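For context, the classical principal-component-pursuit formulation of robust PCA, which splits a covariance matrix $\Sigma$ into a low-rank part $L$ and a sparse part $S$ (standard notation, not necessarily the paper's):

$$
\min_{L,\,S}\; \|L\|_{*} + \lambda\,\|S\|_{1}
\quad \text{subject to} \quad L + S = \Sigma .
$$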
no code implementations • 8 Oct 2019 • Anastasis Kratsios
The universal approximation property of various machine learning models is currently only understood on a case-by-case basis, limiting the rapid development of new theoretically justified neural network architectures and blurring our understanding of our current models' potential.
no code implementations • 30 Sep 2019 • Anastasis Kratsios
When the random vector represents the payoff of a derivative security in a complete financial market, its R-conditioning with respect to the risk-neutral measure is interpreted as its risk-averse value.
no code implementations • 31 Aug 2018 • Anastasis Kratsios, Cody Hyndman
We quantify the number of parameters required for this new architecture to memorize any set of input-output pairs while simultaneously fixing every point of the input space lying outside some compact set, and we quantify the size of this set as a function of our model's depth.
no code implementations • 16 Oct 2017 • Anastasis Kratsios, Cody B. Hyndman
A non-Euclidean generalization of conditional expectation is introduced and characterized as the minimizer of expected intrinsic squared-distance from a manifold-valued target.
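Written out, the characterization stated above takes a Fréchet-mean form (with $d_M$ denoting the intrinsic distance on the manifold $M$; the notation here is illustrative):

$$
\mathbb{E}\left[X \mid \mathcal{G}\right]
\;=\; \operatorname*{arg\,min}_{Z\ \mathcal{G}\text{-measurable},\ M\text{-valued}}
\mathbb{E}\!\left[d_{M}^{2}(X,Z)\right].
$$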
1 code implementation • 14 Oct 2017 • Anastasis Kratsios, Cody B. Hyndman
We introduce a regularization approach to arbitrage-free factor-model selection.