no code implementations • 7 Aug 2023 • Hrushikesh Mhaskar, Tong Mao
In this paper, we present a sharper version of the results in the paper "Dimension independent bounds for general shallow networks", Neural Networks, 123 (2020), 142-152.
no code implementations • 6 May 2023 • Hrushikesh Mhaskar
Motivated by applications such as invariant learning, transfer learning, and synthetic aperture radar imaging, we initiate in this paper a general approach to studying the approximation capabilities of kernel-based networks using non-symmetric kernels.
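As an illustrative sketch (the notation here is our own reading of the abstract, not taken from the paper), such a kernel-based network is a linear combination of copies of a kernel that need not be symmetric in its two arguments:
$$
G(x) \;=\; \sum_{k=1}^{n} a_k\, K(x, y_k), \qquad x \in \mathbb{X},\; y_k \in \mathbb{Y},
$$
where, unlike in the classical symmetric setting, one does not require $\mathbb{X} = \mathbb{Y}$ or $K(x,y) = K(y,x)$.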
no code implementations • 2 Mar 2023 • Katarina Doctor, Tong Mao, Hrushikesh Mhaskar
This involves creating a grid on the hypothetical spaces of data sets and algorithms so as to identify a finite set of probability distributions from which the data sets are sampled and a finite set of algorithms.
no code implementations • 13 Feb 2022 • Hrushikesh Mhaskar
We study different smoothness classes for the operators, and also propose a method for approximation of $\mathcal{F}(F)$ using only information in a small neighborhood of $F$, resulting in an effective reduction in the number of parameters involved.
no code implementations • 4 Oct 2021 • Eric Mason, Hrushikesh Mhaskar, Adam Guo
To demonstrate that our methods are agnostic to domain knowledge, we examine a classification problem on a simple video data set.
no code implementations • 29 Sep 2021 • Srinjoy Das, Hrushikesh Mhaskar, Alexander Cloninger
Applications are demonstrated for clustering of synthetic and real-life time series and image data, and the performance of kdiff is compared to competing distance measures for clustering.
no code implementations • 3 Aug 2020 • Alexander Cloninger, Hrushikesh Mhaskar
Our approach is to consider the unknown probability measure as a convex combination of the conditional probabilities for each class.
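A minimal sketch of this decomposition, in our own notation rather than the paper's: if $\mu$ is the data-generating measure and there are $K$ classes with prior proportions $\pi_k$, then
$$
\mu \;=\; \sum_{k=1}^{K} \pi_k\, \mu_k, \qquad \pi_k \ge 0,\quad \sum_{k=1}^{K} \pi_k = 1,
$$
where $\mu_k$ denotes the conditional probability measure of the data given class $k$.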
no code implementations • 1 Aug 2019 • Hrushikesh Mhaskar
Function approximation on this unknown manifold is then a two-stage procedure: first, one approximates the Laplace-Beltrami operator (and its eigendecomposition) on this manifold using a graph Laplacian, and next, approximates the target function using the eigenfunctions.
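A minimal numerical sketch of this two-stage idea, assuming a plain unnormalized graph Laplacian built from a Gaussian-weighted k-nearest-neighbor graph (the paper's actual construction and normalization may differ):

```python
import numpy as np

def graph_laplacian(X, k=10, sigma=1.0):
    """Unnormalized graph Laplacian of a Gaussian-weighted k-NN graph on the points X."""
    n = X.shape[0]
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)  # pairwise squared distances
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]                        # k nearest neighbors, skipping the point itself
        W[i, nbrs] = np.exp(-d2[i, nbrs] / (2 * sigma ** 2))
    W = np.maximum(W, W.T)                                       # symmetrize the weight matrix
    return np.diag(W.sum(axis=1)) - W

def fit_eigenfunction_expansion(X, f_values, num_eigs=20):
    """Stage 1: eigendecompose the graph Laplacian; Stage 2: least-squares fit of f onto the leading eigenvectors."""
    L = graph_laplacian(X)
    eigvals, eigvecs = np.linalg.eigh(L)                         # eigenvalues in ascending order
    Phi = eigvecs[:, :num_eigs]                                  # discrete analogues of Laplace-Beltrami eigenfunctions
    coeffs, *_ = np.linalg.lstsq(Phi, f_values, rcond=None)
    return Phi @ coeffs                                          # approximation of f at the sample points

# Usage: points sampled on a circle (a 1-dimensional manifold in R^2), target f = first coordinate.
theta = np.random.default_rng(0).uniform(0, 2 * np.pi, 200)
X = np.c_[np.cos(theta), np.sin(theta)]
approx = fit_eigenfunction_expansion(X, X[:, 0])
```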
no code implementations • 17 Feb 2018 • Hrushikesh Mhaskar, Tomaso Poggio
We argue that the minimal expected value of the square loss is an inappropriate measure of generalization error for the approximation of compositional functions, if one is to take full advantage of the compositional structure.
no code implementations • 30 Dec 2017 • Tomaso Poggio, Kenji Kawaguchi, Qianli Liao, Brando Miranda, Lorenzo Rosasco, Xavier Boix, Jack Hidary, Hrushikesh Mhaskar
In this note, we show that the dynamics associated with gradient descent minimization of nonlinear networks are topologically equivalent, near the asymptotically stable minima of the empirical error, to a linear gradient system in a quadratic potential with a degenerate (for square loss) or almost degenerate (for logistic or cross-entropy loss) Hessian.
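A one-line sketch of the linearization behind this statement (a standard Taylor expansion, not the paper's derivation): writing the gradient flow as $\dot w = -\nabla \mathcal{L}(w)$ and expanding around a minimum $w^*$ with $\nabla \mathcal{L}(w^*) = 0$,
$$
\dot w \;=\; -\nabla \mathcal{L}(w) \;\approx\; -H\,(w - w^*), \qquad H = \nabla^2 \mathcal{L}(w^*),
$$
which is a linear gradient system in the quadratic potential $\tfrac{1}{2}(w - w^*)^\top H (w - w^*)$; degenerate directions correspond to zero eigenvalues of $H$.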
no code implementations • 2 Nov 2016 • Tomaso Poggio, Hrushikesh Mhaskar, Lorenzo Rosasco, Brando Miranda, Qianli Liao
The paper characterizes classes of functions for which deep learning can be exponentially better than shallow learning.
no code implementations • 10 Aug 2016 • Hrushikesh Mhaskar, Tomaso Poggio
The paper announces new results for a non-smooth activation function, the ReLU function, used in present-day neural networks, as well as for Gaussian networks.
no code implementations • 3 Mar 2016 • Hrushikesh Mhaskar, Qianli Liao, Tomaso Poggio
While the universal approximation property holds for both hierarchical and shallow networks, we prove that deep (hierarchical) networks can approximate the class of compositional functions with the same accuracy as shallow networks, but with an exponentially smaller number of training parameters and VC-dimension.
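As an illustration of the kind of compositional structure meant here (a standard binary-tree example; the specific function below is ours, not quoted from the paper), consider a function of eight variables built from constituent functions of only two variables at a time:
$$
f(x_1,\dots,x_8) \;=\; h_3\bigl(h_{21}\bigl(h_{11}(x_1,x_2),\, h_{12}(x_3,x_4)\bigr),\; h_{22}\bigl(h_{13}(x_5,x_6),\, h_{14}(x_7,x_8)\bigr)\bigr).
$$
A deep network whose architecture mirrors this tree only needs to approximate bivariate constituent functions, which is the intuition behind the exponential gap in the number of parameters.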