Search Results for author: Dmitry Yarotsky

Found 17 papers, 5 papers with code

Generalization error of spectral algorithms

no code implementations 18 Mar 2024 Maksim Velikanov, Maxim Panov, Dmitry Yarotsky

In the present work, we consider the training of kernels with a family of $\textit{spectral algorithms}$ specified by profile $h(\lambda)$, and including KRR and GD as special cases.
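
As a minimal illustration of this setup (a numpy sketch for orientation only, not the paper's code): a spectral algorithm acts on the eigenvalues of the training kernel matrix through the profile h(λ); the profile 1/(λ + μ) recovers kernel ridge regression with ridge μ, and (1 − e^{−tλ})/λ corresponds to gradient descent/flow run for time t (up to the learning-rate convention).

    import numpy as np

    def spectral_predict(K_train, y_train, K_test_train, h):
        # Spectral estimator: alpha = h(K) y, prediction = K_test_train @ alpha,
        # where h acts on the eigenvalues of the training kernel matrix K.
        lam, U = np.linalg.eigh(K_train)
        alpha = U @ (h(lam) * (U.T @ y_train))
        return K_test_train @ alpha

    def krr_profile(lam, mu=1e-2):      # h(lambda) = 1/(lambda + mu)  -> KRR
        return 1.0 / (lam + mu)

    def gd_profile(lam, t=100.0):       # h(lambda) = (1 - exp(-t*lambda))/lambda -> GD/flow
        lam = np.maximum(lam, 1e-12)    # guard tiny or slightly negative eigenvalues
        return (1.0 - np.exp(-t * lam)) / lam

    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=(50, 1)); y = np.sin(3 * X[:, 0])
    Xt = np.linspace(-1, 1, 200)[:, None]
    k = lambda A, B: np.exp(-((A - B.T) ** 2) / 0.1)   # RBF kernel, 1D inputs
    pred_krr = spectral_predict(k(X, X), y, k(Xt, X), krr_profile)
    pred_gd  = spectral_predict(k(X, X), y, k(Xt, X), gd_profile)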

Learning high-dimensional targets by two-parameter models and gradient flow

no code implementations 26 Feb 2024 Dmitry Yarotsky

Our main result shows that if the targets are described by a particular $d$-dimensional probability distribution, then there exist models with as few as two parameters that can learn the targets with arbitrarily high success probability.

Embedded Ensembles: Infinite Width Limit and Operating Regimes

no code implementations 24 Feb 2022 Maksim Velikanov, Roman Kail, Ivan Anokhin, Roman Vashurin, Maxim Panov, Alexey Zaytsev, Dmitry Yarotsky

In this limit, we identify two ensemble regimes - independent and collective - depending on the architecture and initialization strategy of ensemble models.
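
A hedged PyTorch sketch of one common way to embed an ensemble in a single network (a BatchEnsemble-style layer, used only as an illustration; it is not claimed to be the exact family of architectures analyzed in the paper): the members share one weight matrix and differ through cheap per-member rank-1 factors, and the initialization of those factors is the kind of design choice the regimes depend on.

    import torch
    import torch.nn as nn

    class EmbeddedEnsembleLinear(nn.Module):
        # M ensemble members share one weight matrix W and differ only through
        # per-member scaling vectors r_m (inputs) and s_m (outputs).
        def __init__(self, d_in, d_out, n_members):
            super().__init__()
            self.W = nn.Parameter(torch.randn(d_in, d_out) / d_in ** 0.5)
            self.r = nn.Parameter(torch.ones(n_members, d_in))   # per-member input scaling
            self.s = nn.Parameter(torch.ones(n_members, d_out))  # per-member output scaling

        def forward(self, x, m):
            # x: (batch, d_in), m: ensemble member index
            return ((x * self.r[m]) @ self.W) * self.s[m]

    layer = EmbeddedEnsembleLinear(16, 8, n_members=4)
    x = torch.randn(32, 16)
    outs = torch.stack([layer(x, m) for m in range(4)])  # (members, batch, d_out)
    prediction = outs.mean(dim=0)                         # ensemble average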

Tight Convergence Rate Bounds for Optimization Under Power Law Spectral Conditions

no code implementations 2 Feb 2022 Maksim Velikanov, Dmitry Yarotsky

In this paper, we propose a new spectral condition providing tighter upper bounds for problems with power law optimization trajectories.
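
A small numpy experiment illustrating the setting these bounds concern (an illustration, not the paper's code): gradient descent on a quadratic loss whose Hessian eigenvalues follow a power law λ_k ∝ k^(−ν) produces a loss curve that itself decays as a power law in the iteration count.

    import numpy as np

    n, nu, steps = 2000, 1.5, 10_000
    lam = np.arange(1, n + 1, dtype=float) ** -nu    # power-law eigenvalues
    delta = np.ones(n)                               # initial error in the eigenbasis
    eta = 1.0 / lam.max()                            # stable step size
    losses = []
    for _ in range(steps):
        losses.append(0.5 * np.sum(lam * delta ** 2))
        delta *= 1 - eta * lam                       # GD update per eigen-direction

    # Empirical decay exponent from the tail of the loss curve.
    t = np.arange(1, steps + 1)
    slope = np.polyfit(np.log(t[100:]), np.log(np.array(losses)[100:]), 1)[0]
    print(f"empirical decay: loss ~ t^({slope:.2f})")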

Explicit loss asymptotics in the gradient descent training of neural networks

no code implementations NeurIPS 2021 Maksim Velikanov, Dmitry Yarotsky

Current theoretical results on optimization trajectories of neural networks trained by gradient descent typically have the form of rigorous but potentially loose bounds on the loss values.

Universal scaling laws in the gradient descent training of neural networks

no code implementations 2 May 2021 Maksim Velikanov, Dmitry Yarotsky

Current theoretical results on optimization trajectories of neural networks trained by gradient descent typically have the form of rigorous but potentially loose bounds on the loss values.

Elementary superexpressive activations

no code implementations 22 Feb 2021 Dmitry Yarotsky

We call a finite family of activation functions superexpressive if any multivariate continuous function can be approximated by a neural network that uses these activations and has a fixed architecture depending only on the number of input variables (i.e., to achieve any accuracy we only need to adjust the weights, without increasing the number of neurons).
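
A toy PyTorch sketch of the "fixed architecture, adjust weights only" idea (an illustration with an assumed sine activation and an arbitrary small width; it is not a construction from the paper and does not demonstrate superexpressivity): the same architecture is refit to two different targets by changing the weights alone.

    import torch, torch.nn as nn

    class FixedArchSineNet(nn.Module):
        # A fixed 1-input architecture; only its weights change between fits.
        def __init__(self, width=8):
            super().__init__()
            self.l1, self.l2 = nn.Linear(1, width), nn.Linear(width, 1)
        def forward(self, x):
            return self.l2(torch.sin(self.l1(x)))

    def fit(target, steps=2000):
        net = FixedArchSineNet()
        opt = torch.optim.Adam(net.parameters(), lr=1e-2)
        x = torch.linspace(-1, 1, 256).unsqueeze(1)
        for _ in range(steps):
            opt.zero_grad()
            loss = ((net(x) - target(x)) ** 2).mean()
            loss.backward(); opt.step()
        return loss.item()

    print(fit(lambda x: torch.abs(x)), fit(lambda x: torch.cos(4 * x)))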

Low-loss connection of weight vectors: distribution-based approaches

1 code implementation ICML 2020 Ivan Anokhin, Dmitry Yarotsky

Recent research shows that sublevel sets of the loss surfaces of overparameterized networks are connected, exactly or approximately.
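
A generic check of this phenomenon (not the paper's distribution-based construction): evaluate the loss along the straight segment between two trained weight vectors; a bump in the middle indicates that the straight path leaves the sublevel set and a more careful connecting path is needed.

    import copy
    import torch, torch.nn as nn

    def loss_along_segment(model_a, model_b, loss_fn, data, n_points=11):
        # Interpolate parameters (1 - t) * theta_a + t * theta_b and record the loss.
        losses = []
        for t in torch.linspace(0, 1, n_points):
            model_t = copy.deepcopy(model_a)
            with torch.no_grad():
                for p_t, p_a, p_b in zip(model_t.parameters(),
                                         model_a.parameters(),
                                         model_b.parameters()):
                    p_t.copy_((1 - t) * p_a + t * p_b)
            x, y = data
            losses.append(loss_fn(model_t(x), y).item())
        return losses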

The phase diagram of approximation rates for deep neural networks

no code implementations NeurIPS 2020 Dmitry Yarotsky, Anton Zhevnerchuk

We explore the phase diagram of approximation rates for deep neural networks and prove several new theoretical results.

Collective evolution of weights in wide neural networks

no code implementations 9 Oct 2018 Dmitry Yarotsky

We test our general method in the special case of linear free-knot splines, and find good agreement between theory and experiment in observations of global optima, stability of stationary points, and convergence rates.
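
For reference, a linear free-knot spline can be written as a small one-hidden-layer ReLU model whose biases play the role of knots, so gradient descent moves both knots and slopes. A hedged PyTorch sketch of the model class only (not the paper's collective-evolution analysis):

    import torch

    n_knots = 32
    c = (0.1 * torch.randn(n_knots)).requires_grad_(True)      # slopes of the ReLU pieces
    b = torch.linspace(-1, 1, n_knots).requires_grad_(True)    # knot positions

    def spline(x):                       # x: (N, 1)
        return torch.relu(x - b) @ c     # sum_i c_i * relu(x - b_i)

    x = torch.linspace(-1, 1, 200).unsqueeze(1)
    y = torch.sin(3 * x).squeeze(1)
    opt = torch.optim.Adam([c, b], lr=1e-2)
    for _ in range(5000):
        opt.zero_grad()
        loss = ((spline(x) - y) ** 2).mean()
        loss.backward(); opt.step()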

Universal approximations of invariant maps by neural networks

no code implementations 26 Apr 2018 Dmitry Yarotsky

We prove this model to be a universal approximator for continuous SE(2)-equivariant signal transformations.
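
A generic symmetrization sketch in PyTorch (not the paper's construction): averaging any network over a finite group of transformations yields an invariant map; here the group is the four planar rotations of a square image, a discrete stand-in used only for illustration.

    import torch, torch.nn as nn

    class RotationAveraged(nn.Module):
        # Average the wrapped network over 90-degree rotations of the input,
        # which makes the output invariant to those rotations.
        def __init__(self, net):
            super().__init__()
            self.net = net
        def forward(self, img):  # img: (batch, channels, H, W)
            return torch.stack([self.net(torch.rot90(img, k, dims=(-2, -1)))
                                for k in range(4)]).mean(0)

    base = nn.Sequential(nn.Flatten(), nn.Linear(1 * 8 * 8, 16), nn.ReLU(), nn.Linear(16, 1))
    model = RotationAveraged(base)
    x = torch.randn(2, 1, 8, 8)
    print(torch.allclose(model(x), model(torch.rot90(x, 1, dims=(-2, -1))), atol=1e-5))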

Optimal approximation of continuous functions by very deep ReLU networks

no code implementations 10 Feb 2018 Dmitry Yarotsky

We consider approximations of general continuous functions on finite-dimensional cubes by general deep ReLU neural networks and study the approximation rates with respect to the modulus of continuity of the function and the total number of weights $W$ in the network.

Quantified advantage of discontinuous weight selection in approximations with deep neural networks

no code implementations 3 May 2017 Dmitry Yarotsky

We consider approximations of 1D Lipschitz functions by deep ReLU networks of a fixed width.

Geometric features for voxel-based surface recognition

1 code implementation 16 Jan 2017 Dmitry Yarotsky

We introduce a library of geometric voxel features for CAD surface recognition/retrieval tasks.
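
A hedged numpy sketch of the general idea (an illustration, not the library's API): rasterize a point-sampled surface onto a voxel grid and read off simple per-voxel geometric features, such as occupancy and the offset of the local centroid from the cell center.

    import numpy as np

    def voxel_features(points, grid=16):
        # points: (N, 3) array with coordinates in [0, 1)^3.
        idx = np.clip((points * grid).astype(int), 0, grid - 1)
        occupancy = np.zeros((grid, grid, grid))
        centroid_sum = np.zeros((grid, grid, grid, 3))
        for p, (i, j, k) in zip(points, idx):
            occupancy[i, j, k] += 1
            centroid_sum[i, j, k] += p
        # Offset of the local centroid from the voxel center (zero for empty voxels).
        offset = np.zeros_like(centroid_sum)
        mask = occupancy > 0
        centers = (np.stack(np.meshgrid(*[np.arange(grid)] * 3, indexing="ij"), -1) + 0.5) / grid
        offset[mask] = centroid_sum[mask] / occupancy[mask][:, None] - centers[mask]
        return np.concatenate([occupancy[..., None], offset], axis=-1)  # (grid, grid, grid, 4)

    feats = voxel_features(np.random.rand(1000, 3))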

Tasks: General Classification, Retrieval

Error bounds for approximations with deep ReLU networks

2 code implementations 3 Oct 2016 Dmitry Yarotsky

We study the expressive power of shallow and deep neural networks with piecewise linear activation functions.
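
An empirical companion to the question these bounds address (an illustration, not a construction from the paper): fit deep ReLU networks of increasing size to a fixed smooth function and record how the achieved error scales with the total number of weights W.

    import torch, torch.nn as nn

    def relu_net(width, depth):
        # Fully connected ReLU network: 1 input, 1 output.
        layers, d = [], 1
        for _ in range(depth):
            layers += [nn.Linear(d, width), nn.ReLU()]
            d = width
        layers.append(nn.Linear(d, 1))
        return nn.Sequential(*layers)

    x = torch.linspace(0, 1, 512).unsqueeze(1)
    y = torch.sin(6 * x)
    for width in (4, 8, 16, 32):
        net = relu_net(width, depth=4)
        opt = torch.optim.Adam(net.parameters(), lr=1e-3)
        for _ in range(3000):
            opt.zero_grad()
            loss = ((net(x) - y) ** 2).mean()
            loss.backward(); opt.step()
        W = sum(p.numel() for p in net.parameters())
        print(f"W={W:5d}  sup-error~{(net(x) - y).abs().max().item():.4f}")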
