1 code implementation • 12 Jul 2022 • Dan Mikulincer, Daniel Reichman
Our first result establishes that every monotone function over $[0, 1]^d$ can be approximated within arbitrarily small additive error by a depth-4 monotone network.
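As a minimal illustrative sketch (not the paper's construction), the snippet below builds a small feedforward network in which every weight is constrained to be nonnegative and the activation is ReLU, which is one standard way to guarantee that the network computes a coordinate-wise monotone function on $[0, 1]^d$; all sizes and the random initialization are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, width = 3, 16

# Nonnegative weights at every layer plus a monotone activation (ReLU)
# make the whole network monotone nondecreasing in each input coordinate.
Ws = [np.abs(rng.normal(size=(width, d)))] + \
     [np.abs(rng.normal(size=(width, width))) for _ in range(2)]
bs = [rng.normal(size=width) for _ in range(3)]
w_out = np.abs(rng.normal(size=width))  # nonnegative output weights

def monotone_net(x):
    h = np.asarray(x, dtype=float)
    for W, b in zip(Ws, bs):
        h = np.maximum(W @ h + b, 0.0)
    return float(w_out @ h)

# Sanity check: increasing any coordinate never decreases the output.
x = rng.uniform(size=d)
y = x.copy(); y[0] += 0.1
assert monotone_net(y) >= monotone_net(x)
```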
no code implementations • 17 Feb 2021 • Ronen Eldan, Dan Mikulincer, Tselil Schramm
We study the extent to which wide neural networks may be approximated by Gaussian processes when initialized with random weights.
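As a hedged numerical illustration of this phenomenon (not the paper's argument), the sketch below compares the empirical output covariance of many randomly initialized, wide one-hidden-layer ReLU networks at two fixed inputs against the covariance of the corresponding limiting Gaussian process (the arc-cosine kernel); the scalings, widths, and sample counts are assumptions chosen for the demo.

```python
import numpy as np

rng = np.random.default_rng(1)
d, width, n_nets = 5, 2000, 500

# Two fixed inputs at which we probe the network's joint output distribution.
x1, x2 = rng.normal(size=d), rng.normal(size=d)

def relu_net_output(x, W, v):
    # One hidden layer with 1/sqrt(width) output scaling.
    return (v @ np.maximum(W @ x, 0.0)) / np.sqrt(width)

outs = []
for _ in range(n_nets):
    W = rng.normal(size=(width, d)) / np.sqrt(d)   # random input weights
    v = rng.normal(size=width)                      # random output weights
    outs.append([relu_net_output(x1, W, v), relu_net_output(x2, W, v)])
outs = np.array(outs)

def arccos_kernel(a, b):
    # Covariance of the limiting Gaussian process for ReLU networks
    # with the weight scaling used above.
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    cos = np.clip(a @ b / (na * nb), -1.0, 1.0)
    theta = np.arccos(cos)
    return (na * nb / (2 * np.pi * d)) * (np.sin(theta) + (np.pi - theta) * cos)

print("empirical cov:\n", np.cov(outs.T))
print("GP kernel:\n", np.array([[arccos_kernel(x1, x1), arccos_kernel(x1, x2)],
                                [arccos_kernel(x2, x1), arccos_kernel(x2, x2)]]))
```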
no code implementations • NeurIPS 2020 • Sébastien Bubeck, Ronen Eldan, Yin Tat Lee, Dan Mikulincer
In contrast, we propose a new training procedure for ReLU networks, based on complex (as opposed to real) recombination of the neurons, for which we show approximate memorization with both $O\left(\frac{n}{d} \cdot \frac{\log(1/\epsilon)}{\epsilon}\right)$ neurons and nearly-optimal size of the weights.
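The complex-recombination procedure itself is not reproduced here; purely as a generic point of reference, the sketch below memorizes $n$ random labeled points with an over-parameterized two-layer ReLU network trained by plain gradient descent, with all sizes, the step size, and the training loop being illustrative assumptions rather than anything from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, width, lr, steps = 20, 5, 200, 0.05, 5000

# Random unit-norm inputs and +/-1 labels to be memorized.
X = rng.normal(size=(n, d)); X /= np.linalg.norm(X, axis=1, keepdims=True)
y = rng.choice([-1.0, 1.0], size=n)

W = rng.normal(size=(width, d)) / np.sqrt(d)
v = rng.normal(size=width) / np.sqrt(width)

for _ in range(steps):
    H = np.maximum(X @ W.T, 0.0)          # hidden activations, shape (n, width)
    pred = H @ v                           # network outputs
    err = pred - y                         # squared-loss residuals
    grad_v = H.T @ err / n
    grad_W = ((err[:, None] * (H > 0)) * v).T @ X / n
    v -= lr * grad_v
    W -= lr * grad_W

print("max fitting error:", np.max(np.abs(np.maximum(X @ W.T, 0.0) @ v - y)))
```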
no code implementations • 28 Jun 2020 • Ronen Eldan, Dan Mikulincer, Hester Pieters
We take the first steps towards generalizing the theory of stochastic block models, in the sparse regime, to a model where the discrete community structure is replaced by an underlying geometry.
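Purely as an illustrative toy (not the model analyzed in the paper), the sketch below samples a sparse random graph in which each vertex carries a latent position on the unit sphere and edge probabilities depend on the inner product of the endpoints' positions, so that an underlying geometry plays the role that discrete community labels play in a stochastic block model; the kernel and all parameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d, avg_deg = 1000, 3, 5.0

# Latent geometry: each vertex gets a uniform point on the unit sphere S^{d-1}.
Z = rng.normal(size=(n, d))
Z /= np.linalg.norm(Z, axis=1, keepdims=True)

def kernel(t):
    # Simple affine connection kernel, nonnegative for t in [-1, 1].
    return 1.0 + t

# Sparse regime: edge probabilities scale like avg_deg / n.
P = (avg_deg / n) * kernel(Z @ Z.T)
np.fill_diagonal(P, 0.0)
A = (rng.uniform(size=(n, n)) < np.triu(P, 1)).astype(int)
A = A + A.T  # symmetrize: undirected graph

print("average degree:", A.sum() / n)
```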
no code implementations • 9 Jan 2020 • Sébastien Bubeck, Dan Mikulincer
This viewpoint was explored in 1993 by Vavasis, who proposed an algorithm which, for any fixed finite dimension $d$, improves upon the $O(1/\varepsilon^2)$ oracle complexity of gradient descent.
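For context on that baseline (a hedged sketch, not Vavasis' algorithm or the paper's), the snippet below runs plain gradient descent with step size $1/L$ on an illustrative smooth nonconvex function until the gradient norm drops below $\varepsilon$, counting gradient-oracle queries; the classical analysis bounds this count by $O(1/\varepsilon^2)$ up to problem-dependent constants. The test function, smoothness constant, and starting point are assumptions.

```python
import numpy as np

def grad_descent_to_stationarity(grad, x0, L, eps, max_iters=100000):
    """Plain gradient descent with step size 1/L until ||grad f(x)|| <= eps.

    For an L-smooth function, the classical bound on the number of gradient
    queries needed is of order 1/eps**2 (up to problem-dependent constants),
    the rate that Vavasis-style algorithms improve on in fixed dimension.
    """
    x = np.array(x0, dtype=float)
    for t in range(1, max_iters + 1):
        g = grad(x)                 # one gradient-oracle query
        if np.linalg.norm(g) <= eps:
            return x, t
        x -= g / L
    return x, max_iters

# Gradient of the illustrative smooth nonconvex function
# f(x) = sin(x0) + 0.1*x0**2 - cos(x1**2) + 0.1*x1**2.
def grad(x):
    return np.array([np.cos(x[0]) + 0.2 * x[0],
                     2 * x[1] * np.sin(x[1] ** 2) + 0.2 * x[1]])

x_star, queries = grad_descent_to_stationarity(grad, x0=[2.5, 1.5], L=10.0, eps=1e-3)
print("approx. stationary point:", x_star, "after", queries, "gradient queries")
```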