1 code implementation • 12 Jul 2022 • Dan Mikulincer, Daniel Reichman
Our first result establishes that every monotone function over $[0, 1]^d$ can be approximated within arbitrarily small additive error by a depth-4 monotone network.
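As a minimal illustrative sketch (not the paper's construction), the snippet below builds a small feedforward network in which every weight is constrained to be nonnegative and the activation is ReLU, which is one standard way to guarantee that the network computes a coordinate-wise monotone function on $[0, 1]^d$; all sizes and the random initialization are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, width = 3, 16

# Nonnegative weights at every layer plus a monotone activation (ReLU)
# make the whole network monotone nondecreasing in each input coordinate.
Ws = [np.abs(rng.normal(size=(width, d)))] + \
     [np.abs(rng.normal(size=(width, width))) for _ in range(2)]
bs = [rng.normal(size=width) for _ in range(3)]
w_out = np.abs(rng.normal(size=width))  # nonnegative output weights

def monotone_net(x):
    h = np.asarray(x, dtype=float)
    for W, b in zip(Ws, bs):
        h = np.maximum(W @ h + b, 0.0)
    return float(w_out @ h)

# Sanity check: increasing any coordinate never decreases the output.
x = rng.uniform(size=d)
y = x.copy(); y[0] += 0.1
assert monotone_net(y) >= monotone_net(x)
```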
no code implementations • 17 Feb 2021 • Ronen Eldan, Dan Mikulincer, Tselil Schramm
We study the extent to which wide neural networks may be approximated by Gaussian processes when initialized with random weights.
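As a hedged numerical illustration of this phenomenon (not the paper's argument), the sketch below compares the empirical output covariance of many randomly initialized, wide one-hidden-layer ReLU networks at two fixed inputs against the covariance of the corresponding limiting Gaussian process (the arc-cosine kernel); the scalings, widths, and sample counts are assumptions chosen for the demo.

```python
import numpy as np

rng = np.random.default_rng(1)
d, width, n_nets = 5, 2000, 500

# Two fixed inputs at which we probe the network's joint output distribution.
x1, x2 = rng.normal(size=d), rng.normal(size=d)

def relu_net_output(x, W, v):
    # One hidden layer with 1/sqrt(width) output scaling.
    return (v @ np.maximum(W @ x, 0.0)) / np.sqrt(width)

outs = []
for _ in range(n_nets):
    W = rng.normal(size=(width, d)) / np.sqrt(d)   # random input weights
    v = rng.normal(size=width)                      # random output weights
    outs.append([relu_net_output(x1, W, v), relu_net_output(x2, W, v)])
outs = np.array(outs)

def arccos_kernel(a, b):
    # Covariance of the limiting Gaussian process for ReLU networks
    # with the weight scaling used above.
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    cos = np.clip(a @ b / (na * nb), -1.0, 1.0)
    theta = np.arccos(cos)
    return (na * nb / (2 * np.pi * d)) * (np.sin(theta) + (np.pi - theta) * cos)

print("empirical cov:\n", np.cov(outs.T))
print("GP kernel:\n", np.array([[arccos_kernel(x1, x1), arccos_kernel(x1, x2)],
                                [arccos_kernel(x2, x1), arccos_kernel(x2, x2)]]))
```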
no code implementations • NeurIPS 2020 • Sébastien Bubeck, Ronen Eldan, Yin Tat Lee, Dan Mikulincer
In contrast, we propose a new training procedure for ReLU networks, based on complex (as opposed to real) recombination of the neurons, for which we show approximate memorization with both $O\left(\frac{n}{d} \cdot \frac{\log(1/\epsilon)}{\epsilon}\right)$ neurons and nearly-optimal size of the weights.
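The complex-recombination procedure itself is not reproduced here; purely as a generic point of reference, the sketch below memorizes $n$ random labeled points with an over-parameterized two-layer ReLU network trained by plain gradient descent, with all sizes, the step size, and the training loop being illustrative assumptions rather than anything from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, width, lr, steps = 20, 5, 200, 0.05, 5000

# Random unit-norm inputs and +/-1 labels to be memorized.
X = rng.normal(size=(n, d)); X /= np.linalg.norm(X, axis=1, keepdims=True)
y = rng.choice([-1.0, 1.0], size=n)

W = rng.normal(size=(width, d)) / np.sqrt(d)
v = rng.normal(size=width) / np.sqrt(width)

for _ in range(steps):
    H = np.maximum(X @ W.T, 0.0)          # hidden activations, shape (n, width)
    pred = H @ v                           # network outputs
    err = pred - y                         # squared-loss residuals
    grad_v = H.T @ err / n
    grad_W = ((err[:, None] * (H > 0)) * v).T @ X / n
    v -= lr * grad_v
    W -= lr * grad_W

print("max fitting error:", np.max(np.abs(np.maximum(X @ W.T, 0.0) @ v - y)))
```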
no code implementations • 28 Jun 2020 • Ronen Eldan, Dan Mikulincer, Hester Pieters
We take the first steps towards generalizing the theory of stochastic block models, in the sparse regime, to a model where the discrete community structure is replaced by an underlying geometry.
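Purely as an illustrative toy (not the model analyzed in the paper), the sketch below samples a sparse random graph in which each vertex carries a latent position on the unit sphere and edge probabilities depend on the inner product of the endpoints' positions, so that an underlying geometry plays the role that discrete community labels play in a stochastic block model; the kernel and all parameters are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d, avg_deg = 1000, 3, 5.0

# Latent geometry: each vertex gets a uniform point on the unit sphere S^{d-1}.
Z = rng.normal(size=(n, d))
Z /= np.linalg.norm(Z, axis=1, keepdims=True)

def kernel(t):
    # Simple affine connection kernel, nonnegative for t in [-1, 1].
    return 1.0 + t

# Sparse regime: edge probabilities scale like avg_deg / n.
P = (avg_deg / n) * kernel(Z @ Z.T)
np.fill_diagonal(P, 0.0)
A = (rng.uniform(size=(n, n)) < np.triu(P, 1)).astype(int)
A = A + A.T  # symmetrize: undirected graph

print("average degree:", A.sum() / n)
```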
no code implementations • 9 Jan 2020 • Sébastien Bubeck, Dan Mikulincer
This viewpoint was explored in 1993 by Vavasis, who proposed an algorithm which, for any fixed finite dimension $d$, improves upon the $O(1/\varepsilon^2)$ oracle complexity of gradient descent.
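For context on that baseline (a hedged sketch, not Vavasis' algorithm or the paper's), the snippet below runs plain gradient descent with step size $1/L$ on an illustrative smooth nonconvex function until the gradient norm drops below $\varepsilon$, counting gradient-oracle queries; the classical analysis bounds this count by $O(1/\varepsilon^2)$ up to problem-dependent constants. The test function, smoothness constant, and starting point are assumptions.

```python
import numpy as np

def grad_descent_to_stationarity(grad, x0, L, eps, max_iters=100000):
    """Plain gradient descent with step size 1/L until ||grad f(x)|| <= eps.

    For an L-smooth function, the classical bound on the number of gradient
    queries needed is of order 1/eps**2 (up to problem-dependent constants),
    the rate that Vavasis-style algorithms improve on in fixed dimension.
    """
    x = np.array(x0, dtype=float)
    for t in range(1, max_iters + 1):
        g = grad(x)                 # one gradient-oracle query
        if np.linalg.norm(g) <= eps:
            return x, t
        x -= g / L
    return x, max_iters

# Gradient of the illustrative smooth nonconvex function
# f(x) = sin(x0) + 0.1*x0**2 - cos(x1**2) + 0.1*x1**2.
def grad(x):
    return np.array([np.cos(x[0]) + 0.2 * x[0],
                     2 * x[1] * np.sin(x[1] ** 2) + 0.2 * x[1]])

x_star, queries = grad_descent_to_stationarity(grad, x0=[2.5, 1.5], L=10.0, eps=1e-3)
print("approx. stationary point:", x_star, "after", queries, "gradient queries")
```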