no code implementations • 11 Feb 2024 • Itay Safran, Daniel Reichman, Paul Valiant
We prove an exponential separation between depth 2 and depth 3 neural networks, when approximating an $\mathcal{O}(1)$-Lipschitz target function to constant accuracy, with respect to a distribution with support in $[0, 1]^{d}$, assuming exponentially bounded weights.
no code implementations • 18 Jul 2023 • Itay Safran, Daniel Reichman, Paul Valiant
Our depth separation results are facilitated by a new lower bound for depth 2 networks approximating the maximum function over the uniform distribution, assuming an exponential upper bound on the size of the weights.
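To see informally why depth helps here, note the exact identity $\max\{a, b\} = a + \max\{0, b - a\}$, which lets a ReLU network of depth roughly $\log_2 d$ compute the maximum of $d$ inputs exactly via pairwise maxima. The Python sketch below illustrates this folklore construction; it is not the paper's construction, and the lower bound concerns what depth $2$ networks with bounded weights cannot do.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def pairwise_max(a, b):
    # Exact identity: max(a, b) = a + ReLU(b - a), i.e. one ReLU layer.
    return a + relu(b - a)

def deep_relu_max(x):
    # Compute max(x_1, ..., x_d) exactly with about log2(d) ReLU layers
    # by repeatedly taking pairwise maxima (a toy illustration only).
    vals = list(x)
    while len(vals) > 1:
        nxt = [pairwise_max(vals[i], vals[i + 1]) for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2 == 1:  # carry the unpaired element to the next layer
            nxt.append(vals[-1])
        vals = nxt
    return vals[0]

x = np.random.rand(8)                # a point in [0, 1]^d with d = 8
print(deep_relu_max(x), x.max())     # the two values coincide
```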
no code implementations • 18 May 2022 • Itay Safran, Gal Vardi, Jason D. Lee
We study the dynamics and implicit bias of gradient flow (GF) on univariate ReLU neural networks with a single hidden layer in a binary classification setting.
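As a concrete illustration of this setting (not of the paper's analysis), the sketch below trains a one-hidden-layer univariate ReLU network on binary labels with the logistic loss, using gradient descent with a small step size as a crude proxy for gradient flow; the data, width, step size, and iteration count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Univariate inputs with labels in {-1, +1} (illustrative data).
x = rng.uniform(-1.0, 1.0, size=20)
y = np.where(x >= 0, 1.0, -1.0)

k = 10                                        # hidden width
w = rng.normal(scale=0.1, size=k)             # input weights
b = rng.normal(scale=0.1, size=k)             # biases
v = rng.normal(scale=0.1, size=k)             # output weights

eta = 1e-3                                    # small step, mimicking gradient flow
for _ in range(20000):
    pre = np.outer(x, w) + b                  # (n, k) pre-activations
    act = np.maximum(pre, 0.0)                # ReLU activations
    f = act @ v                               # network outputs f(x_i)
    # Logistic loss: (1/n) * sum_i log(1 + exp(-y_i * f(x_i))).
    g = -y / (1.0 + np.exp(y * f)) / len(x)   # dLoss / df_i
    mask = (pre > 0).astype(float)
    grad_v = act.T @ g
    grad_w = v * (mask.T @ (g * x))
    grad_b = v * (mask.T @ g)
    w, b, v = w - eta * grad_w, b - eta * grad_b, v - eta * grad_v

f_final = np.maximum(np.outer(x, w) + b, 0.0) @ v
print("final logistic loss:", np.mean(np.log1p(np.exp(-y * f_final))))
```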
no code implementations • 4 Dec 2021 • Itay Safran, Jason D. Lee
Depth separation results offer a possible theoretical explanation for the benefits of deep neural networks over shallower architectures, by establishing that the former possess superior approximation capabilities.
1 code implementation • NeurIPS 2021 • Itay Safran, Ohad Shamir
Perhaps surprisingly, we prove that when the condition number is taken into account, without-replacement SGD \emph{does not} significantly improve on with-replacement SGD in terms of worst-case bounds, unless the number of epochs (passes over the data) is larger than the condition number.
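The two sampling schemes can be contrasted on a toy problem; the sketch below only makes the distinction concrete, and its least-squares instance, step size, and epoch count are illustrative rather than the paper's worst-case constructions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares objective: (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2.
n, d = 200, 20
A = rng.normal(size=(n, d))
x_star = rng.normal(size=d)
b = A @ x_star

def sgd(sample_indices, epochs=50, eta=1e-2):
    x = np.zeros(d)
    for _ in range(epochs):
        for i in sample_indices():            # one epoch = n stochastic steps
            g = (A[i] @ x - b[i]) * A[i]      # gradient of the i-th summand
            x -= eta * g
    return np.linalg.norm(x - x_star)

# With replacement: each step draws an index uniformly at random.
dist_with = sgd(lambda: rng.integers(0, n, size=n))
# Without replacement (random reshuffling): a fresh permutation every epoch.
dist_without = sgd(lambda: rng.permutation(n))

print("with replacement:   ", dist_with)
print("without replacement:", dist_without)
```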
1 code implementation • 1 Jun 2020 • Itay Safran, Gilad Yehudai, Ohad Shamir
We prove that while the objective is strongly convex around the global minima when the teacher and student networks possess the same number of neurons, it is not even \emph{locally convex} after any amount of over-parameterization.
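To make the teacher-student setup concrete, the following sketch writes down a sampled proxy for the objective, with both networks of the form $\mathbf{x} \mapsto \sum_{i} \max\{0, \mathbf{w}_i^\top \mathbf{x}\}$ and a student with more neurons than the teacher; the dimensions, widths, and Gaussian inputs are illustrative assumptions and may differ from the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, m = 5, 3, 6                 # input dim, teacher width, over-parameterized student width
W_teacher = rng.normal(size=(k, d))

def net(W, X):
    # f_W(x) = sum_i ReLU(w_i^T x)
    return np.maximum(X @ W.T, 0.0).sum(axis=1)

def objective(W_student, X):
    # Sampled proxy for E_x [ (f_student(x) - f_teacher(x))^2 ] over Gaussian inputs.
    return np.mean((net(W_student, X) - net(W_teacher, X)) ** 2)

X = rng.normal(size=(10000, d))
# One global minimum of the over-parameterized objective: copy the teacher's
# neurons and pad with zero neurons.  The paper shows the landscape around
# such minima need not even be locally convex once the student is wider.
W_student = np.vstack([W_teacher, np.zeros((m - k, d))])
print(objective(W_student, X))    # exactly zero
```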
no code implementations • 31 Jul 2019 • Itay Safran, Ohad Shamir
In contrast to the majority of existing theoretical works, which assume that individual functions are sampled with replacement, we focus here on popular but poorly understood heuristics, which involve going over random permutations of the individual functions.
no code implementations • 15 Apr 2019 • Itay Safran, Ronen Eldan, Ohad Shamir
Existing depth separation results for constant-depth networks essentially show that certain radial functions in $\mathbb{R}^d$, which can be easily approximated with depth $3$ networks, cannot be approximated by depth $2$ networks, even up to constant accuracy, unless their size is exponential in $d$.
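The depth $3$ upper bound exploits the fact that a radial function depends on its input only through $\|\mathbf{x}\|^2 = \sum_i x_i^2$: one hidden layer can approximate each $x_i^2$ by a piecewise-linear combination of ReLUs, and a second hidden layer can then approximate the univariate radial profile. The sketch below implements this idea for an illustrative profile; it is a simplified illustration, not the papers' exact construction or error analysis.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def pwl_relu_coeffs(h, knots):
    # Return (c0, knots[:-1], deltas) such that
    #   h(t) ~ c0 + sum_j deltas[j] * relu(t - knots[j])
    # is the piecewise-linear interpolant of h at the given knots.
    vals = h(knots)
    slopes = np.diff(vals) / np.diff(knots)
    deltas = np.diff(np.concatenate(([0.0], slopes)))
    return vals[0], knots[:-1], deltas

d = 4
g = np.cos                                     # illustrative radial profile: f(x) = cos(||x||^2)

# Layer 1 approximates t -> t^2 on [-1, 1]; layer 2 approximates g on [0, d].
c0_sq, knots_sq, deltas_sq = pwl_relu_coeffs(lambda t: t ** 2, np.linspace(-1, 1, 41))
c0_g, knots_g, deltas_g = pwl_relu_coeffs(g, np.linspace(0, d, 81))

def depth3_radial(X):
    # First hidden layer: one group of ReLU units per coordinate, summed linearly
    # to produce r(x) ~ ||x||^2.
    r = d * c0_sq + sum(deltas_sq @ relu(X[:, i:i + 1] - knots_sq).T for i in range(d))
    # Second hidden layer: ReLU units applied to the scalar r, approximating g(r).
    return c0_g + deltas_g @ relu(r[:, None] - knots_g).T

X = np.random.uniform(-1, 1, size=(5, d))
print(depth3_radial(X))
print(g((X ** 2).sum(axis=1)))                 # close to the network's outputs
```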
no code implementations • 30 Jan 2019 • Adi Shamir, Itay Safran, Eyal Ronen, Orr Dunkelman
The existence of adversarial examples, in which an imperceptible change in the input can fool well-trained neural networks, was experimentally discovered by Szegedy et al. in 2013 in a paper titled "Intriguing properties of neural networks".
1 code implementation • ICML 2018 • Itay Safran, Ohad Shamir
We consider the optimization problem associated with training simple ReLU neural networks of the form $\mathbf{x}\mapsto \sum_{i=1}^{k}\max\{0,\mathbf{w}_i^\top \mathbf{x}\}$ with respect to the squared loss.
no code implementations • ICML 2017 • Itay Safran, Ohad Shamir
We provide several new depth-based separation results for feed-forward neural networks, proving that various types of simple and natural functions can be better approximated using deeper networks than shallower ones, even if the shallower networks are much larger.
no code implementations • 13 Nov 2015 • Itay Safran, Ohad Shamir
Deep learning, in the form of artificial neural networks, has achieved remarkable practical success in recent years across a variety of difficult machine learning applications.