Search Results for author: Gilad Yehudai

Found 16 papers, 2 papers with code

RedEx: Beyond Fixed Representation Methods via Convex Optimization

no code implementations 15 Jan 2024 Amit Daniely, Mariano Schain, Gilad Yehudai

Optimizing neural networks is a difficult task that is still not well understood.

Locally Optimal Descent for Dynamic Stepsize Scheduling

no code implementations 23 Nov 2023 Gilad Yehudai, Alon Cohen, Amit Daniely, Yoel Drori, Tomer Koren, Mariano Schain

We introduce a novel, theoretically grounded dynamic learning-rate scheduling scheme, with the goal of simplifying the manual and time-consuming tuning of schedules in practice.

Scheduling · Stochastic Optimization
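
The snippet above gives no algorithmic detail, so as a generic, hedged illustration of what a dynamic (per-step) stepsize rule looks like in code, here is a standard Armijo backtracking line search on a toy quadratic. This is not the paper's locally-optimal scheme; the objective, constants, and iteration budget are all illustrative assumptions.

```python
# Generic dynamic stepsize rule (NOT the paper's scheme): at each step, shrink
# the stepsize by backtracking until the Armijo sufficient-decrease condition holds.
import numpy as np

def backtracking_gd(f, grad_f, x0, eta0=1.0, beta=0.5, c=1e-4, steps=100):
    x = x0.copy()
    for _ in range(steps):
        g = grad_f(x)
        eta = eta0
        # shrink eta until f decreases enough along the negative gradient
        while f(x - eta * g) > f(x) - c * eta * np.dot(g, g):
            eta *= beta
        x = x - eta * g
    return x

# toy ill-conditioned quadratic: f(x) = 0.5 * x^T A x
A = np.diag([1.0, 10.0])
f = lambda x: 0.5 * x @ A @ x
grad_f = lambda x: A @ x
print(backtracking_gd(f, grad_f, np.array([1.0, 1.0])))
```

Each iteration picks its own stepsize from the current iterate, which is the defining feature of a dynamic schedule as opposed to a fixed, pre-specified one.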

From Tempered to Benign Overfitting in ReLU Neural Networks

no code implementations NeurIPS 2023 Guy Kornowski, Gilad Yehudai, Ohad Shamir

Thus, we show that the input dimension plays a crucial role in the type of overfitting in this setting, which we also validate empirically for intermediate dimensions.

Reconstructing Training Data from Multiclass Neural Networks

no code implementations 5 May 2023 Gon Buzaglo, Niv Haim, Gilad Yehudai, Gal Vardi, Michal Irani

Reconstructing samples from the training set of trained neural networks is a major privacy concern.

Binary Classification

Reconstructing Training Data from Trained Neural Networks

1 code implementation 15 Jun 2022 Niv Haim, Gal Vardi, Gilad Yehudai, Ohad Shamir, Michal Irani

We propose a novel reconstruction scheme that stems from recent theoretical results about the implicit bias in training neural networks with gradient-based methods.
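
As a hedged sketch of the general idea (not the authors' exact pipeline or hyperparameters): under the assumptions of the implicit-bias literature, a network trained to convergence on binary data approximately satisfies a KKT stationarity condition $\theta \approx \sum_i \lambda_i y_i \nabla_\theta f(\theta; x_i)$ with $\lambda_i \ge 0$, so one can search for candidate training points by minimizing the residual of this condition over $x_i$ and $\lambda_i$. The tiny untrained MLP and all sizes below are illustrative placeholders.

```python
# Hedged sketch of reconstruction via the KKT stationarity residual.
import torch
import torch.nn as nn

torch.manual_seed(0)
d, m = 10, 20                                    # input dim, number of candidates

model = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, 1))
# In practice `model` would be a network trained on binary-labeled data;
# here its (random) parameters merely stand in for the trained weights theta.
params = list(model.parameters())

xs = torch.randn(m, d, requires_grad=True)       # candidate training inputs
lambdas = torch.rand(m, requires_grad=True)      # dual variables (kept nonnegative)
ys = torch.tensor([1.0, -1.0]).repeat(m // 2)    # assumed balanced labels

opt = torch.optim.Adam([xs, lambdas], lr=1e-2)
for step in range(200):
    opt.zero_grad()
    acc = [torch.zeros_like(p) for p in params]
    for i in range(m):
        out = model(xs[i:i + 1]).squeeze()
        grads = torch.autograd.grad(out, params, create_graph=True)
        lam = torch.relu(lambdas[i])             # enforce lambda_i >= 0
        acc = [a + lam * ys[i] * g for a, g in zip(acc, grads)]
    # stationarity residual: distance between theta and the weighted gradient sum
    loss = sum(((p.detach() - a) ** 2).sum() for p, a in zip(params, acc))
    loss.backward()
    opt.step()
```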

Gradient Methods Provably Converge to Non-Robust Networks

no code implementations 9 Feb 2022 Gal Vardi, Gilad Yehudai, Ohad Shamir

Despite a great deal of research, it is still unclear why neural networks are so susceptible to adversarial examples.

Width is Less Important than Depth in ReLU Neural Networks

no code implementations 8 Feb 2022 Gal Vardi, Gilad Yehudai, Ohad Shamir

We solve an open question from Lu et al. (2017) by showing that any target network with inputs in $\mathbb{R}^d$ can be approximated by a width $O(d)$ network (independent of the target network's architecture), whose number of parameters is larger only by an essentially linear factor.

Open-Ended Question Answering

On the Optimal Memorization Power of ReLU Neural Networks

no code implementations ICLR 2022 Gal Vardi, Gilad Yehudai, Ohad Shamir

We prove that having such a large bit complexity is both necessary and sufficient for memorization with a sub-linear number of parameters.

Memorization

Learning a Single Neuron with Bias Using Gradient Descent

no code implementations NeurIPS 2021 Gal Vardi, Gilad Yehudai, Ohad Shamir

We theoretically study the fundamental problem of learning a single neuron with a bias term ($\mathbf{x} \mapsto \sigma(\langle\mathbf{w},\mathbf{x}\rangle + b)$) in the realizable setting with the ReLU activation, using gradient descent.
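
A minimal sketch of this setup, assuming standard Gaussian inputs, squared loss, and plain full-batch gradient descent (the dimensions, stepsize, and iteration count are arbitrary choices for illustration):

```python
# Gradient descent on a single ReLU neuron with bias, realizable labels.
import numpy as np

rng = np.random.default_rng(0)
d, n, lr, steps = 5, 2000, 0.1, 500

w_star, b_star = rng.normal(size=d), 0.5              # ground-truth neuron
X = rng.normal(size=(n, d))                           # standard Gaussian inputs
y = np.maximum(X @ w_star + b_star, 0.0)              # realizable ReLU labels

w, b = rng.normal(size=d) * 0.1, 0.0                  # student initialization
for _ in range(steps):
    pre = X @ w + b
    err = np.maximum(pre, 0.0) - y
    active = (pre > 0).astype(float)                  # ReLU derivative (a.e.)
    grad_w = (X * (err * active)[:, None]).mean(axis=0)
    grad_b = (err * active).mean()
    w -= lr * grad_w
    b -= lr * grad_b

print(np.linalg.norm(w - w_star), abs(b - b_star))    # distance to the target neuron
```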

The Connection Between Approximation, Depth Separation and Learnability in Neural Networks

no code implementations 31 Jan 2021 Eran Malach, Gilad Yehudai, Shai Shalev-Shwartz, Ohad Shamir

On the other hand, the fact that deep networks can efficiently express a target function does not mean that this target function can be learned efficiently by deep neural networks.

From Local Structures to Size Generalization in Graph Neural Networks

no code implementations 17 Oct 2020 Gilad Yehudai, Ethan Fetaya, Eli Meirom, Gal Chechik, Haggai Maron

In this paper, we identify an important type of data where generalization from small to large graphs is challenging: graph distributions for which the local structure depends on the graph size.

Combinatorial Optimization · Domain Adaptation · +2

On Size Generalization in Graph Neural Networks

no code implementations 28 Sep 2020 Gilad Yehudai, Ethan Fetaya, Eli Meirom, Gal Chechik, Haggai Maron

We further demonstrate on several tasks that training GNNs on small graphs results in solutions which do not generalize to larger graphs.

Combinatorial Optimization · Domain Adaptation · +1

The Effects of Mild Over-parameterization on the Optimization Landscape of Shallow ReLU Neural Networks

1 code implementation 1 Jun 2020 Itay Safran, Gilad Yehudai, Ohad Shamir

We prove that while the objective is strongly convex around the global minima when the teacher and student networks possess the same number of neurons, it is not even locally convex after any amount of over-parameterization.

Proving the Lottery Ticket Hypothesis: Pruning is All You Need

no code implementations ICML 2020 Eran Malach, Gilad Yehudai, Shai Shalev-Shwartz, Ohad Shamir

The lottery ticket hypothesis (Frankle and Carbin, 2018) states that a randomly-initialized network contains a small subnetwork that, when trained in isolation, can compete with the performance of the original network.
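
A toy, brute-force illustration of the "pruning only" idea: take a tiny random two-layer ReLU network, never train its weights, and search over binary masks for the subnetwork that best fits a target function. The sizes and target below are arbitrary, and with such a small random net the best mask is typically only a crude fit; the paper's point is that a sufficiently over-parameterized random network provably contains a good subnetwork.

```python
# Brute-force mask search over a small random, untrained two-layer ReLU net.
import itertools
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(64, 1))
target = np.maximum(X[:, 0], 0.0)                     # target: a single ReLU

W1 = rng.normal(size=(1, 4))                          # random first layer (4 hidden units)
w2 = rng.normal(size=4)                               # random output layer

best_err, best_mask = np.inf, None
for mask_bits in itertools.product([0, 1], repeat=8): # mask all 4 + 4 weights
    m1 = np.array(mask_bits[:4], dtype=float)[None, :]
    m2 = np.array(mask_bits[4:], dtype=float)
    pred = np.maximum(X @ (W1 * m1), 0.0) @ (w2 * m2)
    err = np.mean((pred - target) ** 2)
    if err < best_err:
        best_err, best_mask = err, mask_bits

print(best_err, best_mask)                            # best subnetwork found by pruning alone
```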

Learning a Single Neuron with Gradient Methods

no code implementations 15 Jan 2020 Gilad Yehudai, Ohad Shamir

We consider the fundamental problem of learning a single neuron $x \mapsto\sigma(w^\top x)$ using standard gradient methods.

On the Power and Limitations of Random Features for Understanding Neural Networks

no code implementations NeurIPS 2019 Gilad Yehudai, Ohad Shamir

Recently, a spate of papers have provided positive theoretical results for training over-parameterized neural networks (where the network size is larger than what is needed to achieve low error).
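
The "random features" model studied here corresponds to freezing a randomly initialized first layer and training only the linear output layer on top of it. A minimal sketch, with the target function, width, and ridge regularization chosen arbitrarily for illustration:

```python
# Random-features regression: fixed random ReLU features + trained linear head.
import numpy as np

rng = np.random.default_rng(0)
d, width, n = 10, 512, 1000

X = rng.normal(size=(n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)        # target function plus noise

W = rng.normal(size=(d, width)) / np.sqrt(d)          # random, *untrained* first layer
Phi = np.maximum(X @ W, 0.0)                          # ReLU random features

lam = 1e-3                                            # ridge regularization
a = np.linalg.solve(Phi.T @ Phi + lam * np.eye(width), Phi.T @ y)

X_test = rng.normal(size=(200, d))
pred = np.maximum(X_test @ W, 0.0) @ a
print(np.mean((pred - np.sin(X_test[:, 0])) ** 2))    # test error vs. the noiseless target
```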
