Search Results for author: Amit Daniely

Found 36 papers, 1 paper with code

Asynchronous Stochastic Optimization Robust to Arbitrary Delays

no code implementations NeurIPS 2021 Alon Cohen, Amit Daniely, Yoel Drori, Tomer Koren, Mariano Schain

In the general non-convex smooth optimization setting, we give a simple and efficient algorithm that requires $O( \sigma^2/\epsilon^4 + \tau/\epsilon^2 )$ steps for finding an $\epsilon$-stationary point $x$, where $\tau$ is the \emph{average} delay $\smash{\frac{1}{T}\sum_{t=1}^T d_t}$ and $\sigma^2$ is the variance of the stochastic gradients.

Distributed Optimization
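The bound above is stated in terms of the average delay of the stochastic gradients. Purely as an illustration of the setting (the objective, function names, and delay model here are invented for the sketch, not taken from the paper), a minimal Python sketch of SGD in which each gradient is computed at an earlier iterate and applied only after a random delay:

```python
import random

def delayed_sgd(grad, x0, steps, lr, max_delay, seed=0):
    """SGD where each stochastic gradient arrives after a random delay.

    The gradient applied at step t was computed at an iterate from up to
    `max_delay` steps earlier, mimicking asynchronous workers.
    """
    rng = random.Random(seed)
    x = x0
    pending = []  # (arrival_step, gradient) pairs
    for t in range(steps):
        # dispatch a gradient computation at the current iterate
        pending.append((t + rng.randint(0, max_delay), grad(x)))
        # apply every gradient whose delay has elapsed
        due = [g for a, g in pending if a <= t]
        pending = [(a, g) for a, g in pending if a > t]
        for g in due:
            x -= lr * g
    return x

# toy objective f(x) = x^2, whose stochastic gradient is 2x plus noise
noise = random.Random(1)
x_final = delayed_sgd(lambda x: 2 * x + noise.gauss(0, 0.1),
                      x0=5.0, steps=2000, lr=0.01, max_delay=10)
```

With a small step size the iterates remain stable despite stale gradients, which is the regime the paper's average-delay analysis quantifies.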

An Exact Poly-Time Membership-Queries Algorithm for Extracting a Three-Layer ReLU Network

no code implementations 20 May 2021 Amit Daniely, Elad Granot

As machine learning becomes increasingly prevalent in our everyday lives, many organizations offer neural-network-based services as a black box.

From Local Pseudorandom Generators to Hardness of Learning

no code implementations 20 Jan 2021 Amit Daniely, Gal Vardi

We also establish lower bounds on the complexity of learning intersections of a constant number of halfspaces, and ReLU networks with a constant number of hidden neurons.

Most ReLU Networks Suffer from $\ell^2$ Adversarial Perturbations

no code implementations NeurIPS 2020 Amit Daniely, Hadas Schacham

We consider ReLU networks with random weights, in which the dimension decreases at each layer.

Hardness of Learning Neural Networks with Natural Weights

no code implementations NeurIPS 2020 Amit Daniely, Gal Vardi

A natural approach to settle the discrepancy is to assume that the network's weights are "well-behaved" and possess some generic properties that may allow efficient learning.

Memorizing Gaussians with no over-parameterization via gradient descent on neural networks

no code implementations 28 Mar 2020 Amit Daniely

We prove that a single step of gradient descent over a depth-two network, with $q$ hidden neurons, starting from orthogonal initialization, can memorize $\Omega\left(\frac{dq}{\log^4(d)}\right)$ independent and randomly labeled Gaussians in $\mathbb{R}^d$.

Learning Parities with Neural Networks

no code implementations NeurIPS 2020 Amit Daniely, Eran Malach

On the other hand, under the same distributions, these parities cannot be learned efficiently by linear methods.

On the Complexity of Minimizing Convex Finite Sums Without Using the Indices of the Individual Functions

no code implementations 9 Feb 2020 Yossi Arjevani, Amit Daniely, Stefanie Jegelka, Hongzhou Lin

Recent advances in randomized incremental methods for minimizing $L$-smooth $\mu$-strongly convex finite sums have culminated in tight complexity of $\tilde{O}((n+\sqrt{n L/\mu})\log(1/\epsilon))$ and $O(n+\sqrt{nL/\epsilon})$, where $\mu>0$ and $\mu=0$, respectively, and $n$ denotes the number of individual functions.

Neural Networks Learning and Memorization with (almost) no Over-Parameterization

no code implementations NeurIPS 2020 Amit Daniely

Many results in recent years established polynomial-time learnability of various models via neural network algorithms.

Generalization Bounds for Neural Networks via Approximate Description Length

no code implementations NeurIPS 2019 Amit Daniely, Elad Granot

We show that for any depth $t$, if the inputs are in $[-1, 1]^d$, the sample complexity of $H$ is $\tilde O\left(\frac{dR^2}{\epsilon^2}\right)$.

Generalization Bounds

The Implicit Bias of Depth: How Incremental Learning Drives Generalization

1 code implementation ICLR 2020 Daniel Gissin, Shai Shalev-Shwartz, Amit Daniely

A leading hypothesis for the surprising generalization of neural networks is that the dynamics of gradient descent bias the model towards simple solutions, by searching through the solution space in an incremental order of complexity.

Incremental Learning

On the Optimality of Trees Generated by ID3

no code implementations 11 Jul 2019 Alon Brutzkus, Amit Daniely, Eran Malach

Since its inception in the 1980s, ID3 has become one of the most successful and widely used algorithms for learning decision trees.
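For context, ID3 grows a decision tree greedily: at each node it splits on the feature that maximizes information gain (the reduction in label entropy). A minimal sketch of that split criterion (illustrative only, not the paper's analysis; all names here are invented):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label sequence, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(examples, labels, feature):
    """Entropy reduction from splitting on a categorical feature."""
    by_value = {}
    for x, y in zip(examples, labels):
        by_value.setdefault(x[feature], []).append(y)
    remainder = sum(len(ys) / len(labels) * entropy(ys)
                    for ys in by_value.values())
    return entropy(labels) - remainder

def best_split(examples, labels):
    """The feature ID3 would pick at this node."""
    n_features = len(examples[0])
    return max(range(n_features),
               key=lambda f: information_gain(examples, labels, f))

# feature 0 determines the label, feature 1 is noise,
# so ID3's greedy criterion selects feature 0 at the root
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 0, 1, 1]
root_feature = best_split(X, y)
```

Recursing on each value of the chosen feature, until leaves are pure, yields the full ID3 tree.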

ID3 Learns Juntas for Smoothed Product Distributions

no code implementations 20 Jun 2019 Alon Brutzkus, Amit Daniely, Eran Malach

In recent years, there have been many attempts to understand popular heuristics.

Competitive ratio versus regret minimization: achieving the best of both worlds

no code implementations 7 Apr 2019 Amit Daniely, Yishay Mansour

Our end result is an online algorithm that can combine a "base" online algorithm, having a guaranteed competitive ratio, with a range of online algorithms that guarantee a small regret over any interval of time.

Locally Private Learning without Interaction Requires Separation

no code implementations NeurIPS 2019 Amit Daniely, Vitaly Feldman

The only lower bound we are aware of is for PAC learning an artificial class of functions with respect to a uniform distribution (Kasiviswanathan et al. 2011).

Learning Rules-First Classifiers

no code implementations 8 Mar 2018 Deborah Cohen, Amit Daniely, Amir Globerson, Gal Elidan

Complex classifiers may exhibit "embarrassing" failures in cases where humans can easily provide a justified classification.

General Classification Sentiment Analysis

Random Features for Compositional Kernels

no code implementations 22 Mar 2017 Amit Daniely, Roy Frostig, Vineet Gupta, Yoram Singer

We describe and analyze a simple random feature scheme (RFS) from prescribed compositional kernels.
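The paper's scheme targets compositional kernels; as a point of reference, the classic random Fourier feature construction for the Gaussian kernel follows the same template of random projections followed by a pointwise nonlinearity. A sketch (this is the standard Rahimi–Recht construction, not the paper's RFS; parameter names are invented):

```python
import numpy as np

def random_fourier_features(X, n_features, gamma, seed=0):
    """Map X so that inner products of the features approximate
    the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # random projections with variance matched to the kernel bandwidth
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

x = np.array([[0.0, 0.0]])
y = np.array([[0.5, 0.5]])
zx = random_fourier_features(x, 5000, gamma=1.0)
zy = random_fourier_features(y, 5000, gamma=1.0)
approx = (zx @ zy.T).item()                    # feature-space inner product
exact = float(np.exp(-np.sum((x - y) ** 2)))   # true RBF kernel value
```

The approximation error decays like $O(1/\sqrt{D})$ in the number of random features $D$, which is the kind of guarantee the compositional-kernel scheme generalizes.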

SGD Learns the Conjugate Kernel Class of the Network

no code implementations NeurIPS 2017 Amit Daniely

We show that the standard stochastic gradient descent (SGD) algorithm is guaranteed to learn, in polynomial time, a function that is competitive with the best function in the conjugate kernel space of the network, as defined in Daniely, Frostig and Singer.

Depth Separation for Neural Networks

no code implementations 27 Feb 2017 Amit Daniely

As many functions of the above form can be well approximated by poly-size depth three networks with poly-bounded weights, this establishes a separation between depth two and depth three networks w.r.t. the uniform distribution on $\mathbb{S}^{d-1}\times \mathbb{S}^{d-1}$.

Behavior-Based Machine-Learning: A Hybrid Approach for Predicting Human Decision Making

no code implementations 30 Nov 2016 Gali Noti, Effi Levi, Yoav Kolumbus, Amit Daniely

A large body of work in behavioral fields attempts to develop models that describe the way people, as opposed to rational agents, make decisions.

Decision Making

Sketching and Neural Networks

no code implementations 19 Apr 2016 Amit Daniely, Nevena Lazic, Yoram Singer, Kunal Talwar

In stark contrast, our approach of improper learning, using a larger hypothesis class, allows the sketch size to have a logarithmic dependence on the degree.

Distribution Free Learning with Local Queries

no code implementations 11 Mar 2016 Galit Bary-Weisberg, Amit Daniely, Shai Shalev-Shwartz

The model of learning with \emph{local membership queries} interpolates between the PAC model and the membership queries model by allowing the learner to query the label of any example that is similar to an example in the training set.

Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity

no code implementations NeurIPS 2016 Amit Daniely, Roy Frostig, Yoram Singer

We develop a general duality between neural networks and compositional kernels, striving towards a better understanding of deep learning.

Complexity Theoretic Limitations on Learning Halfspaces

no code implementations 21 May 2015 Amit Daniely

We show that no efficient learning algorithm has non-trivial worst-case performance even under the guarantees that $\mathrm{Err}_{\mathrm{HALF}}(\mathcal{D}) \le \eta$ for arbitrarily small constant $\eta>0$, and that $\mathcal{D}$ is supported in $\{\pm 1\}^n\times \{\pm 1\}$.

Strongly Adaptive Online Learning

no code implementations 25 Feb 2015 Amit Daniely, Alon Gonen, Shai Shalev-Shwartz

Strongly adaptive algorithms are algorithms whose performance on every time interval is close to optimal.

A PTAS for Agnostically Learning Halfspaces

no code implementations 26 Oct 2014 Amit Daniely

We present a PTAS for agnostically learning halfspaces w.r.t.

Learning Economic Parameters from Revealed Preferences

no code implementations 30 Jul 2014 Maria-Florina Balcan, Amit Daniely, Ruta Mehta, Ruth Urner, Vijay V. Vazirani

In this work we advance this line of work by providing sample complexity guarantees and efficient algorithms for a number of important classes.

Optimal Learners for Multiclass Problems

no code implementations 10 May 2014 Amit Daniely, Shai Shalev-Shwartz

Furthermore, we show that the sample complexity of these learners is better than the sample complexity of the ERM rule, thus settling in the negative an open question due to Collins (2005).

Complexity theoretic limitations on learning DNF's

no code implementations 13 Apr 2014 Amit Daniely, Shai Shalev-Shwartz

Using the recently developed framework of [Daniely et al., 2014], we show that under a natural assumption on the complexity of refuting random K-SAT formulas, learning DNF formulas is hard.

From average case complexity to improper learning complexity

no code implementations 10 Nov 2013 Amit Daniely, Nati Linial, Shai Shalev-Shwartz

The biggest challenge in proving complexity results is to establish hardness of {\em improper learning} (a.k.a.

Learning Theory

More data speeds up training time in learning halfspaces over sparse vectors

no code implementations NeurIPS 2013 Amit Daniely, Nati Linial, Shai Shalev-Shwartz

That is, if more data is available, beyond the sample complexity limit, is it possible to use the extra examples to speed up the computation time required to perform the learning task?

Multiclass learnability and the ERM principle

no code implementations 13 Aug 2013 Amit Daniely, Sivan Sabato, Shai Ben-David, Shai Shalev-Shwartz

We study the sample complexity of multiclass prediction in several learning settings.

General Classification

The price of bandit information in multiclass online classification

no code implementations 5 Feb 2013 Amit Daniely, Tom Helbertal

We consider two scenarios of multiclass online learning of a hypothesis class $H\subseteq Y^X$.

Classification General Classification

The complexity of learning halfspaces using generalized linear methods

no code implementations 3 Nov 2012 Amit Daniely, Nati Linial, Shai Shalev-Shwartz

The best approximation ratio achievable by an efficient algorithm is $O\left(\frac{1/\gamma}{\sqrt{\log(1/\gamma)}}\right)$ and is achieved using an algorithm from the above class.
