
no code implementations • NeurIPS 2021 • Alon Cohen, Amit Daniely, Yoel Drori, Tomer Koren, Mariano Schain

In the general non-convex smooth optimization setting, we give a simple and efficient algorithm that requires $O( \sigma^2/\epsilon^4 + \tau/\epsilon^2 )$ steps for finding an $\epsilon$-stationary point $x$, where $\tau$ is the \emph{average} delay $\smash{\frac{1}{T}\sum_{t=1}^T d_t}$ and $\sigma^2$ is the variance of the stochastic gradients.

no code implementations • 20 May 2021 • Amit Daniely, Elad Granot

As machine learning increasingly becomes more prevalent in our everyday life, many organizations offer neural-networks based services as a black-box.

no code implementations • 20 Jan 2021 • Amit Daniely, Gal Vardi

We also establish lower bounds on the complexity of learning intersections of a constant number of halfspaces, and ReLU networks with a constant number of hidden neurons.

no code implementations • NeurIPS 2020 • Amit Daniely, Hadas Schacham

We consider ReLU networks with random weights, in which the dimension decreases at each layer.

no code implementations • NeurIPS 2020 • Amit Daniely, Gal Vardi

A natural approach to settle the discrepancy is to assume that the network's weights are "well-behaved" and possess some generic properties that may allow efficient learning.

no code implementations • 28 Mar 2020 • Amit Daniely

We prove that a single step of gradient descent over a depth-two network, with $q$ hidden neurons, starting from orthogonal initialization, can memorize $\Omega\left(\frac{dq}{\log^4(d)}\right)$ independent and randomly labeled Gaussians in $\mathbb{R}^d$.

no code implementations • NeurIPS 2020 • Amit Daniely, Eran Malach

On the other hand, under the same distributions, these parities cannot be learned efficiently by linear methods.

no code implementations • 9 Feb 2020 • Yossi Arjevani, Amit Daniely, Stefanie Jegelka, Hongzhou Lin

Recent advances in randomized incremental methods for minimizing $L$-smooth $\mu$-strongly convex finite sums have culminated in tight complexity of $\tilde{O}((n+\sqrt{n L/\mu})\log(1/\epsilon))$ and $O(n+\sqrt{nL/\epsilon})$, where $\mu>0$ and $\mu=0$, respectively, and $n$ denotes the number of individual functions.

no code implementations • NeurIPS 2020 • Amit Daniely

Many results in recent years have established polynomial-time learnability of various models via neural network algorithms.

no code implementations • NeurIPS 2019 • Amit Daniely, Elad Granot

We show that for any depth $t$, if the inputs are in $[-1, 1]^d$, the sample complexity of $H$ is $\tilde O\left(\frac{dR^2}{\epsilon^2}\right)$.

1 code implementation • ICLR 2020 • Daniel Gissin, Shai Shalev-Shwartz, Amit Daniely

A leading hypothesis for the surprising generalization of neural networks is that the dynamics of gradient descent bias the model towards simple solutions, by searching through the solution space in an incremental order of complexity.
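This incremental-complexity bias can be illustrated with a standard, well-known fact (a sketch, not the paper's setting): for underdetermined least squares, gradient descent started at zero converges to the minimum-norm interpolating solution, one concrete sense in which the dynamics prefer "simple" solutions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 50))  # 10 examples, 50 features: underdetermined
y = rng.normal(size=10)

# Gradient descent on 0.5 * ||Xw - y||^2 from zero initialization.
# The iterate stays in the row space of X, so it converges to the
# minimum-norm interpolant rather than an arbitrary one.
lr = 1.0 / np.linalg.norm(X, ord=2) ** 2  # step size below 2/L for stability
w = np.zeros(50)
for _ in range(10000):
    w -= lr * X.T @ (X @ w - y)

w_min_norm = np.linalg.pinv(X) @ y  # minimum-norm solution via pseudo-inverse
print(np.allclose(w, w_min_norm, atol=1e-6))
```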

no code implementations • 11 Jul 2019 • Alon Brutzkus, Amit Daniely, Eran Malach

Since its inception in the 1980s, ID3 has become one of the most successful and widely used algorithms for learning decision trees.
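For reference, ID3's core step is the greedy information-gain split; the textbook version (not the paper's analysis, and with hypothetical attribute names) looks like this:

```python
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy of a label multiset, in bits.
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    # Entropy reduction from partitioning the examples by attr's value.
    by_value = {}
    for row, y in zip(rows, labels):
        by_value.setdefault(row[attr], []).append(y)
    remainder = sum(len(ys) / len(labels) * entropy(ys)
                    for ys in by_value.values())
    return entropy(labels) - remainder

def best_split(rows, labels):
    # ID3 greedily picks the attribute with maximum information gain.
    return max(rows[0].keys(), key=lambda a: information_gain(rows, labels, a))

# Toy data: "outlook" perfectly predicts the label, "wind" is uninformative.
rows = [{"outlook": "sun", "wind": "low"},
        {"outlook": "sun", "wind": "high"},
        {"outlook": "rain", "wind": "low"},
        {"outlook": "rain", "wind": "high"}]
labels = [1, 1, 0, 0]
print(best_split(rows, labels))  # -> outlook
```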

no code implementations • 20 Jun 2019 • Alon Brutzkus, Amit Daniely, Eran Malach

In recent years, there have been many attempts to understand popular heuristics.

no code implementations • 7 Apr 2019 • Amit Daniely, Yishay Mansour

Our end result is an online algorithm that can combine a "base" online algorithm, having a guaranteed competitive ratio, with a range of online algorithms that guarantee a small regret over any interval of time.

no code implementations • NeurIPS 2019 • Amit Daniely, Vitaly Feldman

The only lower bound we are aware of is for PAC learning an artificial class of functions with respect to a uniform distribution (Kasiviswanathan et al. 2011).

no code implementations • 7 May 2018 • Craig Boutilier, Alon Cohen, Amit Daniely, Avinatan Hassidim, Yishay Mansour, Ofer Meshi, Martin Mladenov, Dale Schuurmans

From an RL perspective, we show that Q-learning with sampled action sets is sound.

no code implementations • 8 Mar 2018 • Deborah Cohen, Amit Daniely, Amir Globerson, Gal Elidan

Complex classifiers may exhibit "embarrassing" failures in cases where humans can easily provide a justified classification.

no code implementations • 22 Mar 2017 • Amit Daniely, Roy Frostig, Vineet Gupta, Yoram Singer

We describe and analyze a simple random feature scheme (RFS) from prescribed compositional kernels.
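As a hedged illustration of the random-feature idea (the best-known instance, random Fourier features for the Gaussian kernel, rather than the paper's general compositional-kernel scheme): an inner product of random cosine features approximates the kernel value.

```python
import numpy as np

rng = np.random.default_rng(1)
d, D = 5, 5000  # input dimension, number of random features

# For the Gaussian kernel k(x, y) = exp(-||x - y||^2 / 2), frequencies are
# drawn from the kernel's spectral density (standard normal), phases uniformly.
W = rng.normal(size=(D, d))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

def features(x):
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

x, y = rng.normal(size=d), rng.normal(size=d)
approx = features(x) @ features(y)
exact = np.exp(-0.5 * np.linalg.norm(x - y) ** 2)
print(abs(approx - exact))  # Monte Carlo error shrinks like 1/sqrt(D)
```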

no code implementations • NeurIPS 2017 • Amit Daniely

We show that the standard stochastic gradient descent (SGD) algorithm is guaranteed to learn, in polynomial time, a function that is competitive with the best function in the conjugate kernel space of the network, as defined in Daniely, Frostig and Singer.

no code implementations • 27 Feb 2017 • Amit Daniely

As many functions of the above form can be well approximated by poly-size depth three networks with poly-bounded weights, this establishes a separation between depth two and depth three networks w.r.t.\ the uniform distribution on $\mathbb{S}^{d-1}\times \mathbb{S}^{d-1}$.

no code implementations • 30 Nov 2016 • Gali Noti, Effi Levi, Yoav Kolumbus, Amit Daniely

A large body of work in behavioral fields attempts to develop models that describe the way people, as opposed to rational agents, make decisions.

no code implementations • 19 Apr 2016 • Amit Daniely, Nevena Lazic, Yoram Singer, Kunal Talwar

In stark contrast, our approach of improper learning, using a larger hypothesis class, allows the sketch size to have a logarithmic dependence on the degree.

no code implementations • 11 Mar 2016 • Galit Bary-Weisberg, Amit Daniely, Shai Shalev-Shwartz

The model of learning with \emph{local membership queries} interpolates between the PAC model and the membership queries model by allowing the learner to query the label of any example that is similar to an example in the training set.

no code implementations • NeurIPS 2016 • Amit Daniely, Roy Frostig, Yoram Singer

We develop a general duality between neural networks and compositional kernels, striving towards a better understanding of deep learning.

no code implementations • 21 May 2015 • Amit Daniely

We show that no efficient learning algorithm has non-trivial worst-case performance even under the guarantees that $\mathrm{Err}_{\mathrm{HALF}}(\mathcal{D}) \le \eta$ for arbitrarily small constant $\eta>0$, and that $\mathcal{D}$ is supported in $\{\pm 1\}^n\times \{\pm 1\}$.

no code implementations • 25 Feb 2015 • Amit Daniely, Alon Gonen, Shai Shalev-Shwartz

Strongly adaptive algorithms are algorithms whose performance on every time interval is close to optimal.

no code implementations • 26 Oct 2014 • Amit Daniely

We present a PTAS for agnostically learning halfspaces w.r.t.

no code implementations • 30 Jul 2014 • Maria-Florina Balcan, Amit Daniely, Ruta Mehta, Ruth Urner, Vijay V. Vazirani

In this work we advance this line of work by providing sample complexity guarantees and efficient algorithms for a number of important classes.

no code implementations • 10 May 2014 • Amit Daniely, Shai Shalev-Shwartz

Furthermore, we show that the sample complexity of these learners is better than the sample complexity of the ERM rule, thus settling in negative an open question due to Collins (2005).

no code implementations • 13 Apr 2014 • Amit Daniely, Shai Shalev-Shwartz

Using the recently developed framework of [Daniely et al, 2014], we show that under a natural assumption on the complexity of refuting random K-SAT formulas, learning DNF formulas is hard.

no code implementations • 10 Nov 2013 • Amit Daniely, Nati Linial, Shai Shalev-Shwartz

The biggest challenge in proving complexity results is to establish hardness of {\em improper learning} (a.k.a.

no code implementations • NeurIPS 2013 • Amit Daniely, Nati Linial, Shai Shalev-Shwartz

That is, if more data is available, beyond the sample complexity limit, is it possible to use the extra examples to speed up the computation time required to perform the learning task?

no code implementations • 13 Aug 2013 • Amit Daniely, Sivan Sabato, Shai Ben-David, Shai Shalev-Shwartz

We study the sample complexity of multiclass prediction in several learning settings.

no code implementations • 5 Feb 2013 • Amit Daniely, Tom Helbertal

We consider two scenarios of multiclass online learning of a hypothesis class $H\subseteq Y^X$.

no code implementations • NeurIPS 2012 • Amit Daniely, Sivan Sabato, Shai Shalev-Shwartz

We analyze both the estimation error and the approximation error of these methods.

no code implementations • 3 Nov 2012 • Amit Daniely, Nati Linial, Shai Shalev-Shwartz

The best approximation ratio achievable by an efficient algorithm is $O\left(\frac{1/\gamma}{\sqrt{\log(1/\gamma)}}\right)$ and is achieved using an algorithm from the above class.
