Search Results for author: Assaf Schuster

Found 11 papers, 2 papers with code

FOSI: Hybrid First and Second Order Optimization

1 code implementation • 16 Feb 2023 • Hadar Sivan, Moshe Gabel, Assaf Schuster

Popular machine learning approaches forgo second-order information due to the difficulty of computing curvature in high dimensions.

Audio Classification • Language Modelling +2
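
A minimal sketch of the hybrid first/second-order idea on a toy quadratic objective; this is not the authors' FOSI implementation, and all names and constants below are illustrative: estimate the few most-curved Hessian directions from Hessian-vector products, take a curvature-scaled step in that small subspace, and a plain gradient step on the rest.

    import numpy as np

    rng = np.random.default_rng(0)
    d = 50
    A = rng.standard_normal((d, d))
    H = A @ A.T / d + np.eye(d)               # SPD Hessian of a toy quadratic
    b = rng.standard_normal(d)

    def grad(x):                               # gradient of f(x) = 0.5 x'Hx - b'x
        return H @ x - b

    def hvp(v):                                # Hessian-vector product oracle
        return H @ v

    def top_directions(k, iters=100):
        # Orthogonal (power) iteration: estimate the k most-curved directions.
        V = rng.standard_normal((d, k))
        for _ in range(iters):
            V, _ = np.linalg.qr(np.column_stack([hvp(v) for v in V.T]))
        curvatures = np.array([v @ hvp(v) for v in V.T])
        return V, curvatures

    x, lr = np.zeros(d), 0.05
    V, lam = top_directions(k=5)
    for _ in range(200):
        g = grad(x)
        g_sub = V @ (V.T @ g)                  # gradient component in the curved subspace
        x -= V @ ((V.T @ g) / lam)             # curvature-scaled (Newton-like) step there
        x -= lr * (g - g_sub)                  # plain first-order step on the remainder
    print("final gradient norm:", np.linalg.norm(grad(x)))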

Unsupervised Frequent Pattern Mining for CEP

no code implementations • 28 Jul 2022 • Guy Shapira, Assaf Schuster

We present REDEEMER (REinforcement baseD cEp pattErn MinER), a novel reinforcement and active learning approach aimed at mining CEP patterns that expand the extracted knowledge while reducing the human effort required.

Active Learning • Descriptive +1
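
A schematic sketch of the active-learning loop the abstract describes, not the REDEEMER system itself; the pattern names, labeling oracle, and budget below are hypothetical: query the human only on the candidate patterns whose estimated value is most uncertain, and fold the answers back in as reward.

    import random

    random.seed(0)
    candidates = [f"pattern_{i}" for i in range(50)]   # hypothetical CEP pattern candidates
    value = {p: 0.5 for p in candidates}               # estimated usefulness, 0.5 = unknown

    def expert_label(pattern):                         # stand-in for the human expert
        return random.random() < 0.3                   # pretend ~30% of patterns are useful

    budget = 10                                        # cap on human labeling effort
    for _ in range(budget):
        # Active learning: query the candidate whose estimate is most uncertain.
        p = min(candidates, key=lambda q: abs(value[q] - 0.5))
        reward = 1.0 if expert_label(p) else 0.0       # expert feedback as reward
        value[p] = reward
        candidates.remove(p)

    print("patterns confirmed useful:", [p for p, v in value.items() if v == 1.0])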

Coin Flipping Neural Networks

no code implementations • 18 Jun 2022 • Yuval Sieradzki, Nitzan Hodos, Gal Yehuda, Assaf Schuster

We show that a CFNN can approximate the indicator of a $d$-dimensional ball to arbitrary accuracy with only 2 layers and $\mathcal{O}(1)$ neurons, where a 2-layer deterministic network was shown to require $\Omega(e^d)$ neurons, an exponential improvement (arXiv:1610.09887).
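
An illustrative toy, not the paper's construction: with access to random coin flips (here a random Gaussian direction), a single linear unit plus a threshold can test membership in the unit ball in any dimension, because the acceptance probability of the random test depends only on $\|x\|$.

    import numpy as np

    rng = np.random.default_rng(1)
    d, trials, t = 200, 4000, 1.0

    def inside_ball(x):
        # One coin flip = one random linear unit g.x followed by a threshold.
        # P(|g.x| <= t) for g ~ N(0, I) equals 2*Phi(t/||x||) - 1, a monotone
        # function of ||x||, so the empirical frequency over repeated flips
        # separates points inside the unit ball from points outside it.
        G = rng.standard_normal((trials, d))
        freq = np.mean(np.abs(G @ x) <= t)
        return freq >= 0.683                   # value of 2*Phi(1) - 1 at ||x|| = 1

    for r in (0.5, 0.9, 1.1, 2.0):
        x = rng.standard_normal(d)
        x *= r / np.linalg.norm(x)             # test point at radius r
        print(r, bool(inside_ball(x)))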

Learning Under Delayed Feedback: Implicitly Adapting to Gradient Delays

no code implementations • 23 Jun 2021 • Rotem Zamir Aviv, Ido Hakimi, Assaf Schuster, Kfir Y. Levy

We consider stochastic convex optimization problems, where several machines act asynchronously in parallel while sharing a common memory.
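
A minimal simulation of the delayed-feedback setting, not the paper's algorithm: gradients are computed on stale snapshots of the shared parameters and only applied several updates later; the fixed delay and toy quadratic objective below are assumptions for illustration.

    import numpy as np
    from collections import deque

    rng = np.random.default_rng(2)
    d, delay, lr = 20, 4, 0.05
    H = np.diag(rng.uniform(0.5, 2.0, d))      # convex quadratic f(x) = 0.5 x'Hx

    x = rng.standard_normal(d)                 # shared parameter vector
    pending = deque(x.copy() for _ in range(delay))  # snapshots workers read earlier

    for _ in range(500):
        stale = pending.popleft()              # a worker finishes a gradient computed
        g = H @ stale                          # on a stale snapshot of x...
        x -= lr * g                            # ...and applies it to shared memory
        pending.append(x.copy())               # another worker reads the fresh x
    print("f(x) after 500 delayed updates:", 0.5 * x @ H @ x)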

Gap-Aware Mitigation of Gradient Staleness

no code implementations • ICLR 2020 • Saar Barkai, Ido Hakimi, Assaf Schuster

In this paper we define the Gap as a measure of gradient staleness and propose Gap-Aware (GA), a novel asynchronous-distributed method that penalizes stale gradients in proportion to the Gap and performs well even when scaling to large numbers of workers.

Cloud Computing
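
A hedged sketch of the penalization idea, not the authors' exact rule; the gap proxy below (parameter drift measured in units of a typical recent step) is an assumption for illustration.

    import numpy as np

    def gap_aware_update(x, grad, x_at_read_time, avg_step_norm, lr):
        # Gap proxy: how far the shared parameters moved since the worker read
        # them, measured in units of a typical recent update.
        gap = np.linalg.norm(x - x_at_read_time) / max(avg_step_norm, 1e-12)
        scale = 1.0 / max(gap, 1.0)            # damp stale gradients linearly in the gap
        return x - lr * scale * grad

The scale never exceeds 1, so a fresh gradient passes through unchanged, while a gradient computed many updates ago is shrunk in proportion to how far the model has drifted since it was read.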

Adaptive Communication Bounds for Distributed Online Learning

no code implementations • 28 Nov 2019 • Michael Kamp, Mario Boley, Michael Mock, Daniel Keren, Assaf Schuster, Izchak Sharfman

The learning performance of such a protocol is intuitively optimal if approximately the same loss is incurred as in a hypothetical serial setting.
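
A sketch of the communication-saving idea behind this line of work, not necessarily the paper's exact protocol; the threshold, learning rate, and toy regression task are assumptions: each node trains locally, and synchronization is triggered only when some node's model drifts beyond a threshold from the last shared reference model.

    import numpy as np

    rng = np.random.default_rng(3)
    d, n_nodes, lr, delta = 10, 8, 0.1, 0.5
    w_true = rng.standard_normal(d)            # target linear model
    ref = np.zeros(d)                          # last broadcast reference model
    models = [ref.copy() for _ in range(n_nodes)]
    messages = 0

    for _ in range(200):
        for i in range(n_nodes):
            a = rng.standard_normal(d)         # one local regression example per round
            g = (a @ models[i] - a @ w_true) * a
            models[i] -= lr * g
        # Each node checks its own drift locally; any violation triggers a sync.
        if any(np.linalg.norm(m - ref) ** 2 > delta for m in models):
            ref = np.mean(models, axis=0)
            models = [ref.copy() for _ in range(n_nodes)]
            messages += n_nodes
    print("messages sent:", messages, "error:", np.linalg.norm(ref - w_true))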

Gap Aware Mitigation of Gradient Staleness

no code implementations • 24 Sep 2019 • Saar Barkai, Ido Hakimi, Assaf Schuster

In this paper we define the Gap as a measure of gradient staleness and propose Gap-Aware (GA), a novel asynchronous-distributed method that penalizes stale gradients in proportion to the Gap and performs well even when scaling to large numbers of workers.

Cloud Computing

Taming Momentum in a Distributed Asynchronous Environment

no code implementations • 26 Jul 2019 • Ido Hakimi, Saar Barkai, Moshe Gabel, Assaf Schuster

We propose DANA: a novel technique for asynchronous distributed SGD with momentum that mitigates gradient staleness by computing the gradient on an estimated future position of the model's parameters.

Distributed Computing
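
A minimal sketch of the gradient-at-estimated-future-position idea, not the authors' implementation; the delay estimate and toy objective are assumptions: before computing its gradient, a worker extrapolates along the shared momentum to approximate where the parameters will be once its gradient is finally applied.

    import numpy as np

    rng = np.random.default_rng(4)
    d, lr, beta, n_workers = 20, 0.05, 0.9, 4
    H = np.diag(rng.uniform(0.5, 2.0, d))      # toy quadratic objective

    x = rng.standard_normal(d)                 # shared parameters
    v = np.zeros(d)                            # shared momentum buffer

    for _ in range(500):
        delay = n_workers - 1                  # expected staleness of this gradient
        # Extrapolate along the momentum to estimate the parameters at apply time.
        lookahead = x - lr * v * sum(beta ** k for k in range(1, delay + 1))
        g = H @ lookahead                      # gradient at the estimated future point
        v = beta * v + g
        x = x - lr * v
    print("f(x) after 500 momentum updates:", 0.5 * x @ H @ x)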

DANA: Scalable Out-of-the-box Distributed ASGD Without Retuning

no code implementations • ICLR 2019 • Ido Hakimi, Saar Barkai, Moshe Gabel, Assaf Schuster

We propose DANA, a novel approach that scales out-of-the-box to large clusters using the same hyperparameters and learning schedule optimized for training on a single worker, while maintaining similar final accuracy without additional overhead.

Distributed Computing
