1 code implementation • 16 Feb 2023 • Hadar Sivan, Moshe Gabel, Assaf Schuster
Popular machine learning approaches forgo second-order information due to the difficulty of computing curvature in high dimensions.
no code implementations • 28 Jul 2022 • Guy Shapira, Assaf Schuster
We present REDEEMER (REinforcement baseD cEp pattErn MinER), a novel reinforcement and active learning approach for mining CEP patterns that expands the extracted knowledge while reducing the required human effort.
no code implementations • 18 Jun 2022 • Yuval Sieradzki, Nitzan Hodos, Gal Yehuda, Assaf Schuster
We show that a CFNN can approximate the indicator of a $d$-dimensional ball to arbitrary accuracy with only 2 layers and $\mathcal{O}(1)$ neurons, whereas a 2-layer deterministic network was shown to require $\Omega(e^d)$ neurons (arXiv:1610.09887), an exponential improvement.
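A toy numerical illustration (not the construction from the paper) of why injected randomness can capture radial structure cheaply: a single stochastic threshold with a uniformly random bias already encodes a smooth function of $\|x\|$ in expectation. All names below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50
x = rng.normal(size=d)
x = 0.6 * x / np.linalg.norm(x)       # a point with ||x|| = 0.6, inside the unit ball

# One "coin-flipping" threshold: output 1{||x||^2 <= b} with a random bias b ~ U[0, 1].
# For ||x|| <= 1 its expectation is exactly 1 - ||x||^2, so averaging a few independent
# flips recovers a radial function of x from a single stochastic comparison.
flips = (np.sum(x ** 2) <= rng.uniform(0.0, 1.0, size=10_000)).astype(float)
print(flips.mean(), "vs", 1.0 - np.sum(x ** 2))   # both approximately 0.64
```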
1 code implementation • USENIX Annual Technical Conference 2021 • Saar Eliad, Ido Hakimi, Alon De Jager, Mark Silberstein, Assaf Schuster
Fine-tuning is an increasingly common technique that leverages transfer learning to dramatically expedite the training of huge, high-quality models.
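A minimal PyTorch sketch of the transfer-learning pattern the sentence refers to: freeze a pretrained backbone and train only a small task head. The model and data here are toy stand-ins, not the huge models the paper targets, and this is not the paper's own training system.

```python
import torch
import torch.nn as nn

# Toy stand-in for a large pretrained backbone (in practice loaded from a checkpoint).
backbone = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 256), nn.ReLU())
head = nn.Linear(256, 10)                 # small task-specific head, trained from scratch

for p in backbone.parameters():           # freeze the backbone: only the head is updated
    p.requires_grad = False

opt = torch.optim.Adam(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):                   # toy fine-tuning loop on random data
    x = torch.randn(32, 128)
    y = torch.randint(0, 10, (32,))
    with torch.no_grad():                 # frozen backbone: no gradients needed here
        feats = backbone(x)
    loss = loss_fn(head(feats), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```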
no code implementations • 23 Jun 2021 • Rotem Zamir Aviv, Ido Hakimi, Assaf Schuster, Kfir Y. Levy
We consider stochastic convex optimization problems, where several machines act asynchronously in parallel while sharing a common memory.
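A minimal sketch of the setting described, assuming a convex least-squares objective and Python threads over a shared NumPy vector as a single-process stand-in for machines sharing common memory (under CPython's GIL the threads interleave rather than run truly in parallel; the pattern, not the speedup, is the point).

```python
import threading
import numpy as np

# A convex least-squares problem: minimize 0.5/n * ||A w - b||^2.
rng = np.random.default_rng(0)
A = rng.normal(size=(1000, 20))
b = A @ rng.normal(size=20) + 0.01 * rng.normal(size=1000)
w = np.zeros(20)                          # the shared parameter vector ("common memory")

def worker(seed, steps=2000, lr=1e-3, batch=8):
    local_rng = np.random.default_rng(seed)
    for _ in range(steps):
        idx = local_rng.integers(0, A.shape[0], size=batch)
        g = A[idx].T @ (A[idx] @ w - b[idx]) / batch   # minibatch gradient at the current w
        w[:] = w - lr * g                 # lock-free asynchronous write to shared memory

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("relative residual:", np.linalg.norm(A @ w - b) / np.linalg.norm(b))
```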
no code implementations • ICLR 2020 • Saar Barkai, Ido Hakimi, Assaf Schuster
In this paper we define the Gap as a measure of gradient staleness and propose Gap-Aware (GA), a novel asynchronous-distributed method that penalizes stale gradients in proportion to the Gap and performs well even when scaling to a large number of workers.
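A hedged sketch of the idea as stated in the abstract: measure how stale an incoming gradient is and dampen it in proportion. The staleness measure below (master updates since the worker read the parameters) is only a stand-in for the paper's Gap, and the names are illustrative, not the paper's API.

```python
import numpy as np

master_version = 0                                     # bumped on every applied update

def apply_stale_gradient(params, grad, lr, read_version):
    """Apply a possibly stale gradient, dampened in proportion to its staleness."""
    global master_version
    gap = max(1, master_version - read_version + 1)    # illustrative stand-in for the Gap
    master_version += 1
    return params - (lr / gap) * grad                  # a k-updates-stale gradient counts ~1/k

# Toy usage: the second gradient was read one master update ago, so it is halved.
w = np.zeros(3)
w = apply_stale_gradient(w, np.ones(3), lr=0.1, read_version=0)   # gap 1: full step
w = apply_stale_gradient(w, np.ones(3), lr=0.1, read_version=0)   # gap 2: half step
print(w)                                               # [-0.15 -0.15 -0.15]
```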
no code implementations • ICML 2020 • Gal Yehuda, Moshe Gabel, Assaf Schuster
Can deep neural networks learn to solve any task, and in particular problems of high complexity?
no code implementations • 28 Nov 2019 • Michael Kamp, Mario Boley, Michael Mock, Daniel Keren, Assaf Schuster, Izchak Sharfman
Intuitively, the learning performance of such a protocol is optimal if it incurs approximately the same loss as a hypothetical serial setting would.
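One plausible way to make this concrete (an illustrative formalization, not necessarily the paper's exact criterion): writing $\ell(f_t, z_t)$ for the loss on the $t$-th example, the protocol is optimal if $\sum_{t=1}^{T} \ell(f_t^{\text{dist}}, z_t) \approx \sum_{t=1}^{T} \ell(f_t^{\text{serial}}, z_t)$, i.e. distributing the computation costs almost nothing in cumulative loss.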
no code implementations • 24 Sep 2019 • Saar Barkai, Ido Hakimi, Assaf Schuster
In this paper we define the Gap as a measure of gradient staleness and propose Gap-Aware (GA), a novel asynchronous-distributed method that penalizes stale gradients in proportion to the Gap and performs well even when scaling to a large number of workers.
no code implementations • 26 Jul 2019 • Ido Hakimi, Saar Barkai, Moshe Gabel, Assaf Schuster
We propose DANA: a novel technique for asynchronous distributed SGD with momentum that mitigates gradient staleness by computing the gradient at an estimated future position of the model's parameters.
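A hedged sketch of the core idea as described in the abstract: before computing a gradient, extrapolate where the master parameters will be after the updates expected to arrive during the worker's delay, and evaluate the gradient there. The extrapolation rule below is an illustrative Nesterov-style guess, not the paper's exact estimator.

```python
import numpy as np

def dana_style_gradient(params, momentum_buf, grad_fn, lr, momentum, expected_delay):
    """Evaluate the gradient at an estimated future position of the parameters."""
    # Extrapolate roughly `expected_delay` further momentum steps ahead of the
    # current parameters (hypothetical rule, for illustration only).
    lookahead = params - lr * momentum * momentum_buf * expected_delay
    return grad_fn(lookahead)             # gradient at the estimated future point

# Toy usage on f(w) = 0.5 * ||w||^2, whose gradient is w itself.
w = np.array([1.0, -2.0])
buf = np.array([0.2, -0.4])
g = dana_style_gradient(w, buf, grad_fn=lambda v: v, lr=0.1, momentum=0.9, expected_delay=4)
print(g)                                  # evaluated at the lookahead point, not at w
```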
no code implementations • ICLR 2019 • Ido Hakimi, Saar Barkai, Moshe Gabel, Assaf Schuster
We propose DANA, a novel approach that scales out of the box to large clusters using the same hyperparameters and learning schedule optimized for training on a single worker, while maintaining similar final accuracy without additional overhead.