Search Results for author: Elad Hoffer

Found 27 papers, 17 papers with code

Energy awareness in low precision neural networks

no code implementations • 6 Feb 2022 • Nurit Spingarn Eliezer, Ron Banner, Elad Hoffer, Hilla Ben-Yaakov, Tomer Michaeli

Power consumption is a major obstacle in the deployment of deep neural networks (DNNs) on end devices.

Quantization

Logarithmic Unbiased Quantization: Simple 4-bit Training in Deep Learning

no code implementations • 19 Dec 2021 • Brian Chmiel, Ron Banner, Elad Hoffer, Hilla Ben Yaacov, Daniel Soudry

Based on this, we suggest a logarithmic unbiased quantization (LUQ) method to quantize both the forward and backward phases to 4-bit, achieving state-of-the-art results in 4-bit training without overhead.

Quantization
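
The snippet above only names the method, so here is a minimal NumPy sketch of the general idea behind unbiased logarithmic quantization: each magnitude is rounded stochastically between the two nearest powers of two, with probabilities chosen so the quantized tensor equals the original in expectation. This is an illustrative sketch under that assumption, not the paper's exact LUQ rule; a real 4-bit scheme would also restrict the exponent range, which is omitted here.

```python
import numpy as np

def log_quantize_unbiased(x, rng=None):
    """Hypothetical sketch: round |x| stochastically to a power of two so that
    the quantized value equals x in expectation (E[q] = x)."""
    rng = np.random.default_rng(0) if rng is None else rng
    sign, mag = np.sign(x), np.abs(x)
    out = np.zeros_like(x)
    nz = mag > 0
    lo = 2.0 ** np.floor(np.log2(mag[nz]))   # nearest power of two at or below |x|
    hi = 2.0 * lo                            # next power of two above
    p_up = (mag[nz] - lo) / (hi - lo)        # probability of rounding up, so E[q] = |x|
    out[nz] = np.where(rng.random(p_up.shape) < p_up, hi, lo)
    return sign * out

x = np.random.randn(6).astype(np.float32)
print(x)
print(log_quantize_unbiased(x))
```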

Logarithmic Unbiased Quantization: Practical 4-bit Training in Deep Learning

no code implementations • 29 Sep 2021 • Brian Chmiel, Ron Banner, Elad Hoffer, Hilla Ben Yaacov, Daniel Soudry

Based on this, we suggest a logarithmic unbiased quantization (LUQ) method to quantize both the forward and backward phases to 4-bit, achieving state-of-the-art results in 4-bit training.

Quantization

Beyond Quantization: Power aware neural networks

no code implementations • 29 Sep 2021 • Nurit Spingarn, Elad Hoffer, Ron Banner, Hilla Ben Yaacov, Tomer Michaeli

Power consumption is a major obstacle in the deployment of deep neural networks (DNNs) on end devices.

Quantization

MixSize: Training Convnets With Mixed Image Sizes for Improved Accuracy, Speed and Scale Resiliency

2 code implementations • 1 Jan 2021 • Elad Hoffer, Berry Weinstein, Itay Hubara, Tal Ben-Nun, Torsten Hoefler, Daniel Soudry

Although trained on images of a specific size, it is well established that CNNs can be used to evaluate a wide range of image sizes at test time, by adjusting the size of intermediate feature maps.
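
To make the property concrete, here is a small PyTorch sketch using a stock torchvision ResNet-18 as a stand-in for the models in the paper: because the classifier sits behind a global average pool, the network accepts several test-time resolutions and only the intermediate feature maps change size.

```python
import torch
from torchvision.models import resnet18

# Randomly initialized ResNet-18; the point is only the variable input size.
model = resnet18(num_classes=10).eval()

with torch.no_grad():
    for size in (128, 160, 224, 288):
        x = torch.randn(1, 3, size, size)   # dummy image at this resolution
        logits = model(x)
        print(size, tuple(logits.shape))    # classifier output shape stays (1, 10)
```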

Task Agnostic Continual Learning Using Online Variational Bayes with Fixed-Point Updates

1 code implementation • 1 Oct 2020 • Chen Zeno, Itay Golan, Elad Hoffer, Daniel Soudry

The optimal Bayesian solution for this requires an intractable online Bayes update to the weights posterior.

Continual Learning
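
For intuition about that online Bayes update, here is a toy NumPy sketch in a conjugate setting where it is exact: the posterior after each data batch becomes the prior for the next. For a posterior over network weights no such closed form exists, which is what motivates the variational approximation with fixed-point updates studied in the paper.

```python
import numpy as np

# Online Bayesian estimation of a scalar Gaussian mean with known noise variance.
rng = np.random.default_rng(0)
true_mean, noise_var = 2.0, 1.0

mu, var = 0.0, 10.0                      # prior N(mu, var) over the unknown mean
for step in range(5):
    batch = rng.normal(true_mean, np.sqrt(noise_var), size=20)
    post_var = 1.0 / (1.0 / var + len(batch) / noise_var)      # conjugate update
    post_mu = post_var * (mu / var + batch.sum() / noise_var)
    mu, var = post_mu, post_var          # posterior becomes the next prior
    print(f"step {step}: mean={mu:.3f}, var={var:.4f}")
```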

Neural gradients are near-lognormal: improved quantized and sparse training

no code implementations • ICLR 2021 • Brian Chmiel, Liad Ben-Uri, Moran Shkolnik, Elad Hoffer, Ron Banner, Daniel Soudry

While training can mostly be accelerated by reducing the time needed to propagate neural gradients back throughout the model, most previous works focus on the quantization/pruning of weights and activations.

Neural Network Compression • Quantization

The Knowledge Within: Methods for Data-Free Model Compression

no code implementations • CVPR 2020 • Matan Haroush, Itay Hubara, Elad Hoffer, Daniel Soudry

Then, we demonstrate how these samples can be used to calibrate and fine-tune quantized models without using any real data in the process.

Model Compression
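
A hedged PyTorch sketch of one common way to obtain such synthetic calibration samples: optimize random inputs so that the statistics of the activations feeding each BatchNorm layer match that layer's stored running statistics. The pretrained torchvision ResNet-18, batch size, and step count below are illustrative stand-ins, and the paper's exact generation and fine-tuning procedure may differ.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights

model = resnet18(weights=ResNet18_Weights.IMAGENET1K_V1).eval()
for p in model.parameters():
    p.requires_grad_(False)

# For every BatchNorm2d, record the batch statistics of its input alongside the
# running statistics stored in the pretrained model.
stats = []
def record(module, inputs, output):
    a = inputs[0]
    stats.append((a.mean(dim=(0, 2, 3)), a.var(dim=(0, 2, 3), unbiased=False),
                  module.running_mean, module.running_var))

hooks = [m.register_forward_hook(record)
         for m in model.modules() if isinstance(m, nn.BatchNorm2d)]

x = torch.randn(2, 3, 224, 224, requires_grad=True)   # synthetic calibration batch
opt = torch.optim.Adam([x], lr=0.05)

for _ in range(100):                                   # more steps give a closer match
    stats.clear()
    opt.zero_grad()
    model(x)
    loss = sum((m - rm).pow(2).mean() + (v - rv).pow(2).mean()
               for m, v, rm, rv in stats)
    loss.backward()
    opt.step()

for h in hooks:
    h.remove()
# x can now serve as "data-free" samples for calibrating a quantized copy of the model.
```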

Mix & Match: training convnets with mixed image sizes for improved accuracy, speed and scale resiliency

2 code implementations • 12 Aug 2019 • Elad Hoffer, Berry Weinstein, Itay Hubara, Tal Ben-Nun, Torsten Hoefler, Daniel Soudry

Although trained on images of a specific size, it is well established that CNNs can be used to evaluate a wide range of image sizes at test time, by adjusting the size of intermediate feature maps.

ACIQ: Analytical Clipping for Integer Quantization of neural networks

1 code implementation • ICLR 2019 • Ron Banner, Yury Nahshan, Elad Hoffer, Daniel Soudry

We analyze the trade-off between quantization noise and clipping distortion in low precision networks.

Quantization
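
To make the trade-off concrete, here is a small NumPy sketch that sweeps the clipping threshold for 4-bit symmetric uniform quantization of a Gaussian-like tensor and finds the threshold that minimizes the mean squared error empirically; ACIQ instead derives this optimal threshold analytically from the assumed input distribution.

```python
import numpy as np

def quantize_mse(x, clip, num_bits=4):
    """MSE of symmetric uniform quantization of x with clipping threshold `clip`."""
    max_q = 2 ** (num_bits - 1) - 1                   # 7 levels per side for 4-bit
    scale = clip / max_q
    q = np.clip(np.round(x / scale), -max_q, max_q) * scale
    return np.mean((x - q) ** 2)

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=100_000)                # stand-in for a weight/activation tensor

# A larger clip reduces clipping distortion but increases rounding noise;
# the best threshold balances the two.
clips = np.linspace(0.5, 5.0, 46)
errors = [quantize_mse(x, c) for c in clips]
best = clips[int(np.argmin(errors))]
print(f"empirically optimal 4-bit clip: {best:.2f}  (max|x| = {np.abs(x).max():.2f})")
```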

Augment your batch: better training with larger batches

1 code implementation • 27 Jan 2019 • Elad Hoffer, Tal Ben-Nun, Itay Hubara, Niv Giladi, Torsten Hoefler, Daniel Soudry

We analyze the effect of batch augmentation on gradient variance and show that it empirically improves convergence for a wide variety of deep neural networks and datasets.
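
For readers unfamiliar with the term, here is a hedged PyTorch sketch of the basic batch-augmentation recipe: each sample appears several times in the same batch, every copy with an independent random transform, so the per-sample gradient is averaged over augmentations. The transforms and replication factor below are illustrative choices, not the paper's exact setup.

```python
import torch
from torchvision import transforms

M = 4                                    # number of augmented copies per sample
augment = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
])

def augment_batch(images, labels, m=M):
    """images: (B, C, H, W) tensor, labels: (B,) tensor -> batch of size B*m."""
    copies = [torch.stack([augment(img) for img in images]) for _ in range(m)]
    return torch.cat(copies, dim=0), labels.repeat(m)

x = torch.rand(8, 3, 32, 32)
y = torch.randint(0, 10, (8,))
bx, by = augment_batch(x, y)
print(bx.shape, by.shape)                # torch.Size([32, 3, 32, 32]) torch.Size([32])
```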

Post-training 4-bit quantization of convolution networks for rapid-deployment

2 code implementations • 2 Oct 2018 • Ron Banner, Yury Nahshan, Elad Hoffer, Daniel Soudry

Convolutional neural networks require significant memory bandwidth and storage for intermediate computations, apart from substantial computing resources.

Quantization

Scalable Methods for 8-bit Training of Neural Networks

3 code implementations • NeurIPS 2018 • Ron Banner, Itay Hubara, Elad Hoffer, Daniel Soudry

Armed with this knowledge, we quantize the model parameters, activations and layer gradients to 8-bit, leaving at a higher precision only the final step in the computation of the weight gradients.

Quantization
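
A minimal sketch of the kind of rounding involved, assuming symmetric per-tensor "fake quantization" to 8 bits (quantize then dequantize); the paper's scheme is more elaborate, and in particular the final weight-gradient accumulation it keeps at higher precision is not modeled here.

```python
import torch

def quantize_int8(x):
    """Symmetric per-tensor fake quantization of a float tensor to 8-bit values."""
    scale = x.abs().max().clamp(min=1e-8) / 127.0   # map the largest magnitude to 127
    q = torch.clamp(torch.round(x / scale), -127, 127)
    return q * scale                                 # dequantize back to float

w = torch.randn(64, 128)
w_q = quantize_int8(w)
print((w - w_q).abs().max().item())                  # worst-case rounding error ~ scale/2
```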

Task Agnostic Continual Learning Using Online Variational Bayes

2 code implementations • 27 Mar 2018 • Chen Zeno, Itay Golan, Elad Hoffer, Daniel Soudry

However, research for scenarios in which task boundaries are unknown during training has been lacking.

Continual Learning

Norm matters: efficient and accurate normalization schemes in deep networks

4 code implementations • NeurIPS 2018 • Elad Hoffer, Ron Banner, Itay Golan, Daniel Soudry

Over the past few years, Batch-Normalization has been commonly used in deep networks, allowing faster training and high performance for a wide variety of applications.

On the Blindspots of Convolutional Networks

no code implementations • 14 Feb 2018 • Elad Hoffer, Shai Fine, Daniel Soudry

Deep convolutional networks have been the state-of-the-art approach for a wide variety of tasks over the last few years.

The Implicit Bias of Gradient Descent on Separable Data

2 code implementations • ICLR 2018 • Daniel Soudry, Elad Hoffer, Mor Shpigel Nacson, Suriya Gunasekar, Nathan Srebro

We examine gradient descent on unregularized logistic regression problems, with homogeneous linear predictors on linearly separable datasets.
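
A tiny NumPy sketch of this setting: full-batch gradient descent on the unregularized logistic loss over separable two-dimensional data. The norm of w keeps growing while its direction stabilizes, which is the implicit bias the paper characterizes (the limiting direction is the max-margin separator).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2)) + 3.0 * rng.choice([-1.0, 1.0], size=(n, 1))
y = np.sign(X[:, 0] + X[:, 1])           # labels defined by a separating direction

w, lr = np.zeros(2), 0.1
for t in range(1, 100_001):
    m = np.clip(y * (X @ w), -50, 50)    # clip only to avoid overflow in exp
    grad = -(X * (y / (1.0 + np.exp(m)))[:, None]).mean(axis=0)   # logistic-loss gradient
    w -= lr * grad
    if t in (10, 1_000, 100_000):
        print(f"t={t:>6}  ||w||={np.linalg.norm(w):7.3f}  direction={w / np.linalg.norm(w)}")
```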

Train longer, generalize better: closing the generalization gap in large batch training of neural networks

1 code implementation • NeurIPS 2017 • Elad Hoffer, Itay Hubara, Daniel Soudry

Following this hypothesis we conducted experiments to show empirically that the "generalization gap" stems from the relatively small number of updates rather than the batch size, and can be completely eliminated by adapting the training regime used.

Exponentially vanishing sub-optimal local minima in multilayer neural networks

1 code implementation • ICLR 2018 • Daniel Soudry, Elad Hoffer

We prove that, with high probability in the limit of $N\rightarrow\infty$ datapoints, the volume of differentiable regions of the empiric loss containing sub-optimal differentiable local minima is exponentially vanishing in comparison with the same volume of global minima, given standard normal input of dimension $d_{0}=\tilde{\Omega}\left(\sqrt{N}\right)$, and a more realistic number of $d_{1}=\tilde{\Omega}\left(N/d_{0}\right)$ hidden units.

Binary Classification

Spatial contrasting for deep unsupervised learning

no code implementations • 21 Nov 2016 • Elad Hoffer, Itay Hubara, Nir Ailon

Convolutional networks have marked their place over the last few years as the best-performing models for various visual tasks.

Semi-supervised deep learning by metric embedding

1 code implementation • 4 Nov 2016 • Elad Hoffer, Nir Ailon

Deep networks are successfully used as classification models yielding state-of-the-art results when trained on a large number of labeled samples.

General Classification

Deep unsupervised learning through spatial contrasting

no code implementations • 2 Oct 2016 • Elad Hoffer, Itay Hubara, Nir Ailon

Convolutional networks have marked their place over the last few years as the best-performing models for various visual tasks.

Deep metric learning using Triplet network

3 code implementations • 20 Dec 2014 • Elad Hoffer, Nir Ailon

Deep learning has proven itself as a successful set of models for learning useful semantic representations of data.

General Classification • Information Retrieval • +2
