Search Results for author: Ron Banner

Found 21 papers, 15 papers with code

Minimum Variance Unbiased N:M Sparsity for the Neural Gradients

no code implementations 21 Mar 2022 Brian Chmiel, Itay Hubara, Ron Banner, Daniel Soudry

We show that while minimization of the MSE works fine for pruning the weights and activations, it catastrophically fails for the neural gradients.

Energy awareness in low precision neural networks

no code implementations 6 Feb 2022 Nurit Spingarn Eliezer, Ron Banner, Elad Hoffer, Hilla Ben-Yaakov, Tomer Michaeli

Power consumption is a major obstacle in the deployment of deep neural networks (DNNs) on end devices.

Quantization

Graph Representation Learning via Aggregation Enhancement

2 code implementations 30 Jan 2022 Maxim Fishman, Chaim Baskin, Evgenii Zheltonozhskii, Almog David, Ron Banner, Avi Mendelson

Graph neural networks (GNNs) have become a powerful tool for processing graph-structured data but still face challenges in effectively aggregating and propagating information between layers, which limits their performance.

Data Augmentation, Graph Representation Learning

Accurate Neural Training with 4-bit Matrix Multiplications at Standard Formats

no code implementations 19 Dec 2021 Brian Chmiel, Ron Banner, Elad Hoffer, Hilla Ben Yaacov, Daniel Soudry

Reducing the computational footprint of the entire training process requires the quantization of the neural gradients, i.e., the loss gradients with respect to the outputs of intermediate neural layers.

Quantization

Logarithmic Unbiased Quantization: Practical 4-bit Training in Deep Learning

no code implementations 29 Sep 2021 Brian Chmiel, Ron Banner, Elad Hoffer, Hilla Ben Yaacov, Daniel Soudry

Based on this, we suggest a logarithmic unbiased quantization (LUQ) method to quantize both the forward and backward phases to 4-bit, achieving state-of-the-art results in 4-bit training.

Quantization

Beyond Quantization: Power aware neural networks

no code implementations 29 Sep 2021 Nurit Spingarn, Elad Hoffer, Ron Banner, Hilla Ben Yaacov, Tomer Michaeli

Power consumption is a major obstacle in the deployment of deep neural networks (DNNs) on end devices.

Quantization

Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks

1 code implementation NeurIPS 2021 Itay Hubara, Brian Chmiel, Moshe Island, Ron Banner, Seffi Naor, Daniel Soudry

Finally, to solve the problem of switching between different structure constraints, we suggest a method to convert a pre-trained model with unstructured sparsity to an N:M fine-grained block sparsity model with little to no training.

GAN "Steerability" without optimization

1 code implementation ICLR 2021 Nurit Spingarn-Eliezer, Ron Banner, Tomer Michaeli

However, all existing techniques rely on an optimization procedure to expose those directions, and offer no control over the degree of allowed interaction between different transformations.

Neural gradients are near-lognormal: improved quantized and sparse training

no code implementations ICLR 2021 Brian Chmiel, Liad Ben-Uri, Moran Shkolnik, Elad Hoffer, Ron Banner, Daniel Soudry

While training can mostly be accelerated by reducing the time needed to propagate neural gradients back throughout the model, most previous works focus on the quantization/pruning of weights and activations.

Neural Network Compression, Quantization

Robust Quantization: One Model to Rule Them All

1 code implementation NeurIPS 2020 Moran Shkolnik, Brian Chmiel, Ron Banner, Gil Shomron, Yury Nahshan, Alex Bronstein, Uri Weiser

Neural network quantization methods often involve simulating the quantization process during training, making the trained model highly dependent on the target bit-width and the precise way quantization is performed.

Quantization

Post training 4-bit quantization of convolutional networks for rapid-deployment

1 code implementation NeurIPS 2019 Ron Banner, Yury Nahshan, Daniel Soudry

Convolutional neural networks require significant memory bandwidth and storage for intermediate computations, apart from substantial computing resources.

Quantization

Loss Aware Post-training Quantization

2 code implementations 17 Nov 2019 Yury Nahshan, Brian Chmiel, Chaim Baskin, Evgenii Zheltonozhskii, Ron Banner, Alex M. Bronstein, Avi Mendelson

We show that with more aggressive quantization, the loss landscape becomes highly non-separable with steep curvature, making the selection of quantization parameters more challenging.

Quantization

CAT: Compression-Aware Training for bandwidth reduction

1 code implementation 25 Sep 2019 Chaim Baskin, Brian Chmiel, Evgenii Zheltonozhskii, Ron Banner, Alex M. Bronstein, Avi Mendelson

Our method trains the model to achieve low-entropy feature maps, which enables efficient compression at inference time using classical transform coding methods.

Quantization

Thanks for Nothing: Predicting Zero-Valued Activations with Lightweight Convolutional Neural Networks

1 code implementation ECCV 2020 Gil Shomron, Ron Banner, Moran Shkolnik, Uri Weiser

Convolutional neural networks (CNNs) achieve state-of-the-art results for various tasks at the price of high computational demands.

Feature Map Transform Coding for Energy-Efficient CNN Inference

1 code implementation 26 May 2019 Brian Chmiel, Chaim Baskin, Ron Banner, Evgenii Zheltonozhskii, Yevgeny Yermolin, Alex Karbachevsky, Alex M. Bronstein, Avi Mendelson

We analyze the performance of our approach on a variety of CNN architectures and demonstrate that an FPGA implementation of ResNet-18 with our approach results in a reduction of around 40% in the memory energy footprint, compared to a quantized network, with negligible impact on accuracy.

Video Compression

ACIQ: Analytical Clipping for Integer Quantization of neural networks

1 code implementation ICLR 2019 Ron Banner, Yury Nahshan, Elad Hoffer, Daniel Soudry

We analyze the trade-off between quantization noise and clipping distortion in low precision networks.

Quantization

Post-training 4-bit quantization of convolution networks for rapid-deployment

2 code implementations 2 Oct 2018 Ron Banner, Yury Nahshan, Elad Hoffer, Daniel Soudry

Convolutional neural networks require significant memory bandwidth and storage for intermediate computations, apart from substantial computing resources.

Quantization

Scalable Methods for 8-bit Training of Neural Networks

3 code implementations NeurIPS 2018 Ron Banner, Itay Hubara, Elad Hoffer, Daniel Soudry

Armed with this knowledge, we quantize the model parameters, activations and layer gradients to 8-bit, leaving at a higher precision only the final step in the computation of the weight gradients.

Quantization

Norm matters: efficient and accurate normalization schemes in deep networks

4 code implementations NeurIPS 2018 Elad Hoffer, Ron Banner, Itay Golan, Daniel Soudry

Over the past few years, Batch-Normalization has been commonly used in deep networks, allowing faster training and high performance for a wide variety of applications.