Search Results for author: Brian Chmiel

Found 13 papers, 9 papers with code

Bimodal Distributed Binarized Neural Networks

1 code implementation • 5 Apr 2022 • Tal Rozen, Moshe Kimhi, Brian Chmiel, Avi Mendelson, Chaim Baskin

The proposed method consists of a training scheme that we call Weight Distribution Mimicking (WDM), which efficiently imitates the full-precision network's weight distribution in its binary counterpart.

Binarization · Quantization
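
The excerpt above does not spell out the WDM objective, so the following is only a hedged sketch of the general "distribution mimicking" idea: match simple statistics of a full-precision weight tensor against its scaled sign-binarization. The function name `wdm_penalty` and the moment-matching form are assumptions for illustration, not the authors' formulation.

```python
import torch

def wdm_penalty(w_fp: torch.Tensor) -> torch.Tensor:
    """Hypothetical distribution-mimicking penalty: encourage the
    full-precision weights to stay close, in distribution, to their
    binarized counterpart by comparing first and second moments."""
    # Binarize with a per-tensor scale (mean absolute value), a common
    # choice in binary-network schemes.
    scale = w_fp.abs().mean()
    w_bin = scale * torch.sign(w_fp)
    # Match mean and variance of the two weight distributions.
    mean_gap = (w_fp.mean() - w_bin.mean()).pow(2)
    var_gap = (w_fp.var() - w_bin.var()).pow(2)
    return mean_gap + var_gap

if __name__ == "__main__":
    w = torch.randn(256, 128)
    print(float(wdm_penalty(w)))
```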

Optimal Fine-Grained N:M sparsity for Activations and Neural Gradients

1 code implementation • 21 Mar 2022 • Brian Chmiel, Itay Hubara, Ron Banner, Daniel Soudry

We show that while minimization of the MSE works fine for pruning the activations, it catastrophically fails for the neural gradients.
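
As a minimal, hedged illustration of what fine-grained N:M pruning and the MSE criterion mentioned above look like in practice, the sketch below keeps the two largest magnitudes in every group of four (2:4 sparsity) and measures the resulting MSE. The helper name `prune_n_m` is mine; the paper's optimal criterion for neural gradients is not reproduced here.

```python
import torch

def prune_n_m(x: torch.Tensor, n: int = 2, m: int = 4) -> torch.Tensor:
    """Keep the n largest-magnitude entries in every group of m
    consecutive elements (fine-grained N:M sparsity)."""
    flat = x.reshape(-1, m)
    # Indices of the top-n magnitudes within each group of m.
    idx = flat.abs().topk(n, dim=1).indices
    mask = torch.zeros_like(flat).scatter_(1, idx, 1.0)
    return (flat * mask).reshape_as(x)

if __name__ == "__main__":
    acts = torch.randn(8, 16)            # stand-in for activations
    pruned = prune_n_m(acts, n=2, m=4)   # 2:4 structured sparsity
    mse = (acts - pruned).pow(2).mean()  # the MSE criterion from the excerpt
    print(f"2:4 pruning MSE on activations: {mse:.4f}")
```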

Logarithmic Unbiased Quantization: Simple 4-bit Training in Deep Learning

no code implementations • 19 Dec 2021 • Brian Chmiel, Ron Banner, Elad Hoffer, Hilla Ben Yaacov, Daniel Soudry

Based on this, we suggest a logarithmic unbiased quantization (LUQ) method to quantize both the forward and backward phases to 4-bit, achieving state-of-the-art results in 4-bit training without overhead.

Quantization
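
To illustrate the core idea of logarithmic quantization made unbiased by stochastic rounding, here is a hedged sketch: round the magnitude to a neighboring power of two with a probability chosen so the quantized value is correct in expectation. Bit allocation, underflow thresholds, and the actual handling of the backward pass in LUQ are not reproduced; the function name is mine.

```python
import torch

def log_stochastic_quant(x: torch.Tensor) -> torch.Tensor:
    """Illustrative logarithmic quantizer: snap |x| to a power of two,
    rounding up or down stochastically so that the rounding step is
    unbiased (E[x_q] == x)."""
    sign = torch.sign(x)
    mag = x.abs().clamp_min(1e-12)
    lo = torch.floor(torch.log2(mag))
    # Probability of rounding up chosen to preserve the expected value:
    # p = (|x| - 2^lo) / (2^(lo+1) - 2^lo) = (|x| - 2^lo) / 2^lo.
    p = (mag - 2.0 ** lo) / (2.0 ** lo)
    up = (torch.rand_like(p) < p).to(x.dtype)
    return sign * 2.0 ** (lo + up)

if __name__ == "__main__":
    g = torch.rand(100000) * 1e-3 + 1e-5     # stand-in for gradient magnitudes
    gq = log_stochastic_quant(g)
    print(float(g.mean()), float(gq.mean())) # means should be close (unbiased)
```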

Logarithmic Unbiased Quantization: Practical 4-bit Training in Deep Learning

no code implementations • 29 Sep 2021 • Brian Chmiel, Ron Banner, Elad Hoffer, Hilla Ben Yaacov, Daniel Soudry

Based on this, we suggest a logarithmic unbiased quantization (LUQ) method to quantize both the forward and backward phase to 4-bit, achieving state-of-the-art results in 4-bit training.

Quantization

Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks

1 code implementation • NeurIPS 2021 • Itay Hubara, Brian Chmiel, Moshe Island, Ron Banner, Seffi Naor, Daniel Soudry

Finally, to solve the problem of switching between different structure constraints, we suggest a method to convert a pre-trained model with unstructured sparsity to an N:M fine-grained block sparsity model with little to no training.
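
The sketch below only illustrates what the transposable N:M constraint means, namely that the same mask satisfies N:M sparsity along rows and along columns so both W and W^T can be accelerated; the paper's provable mask-search method is not shown, and the helper names are assumptions.

```python
import torch

def is_transposable_n_m(mask: torch.Tensor, n: int = 2, m: int = 4) -> bool:
    """Check that a binary weight mask satisfies N:M sparsity along the
    rows *and* along the columns (the 'transposable' requirement)."""
    def ok(mat: torch.Tensor) -> bool:
        groups = mat.reshape(mat.shape[0], -1, m)   # groups of m along each row
        return bool((groups.sum(dim=-1) <= n).all())
    return ok(mask) and ok(mask.t().contiguous())

if __name__ == "__main__":
    # A greedy row-wise 2:4 mask from weight magnitudes; it is generally
    # NOT transposable, which is exactly the gap the paper addresses.
    w = torch.randn(8, 8)
    idx = w.abs().reshape(-1, 4).topk(2, dim=1).indices
    mask = torch.zeros(16, 4).scatter_(1, idx, 1.0).reshape(8, 8)
    print("greedy row-wise 2:4 mask is transposable:", is_transposable_n_m(mask))
```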

Neural gradients are near-lognormal: improved quantized and sparse training

no code implementations • ICLR 2021 • Brian Chmiel, Liad Ben-Uri, Moran Shkolnik, Elad Hoffer, Ron Banner, Daniel Soudry

While training can mostly be accelerated by reducing the time needed to propagate neural gradients back throughout the model, most previous works focus on the quantization/pruning of weights and activations.

Neural Network Compression · Quantization
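
A quick, hedged way to see what "near-lognormal gradients" means: if gradients g are lognormal, then log|g| should look Gaussian (in particular, close to zero skewness). The helper below just reports summary statistics of log|g|; it is not the paper's analysis pipeline.

```python
import torch

def lognormal_summary(grads: torch.Tensor):
    """Report mean, std and skewness of log|g|. Skewness near 0 is
    consistent with the lognormal-gradient observation."""
    logs = grads.abs().clamp_min(1e-12).log()
    mu = logs.mean()
    sigma = logs.std()
    skew = (((logs - mu) / sigma) ** 3).mean()
    return float(mu), float(sigma), float(skew)

if __name__ == "__main__":
    # Stand-in "gradients": a true lognormal sample gives skew(log|g|) ~ 0.
    g = torch.exp(torch.randn(100000) * 2.0 - 8.0)
    print(lognormal_summary(g))
```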

Colored Noise Injection for Training Adversarially Robust Neural Networks

no code implementations • 4 Mar 2020 • Evgenii Zheltonozhskii, Chaim Baskin, Yaniv Nemcovsky, Brian Chmiel, Avi Mendelson, Alex M. Bronstein

Even though deep learning has shown unmatched performance on various tasks, neural networks have been shown to be vulnerable to small adversarial perturbations of the input that lead to significant performance degradation.

Robust Quantization: One Model to Rule Them All

1 code implementation • NeurIPS 2020 • Moran Shkolnik, Brian Chmiel, Ron Banner, Gil Shomron, Yury Nahshan, Alex Bronstein, Uri Weiser

Neural network quantization methods often involve simulating the quantization process during training, making the trained model highly dependent on the target bit-width and on the precise way quantization is performed.

Quantization
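
For context, the "simulated quantization during training" mentioned above is conventionally implemented as quantize-dequantize with a straight-through estimator; the hedged sketch below shows that baseline, not the paper's robustness technique itself.

```python
import torch

def fake_quantize(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Standard simulated ('fake') uniform quantization: downstream layers
    see quantization error, while gradients flow via a straight-through
    estimator."""
    qmax = 2 ** num_bits - 1
    scale = (x.max() - x.min()).clamp_min(1e-8) / qmax
    zero = x.min()
    q = torch.round((x - zero) / scale).clamp(0, qmax)
    x_q = q * scale + zero
    # Straight-through estimator: forward uses x_q, backward is identity.
    return x + (x_q - x).detach()

if __name__ == "__main__":
    w = torch.randn(64, 64, requires_grad=True)
    loss = fake_quantize(w, num_bits=4).pow(2).sum()
    loss.backward()                       # gradients pass straight through
    print(w.grad.abs().mean().item())
```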

Loss Aware Post-training Quantization

2 code implementations • 17 Nov 2019 • Yury Nahshan, Brian Chmiel, Chaim Baskin, Evgenii Zheltonozhskii, Ron Banner, Alex M. Bronstein, Avi Mendelson

We show that with more aggressive quantization, the loss landscape becomes highly non-separable with steep curvature, making the selection of quantization parameters more challenging.

Quantization
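
To make "selection of quantization parameters" concrete, here is a hedged sketch of loss-aware selection of a clipping value: candidates are scored by a calibration loss rather than a local per-tensor error. The paper jointly optimizes all quantization parameters with a more sophisticated procedure; the helper names and the toy calibration objective below are assumptions.

```python
import torch

def quantize_with_clip(x: torch.Tensor, clip: float, num_bits: int = 4) -> torch.Tensor:
    """Symmetric uniform quantization with a given clipping value."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = clip / qmax
    return torch.round(x.clamp(-clip, clip) / scale) * scale

def pick_clip_loss_aware(w, calib_fn, candidates):
    """Pick the clipping value that minimizes a task loss on calibration
    data instead of a per-tensor error metric."""
    return min(candidates, key=lambda c: float(calib_fn(quantize_with_clip(w, c))))

if __name__ == "__main__":
    w = torch.randn(128, 64)
    x = torch.randn(32, 64)
    target = x @ w.t()                   # full-precision output as reference
    calib = lambda wq: (x @ wq.t() - target).pow(2).mean()
    grid = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
    print("loss-aware clip:", pick_clip_loss_aware(w, calib, grid))
```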

Smoothed Inference for Adversarially-Trained Models

2 code implementations • 17 Nov 2019 • Yaniv Nemcovsky, Evgenii Zheltonozhskii, Chaim Baskin, Brian Chmiel, Maxim Fishman, Alex M. Bronstein, Avi Mendelson

In this work, we study the application of randomized smoothing as a way to improve performance on unperturbed data as well as to increase robustness to adversarial attacks.

Adversarial Defense
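
A minimal sketch of randomized-smoothing style inference, the idea referenced above: average the model's softmax outputs over several Gaussian perturbations of the input. The specific smoothing variants studied in the paper are not reproduced; the model and noise level below are placeholders.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def smoothed_predict(model: nn.Module, x: torch.Tensor,
                     sigma: float = 0.25, n_samples: int = 32) -> torch.Tensor:
    """Average class probabilities over Gaussian perturbations of x."""
    probs = torch.zeros(x.shape[0], model(x).shape[1])
    for _ in range(n_samples):
        noisy = x + sigma * torch.randn_like(x)
        probs += torch.softmax(model(noisy), dim=1)
    return probs / n_samples

if __name__ == "__main__":
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # placeholder
    x = torch.randn(4, 3, 32, 32)
    print(smoothed_predict(model, x).argmax(dim=1))
```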

CAT: Compression-Aware Training for bandwidth reduction

1 code implementation • 25 Sep 2019 • Chaim Baskin, Brian Chmiel, Evgenii Zheltonozhskii, Ron Banner, Alex M. Bronstein, Avi Mendelson

Our method trains the model to achieve low-entropy feature maps, which enables efficient compression at inference time using classical transform coding methods.

Quantization
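
The quantity CAT drives down is the entropy of the feature maps, since lower entropy means classical entropy coders compress them more cheaply. The hedged sketch below just estimates that entropy from a histogram as an illustration; note the hard histogram is not differentiable, so it could not be used directly as a training penalty, and the function name is mine.

```python
import torch

def feature_map_entropy(fm: torch.Tensor, num_bins: int = 256) -> torch.Tensor:
    """Empirical entropy (bits per element) of a feature map, estimated
    from a histogram of its values."""
    flat = fm.flatten()
    lo, hi = float(flat.min()), float(flat.max())
    hist = torch.histc(flat, bins=num_bins, min=lo, max=hi)
    p = hist / hist.sum()
    p = p[p > 0]
    return -(p * torch.log2(p)).sum()

if __name__ == "__main__":
    dense = torch.randn(1, 64, 16, 16)           # high-entropy feature map
    sparse = torch.relu(dense) * (dense > 1.0)   # many zeros -> lower entropy
    print(float(feature_map_entropy(dense)), float(feature_map_entropy(sparse)))
```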

Feature Map Transform Coding for Energy-Efficient CNN Inference

1 code implementation • 26 May 2019 • Brian Chmiel, Chaim Baskin, Ron Banner, Evgenii Zheltonozhskii, Yevgeny Yermolin, Alex Karbachevsky, Alex M. Bronstein, Avi Mendelson

We analyze the performance of our approach on a variety of CNN architectures and demonstrate that an FPGA implementation of ResNet-18 with our approach reduces the memory energy footprint by around 40% compared to a quantized network, with negligible impact on accuracy.

Video Compression
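
To illustrate the general transform-coding idea behind the title (transform the feature maps, then quantize the coefficients), here is a hedged sketch using an orthonormal DCT-II over groups of 8 values. The block size, transform, and the entropy-coding stage of the actual pipeline are assumptions for illustration only.

```python
import math
import torch

def dct_matrix(n: int = 8) -> torch.Tensor:
    """Orthonormal DCT-II basis, the classical transform-coding transform."""
    k = torch.arange(n).float()
    basis = torch.cos(math.pi / n * (k[None, :] + 0.5) * k[:, None])
    basis[0] *= 1.0 / math.sqrt(2.0)
    return basis * math.sqrt(2.0 / n)

def compress_feature_map(fm: torch.Tensor, step: float = 0.5):
    """Apply a DCT along the last dimension in groups of 8, then uniformly
    quantize the coefficients. Returns the quantized coefficients and the
    fraction of zeros (a rough proxy for entropy-coder savings)."""
    d = dct_matrix(8)
    blocks = fm.reshape(-1, 8)            # last dim assumed divisible by 8
    coeffs = blocks @ d.t()
    q = torch.round(coeffs / step)
    return q, float((q == 0).float().mean())

if __name__ == "__main__":
    fm = torch.relu(torch.randn(1, 64, 16, 16))   # ReLU outputs are compressible
    _, zero_frac = compress_feature_map(fm)
    print(f"fraction of zero coefficients: {zero_frac:.2f}")
```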

Towards Learning of Filter-Level Heterogeneous Compression of Convolutional Neural Networks

2 code implementations • 22 Apr 2019 • Yochai Zur, Chaim Baskin, Evgenii Zheltonozhskii, Brian Chmiel, Itay Evron, Alex M. Bronstein, Avi Mendelson

While mainstream deep learning methods train the neural network's weights while keeping the network architecture fixed, the emerging neural architecture search (NAS) techniques make the latter also amenable to training.

Network Pruning · Neural Architecture Search · +1
