1 code implementation • 4 Oct 2024 • Moran Shkolnik, Maxim Fishman, Brian Chmiel, Hilla Ben-Yaacov, Ron Banner, Kfir Yehuda Levy
The combination of accelerating both $e^x$ and $\sum(e^x)$ results in a 36.9% acceleration in the softmax operation.
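The two sub-operations named above are the exponentiation and the reduction that make up softmax. A minimal sketch of the operation being accelerated (not the paper's accelerated implementation, just the baseline computation):

```python
import math

def softmax(xs):
    # Numerically stable softmax: shift by the max before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]  # the e^x step
    total = sum(exps)                     # the sum(e^x) reduction
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
```

Both the per-element `exp` and the summation dominate the cost, which is why speeding up each of them compounds into the reported overall acceleration.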
1 code implementation • NeurIPS 2023 • Niv Giladi, Shahar Gottlieb, Moran Shkolnik, Asaf Karnieli, Ron Banner, Elad Hoffer, Kfir Yehuda Levy, Daniel Soudry
Thus, these methods are limited by the delays caused by straggling workers.
no code implementations • ICLR 2021 • Brian Chmiel, Liad Ben-Uri, Moran Shkolnik, Elad Hoffer, Ron Banner, Daniel Soudry
While training can mostly be accelerated by reducing the time needed to propagate neural gradients back throughout the model, most previous works focus on the quantization/pruning of weights and activations.
1 code implementation • NeurIPS 2020 • Moran Shkolnik, Brian Chmiel, Ron Banner, Gil Shomron, Yury Nahshan, Alex Bronstein, Uri Weiser
Neural network quantization methods often involve simulating the quantization process during training, making the trained model highly dependent on the target bit-width and precise way quantization is performed.
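Simulating quantization during training ("fake quantization") means rounding values to the target bit-width and immediately dequantizing, so the model trains against quantization error. A minimal sketch assuming uniform affine quantization; the function name and parameters are illustrative, not the paper's implementation:

```python
def fake_quantize(xs, num_bits=8):
    # Quantize to num_bits, then dequantize back to float, so the
    # rounding error of the target bit-width is seen during training.
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(xs), max(xs)
    scale = max((hi - lo) / (qmax - qmin), 1e-8)  # guard against zero range
    zero_point = round(-lo / scale)

    def quantize(v):
        return min(max(round(v / scale) + zero_point, qmin), qmax)

    return [(quantize(v) - zero_point) * scale for v in xs]

w = [-1.0, -0.5, 0.0, 0.5, 1.0]
w_q = fake_quantize(w, num_bits=4)
```

Because `scale` and `zero_point` are derived for a specific `num_bits`, a model trained this way is tied to that bit-width and rounding scheme, which is the dependence the excerpt describes.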
1 code implementation • ECCV 2020 • Gil Shomron, Ron Banner, Moran Shkolnik, Uri Weiser
Convolutional neural networks (CNNs) deliver state-of-the-art results for various tasks at the price of high computational demands.