1 code implementation • NeurIPS 2021 • Gil Shomron, Freddy Gabbay, Samer Kurzum, Uri Weiser
Moreover, instead of quantizing activation-by-activation to 4 bits, we focus on pairs of 8-bit activations and examine whether one of the two is equal to zero.
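As a rough illustration of the pair-wise check described in this excerpt, the sketch below looks at two unsigned 8-bit activations at a time. The excerpt only states that the pair is examined for a zero; what happens next is an assumption made here for illustration: if either value is zero, the other keeps its full 8 bits, otherwise both are reduced to 4 bits by keeping the high nibble. The names `quantize_pair` and `quantize_activations` are hypothetical and do not come from the paper's code.

```python
def quantize_pair(a, b):
    """Fit two unsigned 8-bit activations into a shared 8-bit budget.

    Assumption (not stated in the excerpt): if either value is zero, the
    other keeps its full 8 bits; otherwise both are truncated to their
    high nibble, i.e. 4 bits each.
    """
    if not (0 <= a <= 255 and 0 <= b <= 255):
        raise ValueError("activations must be unsigned 8-bit values")
    if a == 0 or b == 0:
        return a, b  # the non-zero value (if any) uses the whole budget
    return (a >> 4) << 4, (b >> 4) << 4  # keep only the 4 most significant bits


def quantize_activations(values):
    """Apply quantize_pair to consecutive pairs.

    Assumption: an unpaired trailing value is passed through unchanged.
    """
    out = []
    for i in range(0, len(values) - 1, 2):
        out.extend(quantize_pair(values[i], values[i + 1]))
    if len(values) % 2:
        out.append(values[-1])
    return out
```

In this sketch a pair such as `(0, 200)` passes through untouched, while `(37, 200)` loses its low nibbles. The sparser the activations, the more pairs avoid the 4-bit truncation.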
no code implementations • 12 Oct 2020 • Gil Shomron, Uri Weiser
However, the method of accommodating more than one thread in a shared MAC unit may contribute noise to the computations, thereby changing the internal statistics of the model.
no code implementations • 17 Apr 2020 • Gil Shomron, Uri Weiser
Inspired by conventional CPU simultaneous multithreading (SMT), which increases the utilization of computing resources by sharing them across several threads, we propose non-blocking SMT (NB-SMT) designed for DNN accelerators.
1 code implementation • NeurIPS 2020 • Moran Shkolnik, Brian Chmiel, Ron Banner, Gil Shomron, Yury Nahshan, Alex Bronstein, Uri Weiser
Neural network quantization methods often involve simulating the quantization process during training, making the trained model highly dependent on the target bit-width and on the precise way quantization is performed.
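To make "simulating the quantization process" concrete, here is a minimal sketch of symmetric uniform fake quantization on a single float. It is a generic textbook scheme chosen for illustration, not necessarily the one used in the paper. The function name `fake_quantize` and the clipping range `x_max` are assumptions.

```python
def fake_quantize(x, num_bits, x_max):
    """Simulate symmetric uniform quantization and return a float.

    The value is clipped to [-x_max, x_max], snapped to the nearest of
    the signed integer levels available at num_bits, then mapped back
    to floating point.
    """
    if num_bits < 2 or x_max <= 0:
        raise ValueError("need num_bits >= 2 and x_max > 0")
    levels = 2 ** (num_bits - 1) - 1      # e.g. 127 for 8 bits
    scale = x_max / levels                # step size between levels
    clipped = max(-x_max, min(x_max, x))  # saturate out-of-range values
    return round(clipped / scale) * scale
```

The same input lands on different values depending on `num_bits` and `x_max`, so a model trained against one setting sees different rounding noise than it would under another.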
1 code implementation • ECCV 2020 • Gil Shomron, Ron Banner, Moran Shkolnik, Uri Weiser
Convolutional neural networks (CNNs) achieve state-of-the-art results for various tasks at the price of high computational demands.
no code implementations • 21 Jul 2018 • Gil Shomron, Uri Weiser
Convolutional neural networks (CNNs) are a widely used form of deep neural networks, achieving state-of-the-art results for different problems such as image classification, computer vision tasks, and speech recognition.