1 code implementation • 23 Aug 2023 • Hadi Esmaeilzadeh, Soroush Ghodrati, Andrew B. Kahng, Joon Kyung Kim, Sean Kinzer, Sayak Kundu, Rohan Mahapatra, Susmita Dey Manasi, Sachin Sapatnekar, Zhiang Wang, Ziqing Zeng
Parameterizable machine learning (ML) accelerators are the product of recent breakthroughs in ML.
no code implementations • 29 Jun 2023 • Hadi Esmaeilzadeh, Soroush Ghodrati, Andrew B. Kahng, Sean Kinzer, Susmita Dey Manasi, Sachin S. Sapatnekar, Zhiang Wang
The modeling effort of SimDIT comprehensively covers convolution and non-convolution operations of both CNN inference and training on a highly parameterizable hardware substrate.
no code implementations • 7 Apr 2022 • Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang
To best utilize this mathematical innovation, we devise a bit-serial architecture, dubbed LeOPArd, for transformer language models with a bit-level early-termination microarchitectural mechanism.
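As a rough illustration of bit-serial early termination (a minimal sketch of the general idea, not LeOPArd's microarchitecture; the function name and threshold handling are assumptions): a dot product is accumulated one bit-plane of the query at a time, and stops as soon as even the most optimistic remaining contribution cannot lift the score above a pruning threshold.

```python
import numpy as np

def bit_serial_score(q, k, threshold, total_bits=8):
    """Accumulate q . k one bit-plane of q at a time (MSB first) and
    terminate early once the score can no longer reach `threshold`.
    q: unsigned integer vector; k: float vector."""
    acc = 0.0
    k_abs_sum = np.abs(k).sum()
    for b in range(total_bits - 1, -1, -1):
        plane = (q >> b) & 1                        # b-th bit of every element
        acc += float((plane * k).sum()) * (1 << b)
        max_remaining = k_abs_sum * ((1 << b) - 1)  # bits b-1 .. 0 at best
        if acc + max_remaining < threshold:
            return acc, b                           # terminated after bit b
    return acc, 0
```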
no code implementations • 12 May 2021 • Hadi Esmaeilzadeh, Reza Vaezi
Recent advances in artificial intelligence (AI) have achieved human-scale speed and accuracy for classification tasks.
1 code implementation • 1 Jan 2021 • Ahmed T. Elthakeb, Prannoy Pilligundla, Tarek Elgindi, FatemehSadat Mireshghallah, Charles-Alban Deledalle, Hadi Esmaeilzadeh
We show how WaveQ balances compute efficiency and accuracy, and provides a heterogeneous bitwidth assignment for quantizing a large variety of deep networks (AlexNet, CIFAR-10, MobileNet, ResNet-18, ResNet-20, SVHN, and VGG-11) that virtually preserves accuracy.
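A minimal sketch of a sinusoidal quantization regularizer in the spirit of WaveQ (assuming weights normalized to [0, 1]; the function name, scaling, and `lam` are illustrative): the sin^2 penalty vanishes exactly on the quantization grid, so training is gently pulled toward quantization-friendly weights.

```python
import math
import torch

def sinusoidal_quant_penalty(weights, num_bits, lam=1e-3):
    """sin^2 penalty that is zero on the uniform quantization grid of
    2**num_bits levels and positive in between, nudging weights toward
    quantized values during training."""
    step = 1.0 / (2 ** num_bits - 1)   # grid spacing in [0, 1]
    return lam * torch.sin(math.pi * weights / step).pow(2).sum()

# Added to the task loss during training, e.g.:
# loss = task_loss + sinusoidal_quant_penalty(layer.weight, num_bits=4)
```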
no code implementations • 25 Apr 2020 • Fatemehsadat Mireshghallah, Mohammadkazem Taram, Praneeth Vepakomma, Abhishek Singh, Ramesh Raskar, Hadi Esmaeilzadeh
In this survey, we review the privacy concerns brought by deep learning, and the mitigating techniques introduced to tackle these issues.
no code implementations • 11 Apr 2020 • Soroush Ghodrati, Hardik Sharma, Cliff Young, Nam Sung Kim, Hadi Esmaeilzadeh
This paper explores a different design style, where each unit is only responsible for a slice of the bit-level operations to interleave and combine the benefits of bit-level parallelism with the abundant data-level parallelism in deep neural networks.
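The underlying arithmetic identity, decomposing a wide multiply into shifted narrow partial products, can be sketched as follows (an illustration of the identity only, not the paper's datapath):

```python
def bit_sliced_multiply(a, b, slice_bits=2, total_bits=8):
    """Compute a * b from 2-bit slices of each operand: every pair of
    slices yields a narrow partial product that is shifted into place
    and summed -- the identity bit-level-composable units exploit."""
    mask = (1 << slice_bits) - 1
    result = 0
    for i in range(0, total_bits, slice_bits):
        for j in range(0, total_bits, slice_bits):
            partial = ((a >> i) & mask) * ((b >> j) & mask)
            result += partial << (i + j)
    return result

assert bit_sliced_multiply(173, 94) == 173 * 94
```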
no code implementations • 26 Mar 2020 • Fatemehsadat Mireshghallah, Mohammadkazem Taram, Ali Jalali, Ahmed Taha Elthakeb, Dean Tullsen, Hadi Esmaeilzadeh
We formulate this problem as a gradient-based perturbation maximization method that discovers this subset in the input feature space with respect to the functionality of the prediction model used by the provider.
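One way to read this formulation (a hedged sketch; `lam`, the optimizer choice, and the L1 noise measure are assumptions, not the paper's exact objective): ascend on noise magnitude while a fidelity term keeps the model's prediction pinned to its original label.

```python
import torch
import torch.nn.functional as F

def maximize_perturbation(model, x, steps=200, lr=0.05, lam=1.0):
    """Find per-feature noise that is as large as possible while the
    model's prediction stays fixed -- exposing which input features
    the prediction functionally does not need."""
    with torch.no_grad():
        y = model(x).argmax(dim=-1)       # prediction to preserve
    noise = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([noise], lr=lr)
    for _ in range(steps):
        fidelity = F.cross_entropy(model(x + noise), y)
        loss = fidelity - lam * noise.abs().mean()  # big noise, same output
        opt.zero_grad()
        loss.backward()
        opt.step()
    return noise.detach()
```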
no code implementations • 4 Mar 2020 • Byung Hoon Ahn, Jinwon Lee, Jamie Menjay Lin, Hsin-Pai Cheng, Jilei Hou, Hadi Esmaeilzadeh
To address this standing issue, we present a memory-aware compiler, dubbed SERENITY, that uses dynamic programming to find an execution schedule with the optimal memory footprint.
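The core idea, dynamic programming over sets of already-executed operations while tracking the best achievable peak memory for each set, can be sketched on a small DAG (exhaustive and exponential here; SERENITY adds pruning that this sketch omits):

```python
from functools import lru_cache

def min_peak_memory(preds, succs, sizes):
    """DP over frozensets of executed ops: the minimal achievable peak
    of live tensor memory over all topological orders of a small DAG.
    An op's output stays live until all of its consumers have run."""
    all_ops = frozenset(preds)

    def live(done):
        return sum(sizes[v] for v in done
                   if any(s not in done for s in succs[v]))

    @lru_cache(maxsize=None)
    def best(done):
        ready = [v for v in all_ops - done
                 if all(p in done for p in preds[v])]
        if not ready:
            return 0
        return min(max(live(done | {v}), best(done | {v})) for v in ready)

    return best(frozenset())

# Example: a diamond graph a -> {b, c} -> d with 1-unit tensors
preds = {'a': [], 'b': ['a'], 'c': ['a'], 'd': ['b', 'c']}
succs = {'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': []}
print(min_peak_memory(preds, succs, {v: 1 for v in preds}))  # -> 2
```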
no code implementations • 29 Feb 2020 • Ahmed T. Elthakeb, Prannoy Pilligundla, FatemehSadat Mireshghallah, Tarek Elgindi, Charles-Alban Deledalle, Hadi Esmaeilzadeh
We show how SINAREQ balances compute efficiency and accuracy, and provides a heterogeneous bitwidth assignment for quantizing a large variety of deep networks (AlexNet, CIFAR-10, MobileNet, ResNet-18, ResNet-20, SVHN, and VGG-11) that virtually preserves accuracy.
1 code implementation • ICLR 2020 • Byung Hoon Ahn, Prannoy Pilligundla, Amir Yazdanbakhsh, Hadi Esmaeilzadeh
This solution, dubbed Chameleon, leverages reinforcement learning to converge in fewer steps, and develops an adaptive sampling algorithm that not only focuses the costly samples (real hardware measurements) on representative points but also uses domain-knowledge-inspired logic to improve the samples themselves.
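The adaptive-sampling half can be sketched with off-the-shelf clustering (a plausible reading, with scikit-learn standing in for the paper's own machinery; `k` and the representative choice are assumptions):

```python
import numpy as np
from sklearn.cluster import KMeans

def pick_representatives(candidates, k=8):
    """Cluster the configurations proposed by the search and spend the
    costly real-hardware measurements on one representative (the point
    nearest each centroid) per cluster."""
    km = KMeans(n_clusters=k, n_init=10).fit(candidates)
    reps = []
    for c in range(k):
        members = candidates[km.labels_ == c]
        dist = np.linalg.norm(members - km.cluster_centers_[c], axis=1)
        reps.append(members[dist.argmin()])
    return np.stack(reps)
```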
no code implementations • 27 Jun 2019 • Soroush Ghodrati, Hardik Sharma, Sean Kinzer, Amir Yazdanbakhsh, Kambiz Samadi, Nam Sung Kim, Doug Burger, Hadi Esmaeilzadeh
The low-power potential of mixed-signal design makes it an alluring option to accelerate Deep Neural Networks (DNNs).
no code implementations • ICML 2020 • Ahmed T. Elthakeb, Prannoy Pilligundla, Alex Cloninger, Hadi Esmaeilzadeh
The deep layers of modern neural networks extract a rather rich set of features as an input propagates through the network.
1 code implementation • 30 May 2019 • Byung Hoon Ahn, Prannoy Pilligundla, Hadi Esmaeilzadeh
Further experiments also confirm that our adaptive sampling can even improve AutoTVM's simulated annealing by 4.00x.
3 code implementations • 26 May 2019 • Fatemehsadat Mireshghallah, Mohammadkazem Taram, Prakash Ramrakhyani, Dean Tullsen, Hadi Esmaeilzadeh
To address this challenge, this paper devises Shredder, an end-to-end framework that, without altering the topology or the weights of a pre-trained network, learns additive noise distributions that significantly reduce the information content of communicated data while maintaining inference accuracy.
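A minimal sketch of the idea (the module and parameter names are illustrative): a learned additive-noise layer sits at the network cut point; only its noise parameters train, while the pre-trained weights stay frozen.

```python
import torch

class NoisyCut(torch.nn.Module):
    """Learned additive noise injected into the activations sent off
    device; trained to maximize noise (so less information leaks)
    subject to an accuracy term that keeps inference intact."""
    def __init__(self, shape):
        super().__init__()
        self.mu = torch.nn.Parameter(torch.zeros(shape))
        self.log_sigma = torch.nn.Parameter(torch.zeros(shape))

    def forward(self, activations):
        eps = torch.randn_like(activations)
        return activations + self.mu + self.log_sigma.exp() * eps
```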
no code implementations • 4 May 2019 • Ahmed T. Elthakeb, Prannoy Pilligundla, Hadi Esmaeilzadeh
To further mitigate this loss, we propose a novel sinusoidal regularization, called SinReQ, for deep quantized training.
no code implementations • 5 Nov 2018 • Ahmed T. Elthakeb, Prannoy Pilligundla, FatemehSadat Mireshghallah, Amir Yazdanbakhsh, Hadi Esmaeilzadeh
We show how ReLeQ can balance speed and quality, and provide an asymmetric general solution for quantizing a large variety of deep networks (AlexNet, CIFAR-10, LeNet, MobileNet-V1, ResNet-20, SVHN, and VGG-11) that virtually preserves accuracy (≤ 0.3% loss) while minimizing computation and storage cost.
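Applying such a heterogeneous assignment can be sketched as per-layer uniform symmetric quantization (the dict-based interface and default bitwidth are assumptions; ReLeQ's RL agent is what would produce `bitwidths`):

```python
import torch

def quantize_per_layer(state_dict, bitwidths, default_bits=8):
    """Quantize each weight tensor to its assigned bitwidth with
    uniform symmetric quantization (round to the nearest level)."""
    quantized = {}
    for name, w in state_dict.items():
        bits = bitwidths.get(name, default_bits)
        # clamp guards against an all-zero tensor (scale of zero)
        scale = (w.abs().max() / (2 ** (bits - 1) - 1)).clamp(min=1e-12)
        quantized[name] = torch.round(w / scale) * scale
    return quantized
```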
no code implementations • 10 May 2018 • Amir Yazdanbakhsh, Hajar Falahati, Philip J. Wolfe, Kambiz Samadi, Nam Sung Kim, Hadi Esmaeilzadeh
Even though there is a convolution stage in this operator, the inserted zeros lead to underutilization of the compute resources when a conventional convolution accelerator is employed.
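The zero-insertion view makes the underutilization concrete (a 1-D sketch; with stride s, roughly (s-1)/s of the multiplies in the equivalent ordinary convolution hit an inserted zero):

```python
import numpy as np

def transposed_conv1d(x, kernel, stride=2):
    """Transposed convolution computed the naive way: upsample by
    inserting stride-1 zeros between inputs, then run a full ordinary
    convolution -- most multiplies then involve a zero operand."""
    up = np.zeros(stride * (len(x) - 1) + 1)
    up[::stride] = x                   # zero-insertion upsampling
    flipped = kernel[::-1]             # flip for true convolution
    padded = np.pad(up, len(kernel) - 1)
    return np.array([padded[i:i + len(kernel)] @ flipped
                     for i in range(len(padded) - len(kernel) + 1)])
```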
no code implementations • 8 Jan 2018 • Divya Mahajan, Joon Kyung Kim, Jacob Sacks, Adel Ardalan, Arun Kumar, Hadi Esmaeilzadeh
The data revolution is fueled by advances in machine learning, databases, and hardware design.
no code implementations • 5 Dec 2017 • Hardik Sharma, Jongse Park, Naveen Suda, Liangzhen Lai, Benson Chau, Joon Kyung Kim, Vikas Chandra, Hadi Esmaeilzadeh
Compared to Stripes, BitFusion provides 2.6x speedup and 3.9x energy reduction at the 45 nm node when BitFusion's area and frequency are set to those of Stripes.