Search Results for author: Dibakar Gope

Found 15 papers, 5 papers with code

Jumping through Local Minima: Quantization in the Loss Landscape of Vision Transformers

1 code implementation ICCV 2023 Natalia Frumkin, Dibakar Gope, Diana Marculescu

Evol-Q improves the top-1 accuracy of a fully quantized ViT-Base by $10.30\%$, $0.78\%$, and $0.15\%$ for $3$-bit, $4$-bit, and $8$-bit weight quantization levels, respectively.

Quantization
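
For context, a minimal sketch of generic uniform b-bit weight quantization is shown below; it only illustrates what 3-, 4-, and 8-bit weight quantization levels mean, and is not the Evol-Q procedure (the quantizer and all names are illustrative assumptions).

```python
import numpy as np

def uniform_quantize(w: np.ndarray, bits: int):
    """Uniform affine quantization of a weight tensor to `bits` bits
    (generic sketch, not the Evol-Q method). Returns de-quantized weights
    and the (scale, zero_point) pair."""
    qmax = 2 ** bits - 1
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / qmax if w_max > w_min else 1.0
    zero_point = round(-w_min / scale)
    q = np.clip(np.round(w / scale) + zero_point, 0, qmax)
    return (q - zero_point) * scale, (scale, zero_point)

w = np.random.randn(64, 64).astype(np.float32)
for bits in (3, 4, 8):
    w_hat, _ = uniform_quantize(w, bits)
    print(bits, "bit  mean abs error:", np.abs(w - w_hat).mean())
```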

PerfSAGE: Generalized Inference Performance Predictor for Arbitrary Deep Learning Models on Edge Devices

no code implementations 26 Jan 2023 Yuji Chai, Devashree Tripathy, Chuteng Zhou, Dibakar Gope, Igor Fedorov, Ramon Matas, David Brooks, Gu-Yeon Wei, Paul Whatmough

The ability to accurately predict deep neural network (DNN) inference performance metrics, such as latency, power, and memory footprint, for an arbitrary DNN on a target hardware platform is essential to the design of DNN-based models.

CPT-V: A Contrastive Approach to Post-Training Quantization of Vision Transformers

no code implementations17 Nov 2022 Natalia Frumkin, Dibakar Gope, Diana Marculescu

Borrowing the idea of contrastive loss from self-supervised learning, we find a robust way to jointly minimize a loss function using just 1,000 calibration images.

Quantization Self-Supervised Learning
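
As a loose illustration of the borrowed contrastive-loss idea, the sketch below defines an InfoNCE-style calibration loss that matches a quantized block's features to the full-precision features of the same calibration image and contrasts them against the other images in the batch. This is not the exact CPT-V objective; all names are hypothetical.

```python
import torch
import torch.nn.functional as F

def contrastive_calibration_loss(fp_feats, q_feats, temperature=0.1):
    """Toy InfoNCE-style loss: each quantized feature should match the
    full-precision feature of the *same* calibration image (positive) and
    not those of the other images in the batch (negatives).
    Illustrative only -- not the exact CPT-V objective."""
    fp = F.normalize(fp_feats.flatten(1), dim=1)   # (B, D)
    q = F.normalize(q_feats.flatten(1), dim=1)     # (B, D)
    logits = q @ fp.t() / temperature              # (B, B) similarity matrix
    targets = torch.arange(q.size(0))              # positives on the diagonal
    return F.cross_entropy(logits, targets)

# Hypothetical usage during post-training calibration:
# loss = contrastive_calibration_loss(fp_block(x), quant_block(x))
# loss.backward()  # update only the quantization parameters
```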

Restructurable Activation Networks

1 code implementation 17 Aug 2022 Kartikeya Bhardwaj, James Ward, Caleb Tung, Dibakar Gope, Lingchuan Meng, Igor Fedorov, Alex Chalfin, Paul Whatmough, Danny Loh

We propose a new paradigm called Restructurable Activation Networks (RANs) that manipulates the amount of non-linearity in models to improve their hardware-awareness and efficiency.

Object Detection
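
As a rough illustration of "manipulating the amount of non-linearity" (not the RAN restructuring algorithm itself), the sketch below swaps selected ReLU activations for identity so that the adjacent linear layers become fusable at inference time; the selection heuristic and names are hypothetical.

```python
import torch.nn as nn

def reduce_nonlinearity(model: nn.Module, keep_every: int = 2) -> int:
    """Replace all but every `keep_every`-th ReLU with Identity.
    A toy stand-in for controlling how much non-linearity a network has;
    this is NOT the RAN restructuring procedure."""
    replaced, seen = 0, 0
    for _, module in model.named_modules():
        for child_name, child in list(module.named_children()):
            if isinstance(child, nn.ReLU):
                seen += 1
                if seen % keep_every != 0:
                    setattr(module, child_name, nn.Identity())
                    replaced += 1
    return replaced

# Hypothetical usage:
# n = reduce_nonlinearity(torchvision.models.resnet18(), keep_every=2)
# print(f"replaced {n} ReLUs with Identity")
```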

Collapsible Linear Blocks for Super-Efficient Super Resolution

3 code implementations 17 Mar 2021 Kartikeya Bhardwaj, Milos Milosavljevic, Liam O'Neil, Dibakar Gope, Ramon Matas, Alex Chalfin, Naveen Suda, Lingchuan Meng, Danny Loh

Our results highlight the challenges faced by super resolution on AI accelerators and demonstrate that SESR is significantly faster (e.g., 6x-8x higher FPS) than existing models on a mobile NPU.

4k 8k +1
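
The "collapsible linear blocks" in the title rest on a general reparameterization fact: a stack of convolutions with no non-linearity between them folds into a single convolution at inference time. Below is a minimal sketch of that folding for a k x k convolution followed by a 1x1 convolution; it illustrates the underlying math, not the exact SESR block, and the shapes are illustrative.

```python
import torch
import torch.nn.functional as F

def collapse_kxk_then_1x1(w_kxk, b_kxk, w_1x1, b_1x1):
    """Fold conv1x1(conv_kxk(x)) into a single k x k convolution.
    w_kxk: (M, C, k, k), w_1x1: (O, M, 1, 1). Works because both ops are
    linear, so their composition is again a single convolution."""
    w1 = w_1x1.squeeze(-1).squeeze(-1)                     # (O, M)
    w = torch.einsum('om,mcst->ocst', w1, w_kxk)           # collapsed kernel
    b = w1 @ b_kxk + b_1x1                                 # collapsed bias
    return w, b

# Sanity check: the collapsed conv matches the two-conv block.
x = torch.randn(1, 8, 32, 32)
w1, b1 = torch.randn(16, 8, 3, 3), torch.randn(16)
w2, b2 = torch.randn(4, 16, 1, 1), torch.randn(4)
y_expanded = F.conv2d(F.conv2d(x, w1, b1, padding=1), w2, b2)
wc, bc = collapse_kxk_then_1x1(w1, b1, w2, b2)
y_collapsed = F.conv2d(x, wc, bc, padding=1)
print(torch.allclose(y_expanded, y_collapsed, atol=1e-4))
```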

Rank and run-time aware compression of NLP Applications

no code implementations EMNLP (sustainlp) 2020 Urmish Thakker, Jesse Beu, Dibakar Gope, Ganesh Dasika, Matthew Mattina

We evaluate the impact of this technique on 5 NLP benchmarks across multiple tasks (Translation, Intent Detection, Language Modeling) and show that for similar accuracy values and compression factors, HMF can achieve more than 2.32x faster inference run-time than pruning and 16.77% better accuracy than LMF.

Intent Detection Language Modelling +1
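
For reference, the LMF baseline mentioned in the snippet is plain low-rank matrix factorization of a weight matrix. A minimal sketch of how it trades rank for parameters follows; this is the baseline, not the paper's hybrid HMF scheme, and the shapes are illustrative.

```python
import numpy as np

def low_rank_factorize(W: np.ndarray, rank: int):
    """Approximate W (m x n) as A @ B with A (m x rank) and B (rank x n)
    via truncated SVD -- the plain LMF baseline, not the paper's HMF."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]          # (m, rank), singular values folded in
    B = Vt[:rank, :]                    # (rank, n)
    return A, B

W = np.random.randn(512, 512)
A, B = low_rank_factorize(W, rank=32)
compression = W.size / (A.size + B.size)
print(f"compression: {compression:.1f}x, "
      f"rel. error: {np.linalg.norm(W - A @ B) / np.linalg.norm(W):.3f}")
```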

High Throughput Matrix-Matrix Multiplication between Asymmetric Bit-Width Operands

no code implementations 3 Aug 2020 Dibakar Gope, Jesse Beu, Matthew Mattina

While existing SIMD matrix multiplication instructions for symmetric bit-width operands can support mixed-precision operands by zero- or sign-extending the narrow operand to match the size of the wider operand, they cannot exploit the benefit of the narrow bit-width of one of the operands.

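
The snippet describes the existing-practice baseline: the narrow (e.g., 4-bit) operand is widened to 8 bits before a standard 8-bit dot product with wide accumulation. A small scalar NumPy sketch of that baseline follows; it stands in for SIMD instructions and does not implement the paper's asymmetric-operand technique.

```python
import numpy as np

def dot_u8_x_s4_via_sign_extension(a_u8: np.ndarray, w_s4: np.ndarray) -> int:
    """Baseline from the abstract: sign-extend 4-bit weights to 8 bits,
    then run an ordinary 8-bit dot product with 32-bit accumulation.
    The widened weights waste half of each 8-bit lane."""
    assert w_s4.min() >= -8 and w_s4.max() <= 7      # values fit in 4 bits
    w_s8 = w_s4.astype(np.int8)                      # sign-extend to 8 bits
    return int(np.dot(a_u8.astype(np.int32), w_s8.astype(np.int32)))

a = np.random.randint(0, 256, size=64).astype(np.uint8)   # 8-bit activations
w = np.random.randint(-8, 8, size=64)                     # 4-bit weights
print(dot_u8_x_s4_via_sign_extension(a, w))
```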

Ternary MobileNets via Per-Layer Hybrid Filter Banks

no code implementations 4 Nov 2019 Dibakar Gope, Jesse Beu, Urmish Thakker, Matthew Mattina

Using this proposed quantization method, we quantized a substantial portion of the weight filters of MobileNets to ternary values, resulting in 27.98% savings in energy and a 51.07% reduction in model size, while achieving comparable accuracy and no degradation in throughput on specialized hardware compared to the baseline full-precision MobileNets.

Quantization
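
A minimal sketch of threshold-based ternarization of a weight filter, in the style of standard ternary weight networks, is shown below; it only illustrates what quantizing weights to ternary values means and is not the paper's per-layer hybrid filter-bank scheme.

```python
import numpy as np

def ternarize(w: np.ndarray):
    """Map a weight filter to {-alpha, 0, +alpha} using a magnitude
    threshold (standard TWN-style heuristic, not the hybrid filter-bank
    method of the paper)."""
    delta = 0.7 * np.abs(w).mean()               # magnitude threshold
    mask = np.abs(w) > delta                     # which weights stay non-zero
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0
    return alpha * np.sign(w) * mask

w = np.random.randn(32, 3, 3, 3).astype(np.float32)   # one conv filter bank
w_t = ternarize(w)
print("distinct weight values:", np.unique(w_t).size)  # expect 3
```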

Pushing the limits of RNN Compression

no code implementations 4 Oct 2019 Urmish Thakker, Igor Fedorov, Jesse Beu, Dibakar Gope, Chu Zhou, Ganesh Dasika, Matthew Mattina

This paper introduces a method to compress RNNs for resource-constrained environments using the Kronecker product (KP).
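
A minimal sketch of the Kronecker-product idea follows: a large weight matrix is represented as W = A kron B, and the matrix-vector product is computed without ever materializing W via the identity (A kron B) vec(X) = vec(B X A^T). The shapes and names below are illustrative, not taken from the paper.

```python
import numpy as np

# W (m*p x n*q) is represented by two small factors A (m x n) and B (p x q),
# cutting parameters from m*p*n*q down to m*n + p*q.
m, n, p, q = 16, 16, 32, 32
A = np.random.randn(m, n)
B = np.random.randn(p, q)
x = np.random.randn(n * q)

# Reference: materialize W = A kron B and multiply.
y_ref = np.kron(A, B) @ x

# Compressed compute: reshape x and use the Kronecker mat-vec identity.
X = x.reshape(n, q)                      # X[j, l] = x[j*q + l]
y = (B @ X.T @ A.T).T.reshape(-1)        # equals np.kron(A, B) @ x

print(np.allclose(y_ref, y))
```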

Run-Time Efficient RNN Compression for Inference on Edge Devices

no code implementations 12 Jun 2019 Urmish Thakker, Jesse Beu, Dibakar Gope, Ganesh Dasika, Matthew Mattina

Recurrent neural networks can be large and compute-intensive, yet many applications that benefit from RNNs run on small devices with very limited compute and storage capabilities while still having run-time constraints.

Edge-computing

Compressing RNNs for IoT devices by 15-38x using Kronecker Products

no code implementations 7 Jun 2019 Urmish Thakker, Jesse Beu, Dibakar Gope, Chu Zhou, Igor Fedorov, Ganesh Dasika, Matthew Mattina

Recurrent Neural Networks (RNNs) can be difficult to deploy on resource-constrained devices due to their size. As a result, there is a need for compression techniques that can significantly compress RNNs without negatively impacting task accuracy.
