Search Results for author: Dibakar Gope

Found 11 papers, 2 papers with code

Collapsible Linear Blocks for Super-Efficient Super Resolution

1 code implementation17 Mar 2021 Kartikeya Bhardwaj, Milos Milosavljevic, Liam O'Neil, Dibakar Gope, Ramon Matas, Alex Chalfin, Naveen Suda, Lingchuan Meng, Danny Loh

Our results highlight the challenges faced by super resolution on AI accelerators and demonstrate that SESR is significantly faster (e. g., 6x-8x higher FPS) than existing models on mobile-NPU.

Image Super-Resolution

Rank and run-time aware compression of NLP Applications

no code implementations EMNLP (sustainlp) 2020 Urmish Thakker, Jesse Beu, Dibakar Gope, Ganesh Dasika, Matthew Mattina

We evaluate the impact of this technique on 5 NLP benchmarks across multiple tasks (Translation, Intent Detection, Language Modeling) and show that for similar accuracy values and compression factors, HMF can achieve more than 2. 32x faster inference run-time than pruning and 16. 77% better accuracy than LMF.

Intent Detection Language Modelling +1

High Throughput Matrix-Matrix Multiplication between Asymmetric Bit-Width Operands

no code implementations3 Aug 2020 Dibakar Gope, Jesse Beu, Matthew Mattina

While existing SIMD matrix multiplication instructions for symmetric bit-width operands can support operands of mixed precision by zero- or sign-extending the narrow operand to match the size of the other operands, they cannot exploit the benefit of narrow bit-width of one of the operands.

Ternary MobileNets via Per-Layer Hybrid Filter Banks

no code implementations4 Nov 2019 Dibakar Gope, Jesse Beu, Urmish Thakker, Matthew Mattina

Using this proposed quantization method, we quantized a substantial portion of weight filters of MobileNets to ternary values resulting in 27. 98% savings in energy, and a 51. 07% reduction in the model size, while achieving comparable accuracy and no degradation in throughput on specialized hardware in comparison to the baseline full-precision MobileNets.

Quantization

Pushing the limits of RNN Compression

no code implementations4 Oct 2019 Urmish Thakker, Igor Fedorov, Jesse Beu, Dibakar Gope, Chu Zhou, Ganesh Dasika, Matthew Mattina

This paper introduces a method to compress RNNs for resource constrained environments using Kronecker product (KP).

Run-Time Efficient RNN Compression for Inference on Edge Devices

no code implementations12 Jun 2019 Urmish Thakker, Jesse Beu, Dibakar Gope, Ganesh Dasika, Matthew Mattina

Recurrent neural networks can be large and compute-intensive, yet many applications that benefit from RNNs run on small devices with very limited compute and storage capabilities while still having run-time constraints.

Edge-computing

Compressing RNNs for IoT devices by 15-38x using Kronecker Products

no code implementations7 Jun 2019 Urmish Thakker, Jesse Beu, Dibakar Gope, Chu Zhou, Igor Fedorov, Ganesh Dasika, Matthew Mattina

Recurrent Neural Networks (RNN) can be difficult to deploy on resource constrained devices due to their size. As a result, there is a need for compression techniques that can significantly compress RNNs without negatively impacting task accuracy.

Cannot find the paper you are looking for? You can Submit a new open access paper.