Search Results for author: Ganesh Dasika

Found 6 papers, 0 papers with code

Rank and run-time aware compression of NLP Applications

no code implementations • EMNLP (sustainlp) 2020 • Urmish Thakker, Jesse Beu, Dibakar Gope, Ganesh Dasika, Matthew Mattina

We evaluate the impact of this technique on 5 NLP benchmarks across multiple tasks (Translation, Intent Detection, Language Modeling) and show that for similar accuracy values and compression factors, HMF can achieve more than 2. 32x faster inference run-time than pruning and 16. 77% better accuracy than LMF.

Intent Detection Language Modelling +1

Paper
Add Code

Pushing the limits of RNN Compression

no code implementations • 4 Oct 2019 • Urmish Thakker, Igor Fedorov, Jesse Beu, Dibakar Gope, Chu Zhou, Ganesh Dasika, Matthew Mattina

This paper introduces a method to compress RNNs for resource constrained environments using Kronecker product (KP).

Paper
Add Code

Run-Time Efficient RNN Compression for Inference on Edge Devices

no code implementations • 12 Jun 2019 • Urmish Thakker, Jesse Beu, Dibakar Gope, Ganesh Dasika, Matthew Mattina

Recurrent neural networks can be large and compute-intensive, yet many applications that benefit from RNNs run on small devices with very limited compute and storage capabilities while still having run-time constraints.

Edge-computing

Paper
Add Code

Compressing RNNs for IoT devices by 15-38x using Kronecker Products

no code implementations • 7 Jun 2019 • Urmish Thakker, Jesse Beu, Dibakar Gope, Chu Zhou, Igor Fedorov, Ganesh Dasika, Matthew Mattina

Recurrent Neural Networks (RNN) can be difficult to deploy on resource constrained devices due to their size. As a result, there is a need for compression techniques that can significantly compress RNNs without negatively impacting task accuracy.

Paper
Add Code

Ternary Hybrid Neural-Tree Networks for Highly Constrained IoT Applications

no code implementations • 4 Mar 2019 • Dibakar Gope, Ganesh Dasika, Matthew Mattina

Machine learning-based applications are increasingly prevalent in IoT devices.

Keyword Spotting Quantization

Paper
Add Code

Efficient Winograd or Cook-Toom Convolution Kernel Implementation on Widely Used Mobile CPUs

no code implementations • 4 Mar 2019 • Partha Maji, Andrew Mundy, Ganesh Dasika, Jesse Beu, Matthew Mattina, Robert Mullins

The Winograd or Cook-Toom class of algorithms help to reduce the overall compute complexity of many modern deep convolutional neural networks (CNNs).

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.