Search Results for author: Urmish Thakker

In this position paper, we present the current landscape of TinyML and discuss the challenges and direction towards developing a fair and useful hardware benchmark for TinyML workloads.

Benchmarking Position

MLPerf Tiny Benchmark

2 code implementations • 14 Jun 2021 • Colby Banbury, Vijay Janapa Reddi, Peter Torelli, Jeremy Holleman, Nat Jeffries, Csaba Kiraly, Pietro Montino, David Kanter, Sebastian Ahmed, Danilo Pau, Urmish Thakker, Antonio Torrini, Peter Warden, Jay Cordaro, Giuseppe Di Guglielmo, Javier Duarte, Stephen Gibellini, Videet Parekh, Honson Tran, Nhan Tran, Niu Wenxu, Xu Xuesong

Advancements in ultra-low-power tiny machine learning (TinyML) systems promise to unlock an entirely new class of smart applications.

Anomaly Detection BIG-bench Machine Learning +2

MicroNets: Neural Network Architectures for Deploying TinyML Applications on Commodity Microcontrollers

1 code implementation • 21 Oct 2020 • Colby Banbury, Chuteng Zhou, Igor Fedorov, Ramon Matas Navarro, Urmish Thakker, Dibakar Gope, Vijay Janapa Reddi, Matthew Mattina, Paul N. Whatmough

To address this challenge, neural architecture search (NAS) promises to help design accurate ML models that meet the tight MCU memory, latency and energy constraints.

Ranked #1 on Keyword Spotting on Google Speech Commands V2 12

Anomaly Detection Keyword Spotting +1

Compressing RNNs for IoT devices by 15-38x using Kronecker Products

no code implementations • 7 Jun 2019 • Urmish Thakker, Jesse Beu, Dibakar Gope, Chu Zhou, Igor Fedorov, Ganesh Dasika, Matthew Mattina

Recurrent Neural Networks (RNNs) can be difficult to deploy on resource-constrained devices due to their size. As a result, there is a need for compression techniques that can significantly compress RNNs without negatively impacting task accuracy.

Run-Time Efficient RNN Compression for Inference on Edge Devices

no code implementations • 12 Jun 2019 • Urmish Thakker, Jesse Beu, Dibakar Gope, Ganesh Dasika, Matthew Mattina

Recurrent neural networks can be large and compute-intensive, yet many applications that benefit from RNNs run on small devices with very limited compute and storage capabilities while still having run-time constraints.

Edge-computing

A Static Analysis-based Cross-Architecture Performance Prediction Using Machine Learning

no code implementations • 18 Jun 2019 • Newsha Ardalani, Urmish Thakker, Aws Albarghouthi, Karu Sankaralingam

Porting code from CPU to GPU is costly and time-consuming; unless much time is invested in development and optimization, it is not obvious, a priori, how much speed-up is achievable or how much room is left for improvement.

BIG-bench Machine Learning Binary Classification

Pushing the limits of RNN Compression

no code implementations • 4 Oct 2019 • Urmish Thakker, Igor Fedorov, Jesse Beu, Dibakar Gope, Chu Zhou, Ganesh Dasika, Matthew Mattina

This paper introduces a method to compress RNNs for resource-constrained environments using the Kronecker product (KP).

Ternary MobileNets via Per-Layer Hybrid Filter Banks

no code implementations • 4 Nov 2019 • Dibakar Gope, Jesse Beu, Urmish Thakker, Matthew Mattina

Using this proposed quantization method, we quantized a substantial portion of weight filters of MobileNets to ternary values, resulting in 27.98% savings in energy and a 51.07% reduction in the model size, while achieving comparable accuracy and no degradation in throughput on specialized hardware in comparison to the baseline full-precision MobileNets.

Quantization

Compressing Language Models using Doped Kronecker Products

no code implementations • 24 Jan 2020 • Urmish Thakker, Paul N. Whatmough, Zhi-Gang Liu, Matthew Mattina, Jesse Beu

Kronecker Products (KP) have been used to compress IoT RNN applications by factors of 15-38x, achieving better results than traditional compression methods.

Language Modelling Large Language Model

Federated Learning for Resource-Constrained IoT Devices: Panoramas and State-of-the-art

no code implementations • 25 Feb 2020 • Ahmed Imteaj, Urmish Thakker, Shiqiang Wang, Jian Li, M. Hadi Amini

Nowadays, devices are equipped with advanced sensors with higher processing/computing capabilities.

Federated Learning

Rank and run-time aware compression of NLP Applications

no code implementations • EMNLP (sustainlp) 2020 • Urmish Thakker, Jesse Beu, Dibakar Gope, Ganesh Dasika, Matthew Mattina

We evaluate the impact of this technique on 5 NLP benchmarks across multiple tasks (Translation, Intent Detection, Language Modeling) and show that for similar accuracy values and compression factors, HMF can achieve more than 2.32x faster inference run-time than pruning and 16.77% better accuracy than LMF.

Intent Detection Language Modelling +1

Doping: A technique for efficient compression of LSTM models using sparse structured additive matrices

no code implementations • 14 Feb 2021 • Urmish Thakker, Paul N. Whatmough, ZhiGang Liu, Matthew Mattina, Jesse Beu

Additionally, results with doped Kronecker product matrices demonstrate state-of-the-art accuracy at large compression factors (10-25x) across 4 natural language processing applications with minor loss in accuracy.

Training Large Language Models Efficiently with Sparsity and Dataflow

no code implementations • 11 Apr 2023 • Venkat Srinivasan, Darshan Gandhi, Urmish Thakker, Raghu Prabhakar

We show that we can successfully train GPT 13B to the same quality as the dense GPT 13B model, while achieving an end-to-end speedup of 4.5x over a dense A100 baseline.

Language Modelling Large Language Model +2

Efficiently Adapting Pretrained Language Models To New Languages

no code implementations • 9 Nov 2023 • Zoltan Csaki, Pian Pawakapan, Urmish Thakker, Qiantong Xu

Recent large language models (LLMs) exhibit sub-optimal performance on low-resource languages, as the training data of these models is usually dominated by English and other high-resource languages.

Cross-Lingual Transfer

SambaLingo: Teaching Large Language Models New Languages

no code implementations • 8 Apr 2024 • Zoltan Csaki, Bo Li, Jonathan Li, Qiantong Xu, Pian Pawakapan, Leon Zhang, Yun Du, Hengyu Zhao, Changran Hu, Urmish Thakker

In this paper, we present a comprehensive investigation into the adaptation of LLMs to new languages.
