Search Results for author: Paul N. Whatmough

Found 17 papers, 5 papers with code

FixyNN: Efficient Hardware for Mobile Computer Vision via Transfer Learning

1 code implementation 27 Feb 2019 Paul N. Whatmough, Chuteng Zhou, Patrick Hansen, Shreyas Kolala Venkataramanaiah, Jae-sun Seo, Matthew Mattina

Over a suite of six datasets, we trained models via transfer learning with an accuracy loss of $<1\%$, resulting in up to 11.2 TOPS/W, nearly $2\times$ more efficient than a conventional programmable CNN accelerator of the same area.

General Classification, Image Classification, +1
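
For readers unfamiliar with the setup the FixyNN entry above describes, a shared, fixed feature-extractor "frontend" (which FixyNN hardens into fixed-weight hardware) is combined with a small trainable per-task "backend" via transfer learning. Below is a minimal software-side sketch of that split; the backbone, split point, and 10-class target task are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch: frozen shared frontend + trainable per-task backend.
import torch
import torch.nn as nn
from torchvision import models

backbone = models.mobilenet_v2(weights="IMAGENET1K_V1")

# Frozen frontend: the first few inverted-residual blocks (illustrative split).
frontend = nn.Sequential(*list(backbone.features.children())[:7])
for p in frontend.parameters():
    p.requires_grad = False  # fixed weights, analogous to the fixed datapath

# Trainable backend: remaining blocks plus a new task-specific classifier.
backend = nn.Sequential(
    *list(backbone.features.children())[7:],
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(backbone.last_channel, 10),  # e.g. a 10-class target dataset
)

optimizer = torch.optim.SGD(backend.parameters(), lr=1e-2, momentum=0.9)

def forward(x):
    with torch.no_grad():          # the frontend never updates
        feats = frontend(x)
    return backend(feats)
```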

SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers

no code implementations NeurIPS 2019 Igor Fedorov, Ryan P. Adams, Matthew Mattina, Paul N. Whatmough

The vast majority of processors in the world are actually microcontroller units (MCUs), which find widespread use performing simple control tasks in applications ranging from automobiles to medical devices and office equipment.

BIG-bench Machine Learning, Neural Architecture Search

ISP4ML: Understanding the Role of Image Signal Processing in Efficient Deep Learning Vision Systems

no code implementations 18 Nov 2019 Patrick Hansen, Alexey Vilkin, Yury Khrustalev, James Imber, David Hanwell, Matthew Mattina, Paul N. Whatmough

In this work, we investigate the efficacy of the ISP in CNN classification tasks, and outline the system-level trade-offs between prediction accuracy and computational cost.

Noisy Machines: Understanding Noisy Neural Networks and Enhancing Robustness to Analog Hardware Errors Using Distillation

no code implementations 14 Jan 2020 Chuteng Zhou, Prad Kadambi, Matthew Mattina, Paul N. Whatmough

Hence, for successful deployment on analog accelerators, it is essential to be able to train deep neural networks to be robust to random continuous noise in the network weights, which is a somewhat new challenge in machine learning.

Knowledge Distillation
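
The Noisy Machines entry combines two ideas: injecting random continuous noise into the weights during training to emulate analog hardware errors, and distilling from a clean teacher to recover lost accuracy. The sketch below shows that training-step structure under illustrative assumptions (noise scale, temperature, and loss weighting are placeholders, not the paper's reported settings).

```python
# Hedged sketch: noise-injection training combined with knowledge distillation.
import torch
import torch.nn.functional as F

def noisy_distillation_step(student, teacher, x, targets, optimizer,
                            rel_sigma=0.05, T=4.0, alpha=0.9):
    # 1) Perturb the student's weights with Gaussian noise, emulating random
    #    continuous weight errors on an analog accelerator.
    saved = {}
    with torch.no_grad():
        for name, p in student.named_parameters():
            if p.dim() > 1:                       # weight matrices / kernels
                saved[name] = p.detach().clone()
                p.add_(torch.randn_like(p) * rel_sigma * p.abs().mean())

    # 2) Distillation loss: match a clean teacher's soft targets plus the
    #    ordinary cross-entropy on the labels.
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)
    soft = F.kl_div(F.log_softmax(s_logits / T, dim=1),
                    F.softmax(t_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    loss = alpha * soft + (1 - alpha) * F.cross_entropy(s_logits, targets)

    # 3) Backpropagate through the noisy weights, then restore the clean
    #    weights and apply the update to them.
    optimizer.zero_grad()
    loss.backward()
    with torch.no_grad():
        for name, p in student.named_parameters():
            if name in saved:
                p.copy_(saved[name])
    optimizer.step()
    return loss.item()
```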

Compressing Language Models using Doped Kronecker Products

no code implementations 24 Jan 2020 Urmish Thakker, Paul N. Whatmough, Zhi-Gang Liu, Matthew Mattina, Jesse Beu

Kronecker Products (KP) have been used to compress IoT RNN applications by 15-38x compression factors, achieving better results than traditional compression methods.

Language Modelling, Large Language Model
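
To make the compression factors quoted above concrete: a Kronecker-product factorization replaces a large weight matrix W with W = kron(A, B), so only the two small factors are stored and trained. A minimal sketch follows; the 256x256 target shape is an illustrative assumption.

```python
# Hedged sketch of Kronecker-product (KP) weight compression.
import torch

# Target dense shape: 256 x 256 (e.g. one gate matrix of an RNN).
A = torch.randn(16, 16, requires_grad=True)
B = torch.randn(16, 16, requires_grad=True)

W = torch.kron(A, B)             # 256 x 256, but only 2 * 16 * 16 = 512 params
dense_params = W.numel()         # 65,536
kp_params = A.numel() + B.numel()
print(f"compression factor: {dense_params / kp_params:.0f}x")  # 128x

# Use it like an ordinary weight matrix; gradients flow back to A and B.
x = torch.randn(8, 256)
y = x @ torch.kron(A, B).T
```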

Searching for Winograd-aware Quantized Networks

1 code implementation 25 Feb 2020 Javier Fernandez-Marques, Paul N. Whatmough, Andrew Mundy, Matthew Mattina

Lightweight architectural designs of Convolutional Neural Networks (CNNs) together with quantization have paved the way for the deployment of demanding computer vision applications on mobile devices.

Neural Architecture Search, Quantization

Systolic Tensor Array: An Efficient Structured-Sparse GEMM Accelerator for Mobile CNN Inference

no code implementations 16 May 2020 Zhi-Gang Liu, Paul N. Whatmough, Matthew Mattina

Convolutional neural network (CNN) inference on mobile devices demands efficient hardware acceleration of low-precision (INT8) general matrix multiplication (GEMM).
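
The INT8 GEMM workload named in the entry above follows a standard arithmetic pattern: 8-bit operands, 32-bit accumulation, then requantization back to 8 bits. The NumPy sketch below illustrates that pattern only; shapes and the output scale are illustrative assumptions, not details of the accelerator.

```python
# Hedged sketch of an INT8 GEMM with int32 accumulation and requantization.
import numpy as np

M, K, N = 4, 64, 8
a = np.random.randint(-128, 128, size=(M, K), dtype=np.int8)
b = np.random.randint(-128, 128, size=(K, N), dtype=np.int8)

# Accumulate in int32 to avoid overflow (K * 127 * 127 fits easily in 32 bits).
acc = a.astype(np.int32) @ b.astype(np.int32)

# Requantize: scale the int32 accumulator back into the int8 range.
scale = 2.0 ** -10   # illustrative fixed-point output scale
out = np.clip(np.round(acc * scale), -128, 127).astype(np.int8)
```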

TinyLSTMs: Efficient Neural Speech Enhancement for Hearing Aids

1 code implementation 20 May 2020 Igor Fedorov, Marko Stamenovic, Carl Jensen, Li-Chia Yang, Ari Mandell, Yiming Gan, Matthew Mattina, Paul N. Whatmough

Modern speech enhancement algorithms achieve remarkable noise suppression by means of large recurrent neural networks (RNNs).

Model Compression, Quantization, +1

Sparse Systolic Tensor Array for Efficient CNN Hardware Acceleration

no code implementations 4 Sep 2020 Zhi-Gang Liu, Paul N. Whatmough, Matthew Mattina

In this paper, we address a key architectural challenge with structural sparsity: how to provide support for a range of sparsity levels while maintaining high utilization of the hardware.

Doping: A technique for efficient compression of LSTM models using sparse structured additive matrices

no code implementations 14 Feb 2021 Urmish Thakker, Paul N. Whatmough, Zhi-Gang Liu, Matthew Mattina, Jesse Beu

Additionally, results with doped Kronecker product matrices demonstrate state-of-the-art accuracy at large compression factors (10-25x) across 4 natural language processing applications, with minor loss in accuracy.
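
"Doping" here augments the Kronecker-product factorization with a very sparse additive matrix, i.e. W = kron(A, B) + S, so a handful of free parameters can correct the most damaging approximation errors. A minimal sketch under illustrative assumptions (shapes, 1% density, dense storage of S for simplicity):

```python
# Hedged sketch of a doped Kronecker-product weight: W = kron(A, B) + S.
import torch

A = torch.randn(16, 16, requires_grad=True)
B = torch.randn(16, 16, requires_grad=True)

# Sparse "dope" matrix: fix a random 1% support, train only those values.
# (Stored densely here for simplicity; a real implementation would use a
# sparse format so only the ~1% surviving entries are kept.)
density = 0.01
mask = (torch.rand(256, 256) < density).float()
S_values = torch.randn(256, 256, requires_grad=True)

def doped_weight():
    return torch.kron(A, B) + S_values * mask  # masked-out entries stay zero

x = torch.randn(8, 256)
y = x @ doped_weight().T
```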

Fast and Accurate: Video Enhancement using Sparse Depth

no code implementations 15 Mar 2021 Yu Feng, Patrick Hansen, Paul N. Whatmough, Guoyu Lu, Yuhao Zhu

This paper presents a general framework to build fast and accurate algorithms for video enhancement tasks such as super-resolution, deblurring, and denoising.

Deblurring, Denoising, +4

S2TA: Exploiting Structured Sparsity for Energy-Efficient Mobile CNN Acceleration

no code implementations 16 Jul 2021 Zhi-Gang Liu, Paul N. Whatmough, Yuhao Zhu, Matthew Mattina

We propose to exploit structured sparsity, more specifically, Density Bound Block (DBB) sparsity for both weights and activations.
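
Density Bound Block (DBB) sparsity, as named in the S2TA entry above, constrains every fixed-size block of a tensor to at most N non-zero values, giving the hardware a guaranteed worst-case density per block. The pruning sketch below illustrates the constraint only; the block size, bound, and blocking along flattened rows are illustrative assumptions.

```python
# Hedged sketch of Density Bound Block (DBB) pruning: keep at most
# `max_nonzeros` largest-magnitude entries in every block of `block` values.
import torch

def dbb_prune(w: torch.Tensor, block: int = 8, max_nonzeros: int = 2):
    flat = w.reshape(-1, block)                       # group into blocks
    idx = flat.abs().topk(max_nonzeros, dim=1).indices
    mask = torch.zeros_like(flat)
    mask.scatter_(1, idx, 1.0)                        # 1s mark the survivors
    return (flat * mask).reshape(w.shape)

w = torch.randn(64, 64)
w_dbb = dbb_prune(w)          # at most 2 of every 8 consecutive values survive
print((w_dbb != 0).float().mean())   # ~0.25 density bound
```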

Federated Learning Based on Dynamic Regularization

3 code implementations ICLR 2021 Durmus Alp Emre Acar, Yue Zhao, Ramon Matas Navarro, Matthew Mattina, Paul N. Whatmough, Venkatesh Saligrama

We propose a novel federated learning method for distributively training neural network models, where the server orchestrates cooperation between a subset of randomly chosen devices in each round.

Federated Learning
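
The round structure described in this entry (server samples a random subset of devices, each trains locally, the server aggregates) is sketched below. The paper's actual contribution, a dynamic regularization term added to each device's local objective, is indicated only by a placeholder comment; the client count, sampling fraction, local steps, and the `sample_batch` client API are illustrative assumptions.

```python
# Hedged sketch of one federated round with a randomly chosen device subset.
import copy
import random
import torch

def federated_round(global_model, clients, frac=0.1, local_steps=5, lr=0.1):
    selected = random.sample(clients, max(1, int(frac * len(clients))))
    client_states = []
    for client in selected:
        local = copy.deepcopy(global_model)
        opt = torch.optim.SGD(local.parameters(), lr=lr)
        for _ in range(local_steps):
            x, y = client.sample_batch()          # assumed client API
            loss = torch.nn.functional.cross_entropy(local(x), y)
            # FedDyn would add its dynamic regularization term to `loss` here.
            opt.zero_grad()
            loss.backward()
            opt.step()
        client_states.append(local.state_dict())

    # Server aggregation: average the selected devices' parameters.
    avg = {k: torch.stack([s[k].float() for s in client_states]).mean(0)
           for k in client_states[0]}
    global_model.load_state_dict(avg)
    return global_model
```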

AnalogNets: ML-HW Co-Design of Noise-robust TinyML Models and Always-On Analog Compute-in-Memory Accelerator

no code implementations 10 Nov 2021 Chuteng Zhou, Fernando Garcia Redondo, Julian Büchel, Irem Boybat, Xavier Timoneda Comas, S. R. Nandakumar, Shidhartha Das, Abu Sebastian, Manuel Le Gallo, Paul N. Whatmough

We also describe AON-CiM, a programmable, minimal-area phase-change memory (PCM) analog CiM accelerator, with a novel layer-serial approach to remove the cost of complex interconnects associated with a fully-pipelined design.

Keyword Spotting
