no code implementations • 10 Nov 2021 • Chuteng Zhou, Fernando Garcia Redondo, Julian Büchel, Irem Boybat, Xavier Timoneda Comas, S. R. Nandakumar, Shidhartha Das, Abu Sebastian, Manuel Le Gallo, Paul N. Whatmough
We also describe AON-CiM, a programmable, minimal-area phase-change memory (PCM) analog CiM accelerator, with a novel layer-serial approach to remove the cost of complex interconnects associated with a fully-pipelined design.
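To make the layer-serial idea concrete, here is a minimal Python sketch (all names are hypothetical stand-ins, not the AON-CiM interface): layers execute one at a time on a single shared analog array, so each layer's output feeds straight into the next and no fully-pipelined inter-layer interconnect is needed.

```python
import numpy as np

def analog_matvec(weights, x, noise_std=0.01):
    """Hypothetical stand-in for a PCM crossbar matrix-vector
    product, modeled as an ideal matvec plus read noise."""
    noisy_w = weights + np.random.normal(0.0, noise_std, weights.shape)
    return noisy_w @ x

def run_layer_serial(layers, x):
    """Execute the network one layer at a time on a single shared
    CiM array: each layer's output is fed back as the next layer's
    input, avoiding the interconnect of a fully-pipelined design."""
    for w in layers:
        x = np.maximum(analog_matvec(w, x), 0.0)  # analog matvec + ReLU
    return x

rng = np.random.default_rng(0)
layers = [rng.standard_normal((16, 32)), rng.standard_normal((8, 16))]
y = run_layer_serial(layers, rng.standard_normal(32))  # shape (8,)
```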
3 code implementations • ICLR 2021 • Durmus Alp Emre Acar, Yue Zhao, Ramon Matas Navarro, Matthew Mattina, Paul N. Whatmough, Venkatesh Saligrama
We propose a novel federated learning method for distributively training neural network models, where the server orchestrates cooperation between a subset of randomly chosen devices in each round.
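A minimal PyTorch sketch of the orchestration pattern described above; it shows only the generic sample-devices-train-locally-average loop (FedAvg-style), not the paper's specific method, and all names are illustrative.

```python
import copy
import random
import torch
import torch.nn.functional as F

def federated_round(server_model, device_loaders, frac=0.1,
                    local_steps=5, lr=0.01):
    """One communication round: the server samples a random subset
    of devices, each trains locally from the current server weights,
    and the server averages the returned weights. Generic sketch
    only; assumes floating-point parameters."""
    chosen = random.sample(device_loaders,
                           max(1, int(frac * len(device_loaders))))
    states = []
    for loader in chosen:
        model = copy.deepcopy(server_model)
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        for _, (x, y) in zip(range(local_steps), loader):
            opt.zero_grad()
            F.cross_entropy(model(x), y).backward()
            opt.step()
        states.append(model.state_dict())
    # Average the locally trained weights into the server model.
    avg = {k: torch.stack([s[k] for s in states]).mean(dim=0)
           for k in states[0]}
    server_model.load_state_dict(avg)
    return server_model
```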
no code implementations • 16 Jul 2021 • Zhi-Gang Liu, Paul N. Whatmough, Yuhao Zhu, Matthew Mattina
We propose to exploit structured sparsity, more specifically, Density Bound Block (DBB) sparsity for both weights and activations.
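A short sketch of what DBB sparsity means in practice (block size and density bound below are illustrative): within every fixed-size block, at most a bounded number of entries may be non-zero, giving the hardware a guaranteed worst case per block.

```python
import numpy as np

def dbb_prune(x, block=8, max_nnz=2):
    """Enforce Density Bound Block (DBB) sparsity: within every
    block of `block` elements, keep at most `max_nnz` of the
    largest-magnitude entries and zero the rest. Sketch only."""
    flat = x.reshape(-1, block).copy()
    # Indices of the (block - max_nnz) smallest-magnitude entries
    # in each block, which get zeroed out.
    drop = np.argsort(np.abs(flat), axis=1)[:, :block - max_nnz]
    np.put_along_axis(flat, drop, 0.0, axis=1)
    return flat.reshape(x.shape)

w = np.random.randn(4, 16)
w_dbb = dbb_prune(w)  # every block of 8 now has <= 2 non-zeros
assert (np.count_nonzero(w_dbb.reshape(-1, 8), axis=1) <= 2).all()
```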
no code implementations • 15 Mar 2021 • Yu Feng, Patrick Hansen, Paul N. Whatmough, Guoyu Lu, Yuhao Zhu
This paper presents a general framework to build fast and accurate algorithms for video enhancement tasks such as super-resolution, deblurring, and denoising.
no code implementations • 14 Feb 2021 • Urmish Thakker, Paul N. Whatmough, ZhiGang Liu, Matthew Mattina, Jesse Beu
Additionally, results with doped Kronecker product matrices demonstrate state-of-the-art compression factors (10-25x) across 4 natural language processing applications with only minor loss in accuracy.
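A rough illustration of the doped Kronecker product idea, under the assumption that "doping" means adding a very sparse correction matrix to the rigid Kronecker structure; shapes and doping density below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
m1, n1, m2, n2 = 16, 16, 16, 16             # kron(A, B) is 256 x 256
A = rng.standard_normal((m1, n1)) * 0.1
B = rng.standard_normal((m2, n2)) * 0.1

# Sparse "doping" matrix S: a small fraction of freely trainable
# entries that compensate for the rigid Kronecker structure.
S = np.zeros((m1 * m2, n1 * n2))
mask = rng.random(S.shape) < 0.05           # ~5% doped entries (illustrative)
S[mask] = rng.standard_normal(mask.sum()) * 0.1

W = np.kron(A, B) + S                       # effective weight matrix

dense_params = W.size                        # 65,536 if stored densely
doped_params = A.size + B.size + mask.sum()  # factors + sparse corrections
print(f"compression factor ~ {dense_params / doped_params:.1f}x")
```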
no code implementations • 28 Jan 2021 • Chuteng Zhou, Quntao Zhuang, Matthew Mattina, Paul N. Whatmough
Our SDPI can be applied to various information processing systems, including neural networks and cellular automata.
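For reference, the classical data processing inequality for a Markov chain $X \to Y \to Z$ states $I(X;Z) \le I(X;Y)$; a strong data processing inequality (SDPI) tightens this with a channel-dependent contraction coefficient $\eta < 1$ (this is the standard textbook form, not necessarily the paper's exact notation):

$$I(X;Z) \;\le\; \eta \, I(X;Y), \qquad \eta(P_{Z\mid Y}) \;=\; \sup_{P_{X,Y}\,:\,I(X;Y)>0} \frac{I(X;Z)}{I(X;Y)} \;<\; 1 .$$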
no code implementations • 28 Nov 2020 • Thierry Tambe, Coleman Hooper, Lillian Pentecost, Tianyu Jia, En-Yu Yang, Marco Donato, Victor Sanh, Paul N. Whatmough, Alexander M. Rush, David Brooks, Gu-Yeon Wei
Transformer-based language models such as BERT provide significant accuracy improvement for a multitude of natural language processing (NLP) tasks.
1 code implementation • 21 Oct 2020 • Colby Banbury, Chuteng Zhou, Igor Fedorov, Ramon Matas Navarro, Urmish Thakker, Dibakar Gope, Vijay Janapa Reddi, Matthew Mattina, Paul N. Whatmough
To address this challenge, neural architecture search (NAS) promises to help design accurate ML models that meet the tight MCU memory, latency and energy constraints.
Ranked #1 on Keyword Spotting on Google Speech Commands V2 12
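As a toy illustration of constraint-aware search (not necessarily the paper's NAS formulation; the search space and cost model below are placeholders), one can sample candidate architectures and discard any that exceed the MCU budget:

```python
import random

SEARCH_SPACE = {"width": [0.25, 0.5, 1.0], "depth": [4, 8, 12]}
BUDGET = {"sram_kb": 256, "flash_kb": 1024, "latency_ms": 100}

def estimate_cost(arch):
    # Placeholder cost model; a real one would count activation
    # memory, weight storage, and MACs for the target MCU.
    scale = arch["width"] * arch["depth"]
    return {"sram_kb": 40 * scale, "flash_kb": 150 * scale,
            "latency_ms": 15 * scale}

def proxy_accuracy(arch):
    # Toy proxy standing in for actual train-and-evaluate.
    return arch["width"] * arch["depth"]

def constrained_random_search(trials=100):
    best, best_acc = None, float("-inf")
    for _ in range(trials):
        arch = {k: random.choice(v) for k, v in SEARCH_SPACE.items()}
        cost = estimate_cost(arch)
        if any(cost[k] > BUDGET[k] for k in BUDGET):
            continue                 # violates an MCU constraint
        acc = proxy_accuracy(arch)
        if acc > best_acc:
            best, best_acc = arch, acc
    return best

print(constrained_random_search())
```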
no code implementations • 4 Sep 2020 • Zhi-Gang Liu, Paul N. Whatmough, Matthew Mattina
In this paper, we address a key architectural challenge with structural sparsity: how to provide support for a range of sparsity levels while maintaining high utilization of the hardware.
1 code implementation • 20 May 2020 • Igor Fedorov, Marko Stamenovic, Carl Jensen, Li-Chia Yang, Ari Mandell, Yiming Gan, Matthew Mattina, Paul N. Whatmough
Modern speech enhancement algorithms achieve remarkable noise suppression by means of large recurrent neural networks (RNNs).
no code implementations • 16 May 2020 • Zhi-Gang Liu, Paul N. Whatmough, Matthew Mattina
Convolutional neural network (CNN) inference on mobile devices demands efficient hardware acceleration of low-precision (INT8) general matrix multiplication (GEMM).
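The kernel in question is small to state: INT8 operands with 32-bit accumulation. A minimal NumPy reference is below (illustrative only; real accelerators add zero points, requantization, and tiling). Accumulating in int32 avoids overflow, since the worst-case product magnitude is $128 \times 128 = 2^{14}$, so over 100,000 such terms still fit in 32 bits.

```python
import numpy as np

def int8_gemm(a, b):
    """INT8 x INT8 matrix multiply with int32 accumulation, the
    pattern mobile CNN accelerators target for inference."""
    assert a.dtype == np.int8 and b.dtype == np.int8
    return a.astype(np.int32) @ b.astype(np.int32)

rng = np.random.default_rng(0)
A = rng.integers(-128, 128, size=(4, 64), dtype=np.int8)
B = rng.integers(-128, 128, size=(64, 3), dtype=np.int8)
C = int8_gemm(A, B)  # int32 result, later requantized back to int8
```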
1 code implementation • 25 Feb 2020 • Javier Fernandez-Marques, Paul N. Whatmough, Andrew Mundy, Matthew Mattina
Lightweight architectural designs of Convolutional Neural Networks (CNNs) together with quantization have paved the way for the deployment of demanding computer vision applications on mobile devices.
no code implementations • 24 Jan 2020 • Urmish Thakker, Paul N. Whatmough, Zhi-Gang Liu, Matthew Mattina, Jesse Beu
Kronecker products (KP) have been used to compress RNNs in IoT applications by factors of 15-38x, achieving better results than traditional compression methods.
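A small sketch of why KP compression is attractive for inference (sizes illustrative): the two small factors replace the full matrix, and the matrix-vector product can be computed without ever materializing $\mathrm{kron}(A, B)$.

```python
import numpy as np

# With W = kron(A, B) never stored, the matvec uses the identity
#   kron(A, B) @ x == (A @ X @ B.T).ravel(),  X = x.reshape(n1, n2)
# (row-major convention).

m1, n1 = 16, 16
m2, n2 = 16, 16
A = np.random.randn(m1, n1)
B = np.random.randn(m2, n2)
x = np.random.randn(n1 * n2)

y_fast = (A @ x.reshape(n1, n2) @ B.T).ravel()  # no 256x256 matrix built
y_ref = np.kron(A, B) @ x                       # reference, materializes W
assert np.allclose(y_fast, y_ref)

full = (m1 * m2) * (n1 * n2)   # 65,536 params if stored dense
kp = A.size + B.size           # 512 params as Kronecker factors
print(f"compression: {full / kp:.0f}x")         # 128x in this toy case
```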
no code implementations • 14 Jan 2020 • Chuteng Zhou, Prad Kadambi, Matthew Mattina, Paul N. Whatmough
Hence, for successful deployment on analog accelerators, it is essential to train deep neural networks that are robust to random continuous noise in the network weights, a relatively new challenge in machine learning.
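One common way to obtain such robustness, sketched below in PyTorch, is to inject random Gaussian perturbations into the weights on every training forward pass; this is a generic noise-injection sketch, not the paper's exact training recipe.

```python
import torch
import torch.nn as nn

class NoisyLinear(nn.Linear):
    """Linear layer that perturbs its weights with Gaussian noise on
    each training forward pass, mimicking the random conductance
    variations of analog (e.g. PCM) hardware. `noise_std` is scaled
    relative to the largest weight magnitude (an assumption here)."""
    def __init__(self, in_f, out_f, noise_std=0.05):
        super().__init__(in_f, out_f)
        self.noise_std = noise_std

    def forward(self, x):
        if self.training:
            sigma = self.noise_std * self.weight.abs().max()
            noisy_w = self.weight + torch.randn_like(self.weight) * sigma
            return nn.functional.linear(x, noisy_w, self.bias)
        return super().forward(x)  # clean weights at eval time

layer = NoisyLinear(32, 10)
out = layer(torch.randn(4, 32))  # training mode: noisy weights
```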
no code implementations • 18 Nov 2019 • Patrick Hansen, Alexey Vilkin, Yury Khrustalev, James Imber, David Hanwell, Matthew Mattina, Paul N. Whatmough
In this work, we investigate the efficacy of the ISP in CNN classification tasks, and outline the system-level trade-offs between prediction accuracy and computational cost.
no code implementations • NeurIPS 2019 • Igor Fedorov, Ryan P. Adams, Matthew Mattina, Paul N. Whatmough
The vast majority of processors in the world are actually microcontroller units (MCUs), which are widely used to perform simple control tasks in applications ranging from automobiles to medical devices and office equipment.
1 code implementation • 27 Feb 2019 • Paul N. Whatmough, Chuteng Zhou, Patrick Hansen, Shreyas Kolala Venkataramanaiah, Jae-sun Seo, Matthew Mattina
Over a suite of six datasets we trained models via transfer learning with an accuracy loss of $<1\%$, resulting in up to 11.2 TOPS/W, nearly $2\times$ more efficient than a conventional programmable CNN accelerator of the same area.