Search Results for author: Paul Whatmough

Found 18 papers, 8 papers with code

Characterizing Soft-Error Resiliency in Arm's Ethos-U55 Embedded Machine Learning Accelerator

no code implementations • 14 Apr 2024 • Abhishek Tyagi, Reiley Jeyapaul, Chuteng Zhu, Paul Whatmough, Yuhao Zhu

As Neural Processing Units (NPUs) or accelerators are increasingly deployed in a variety of applications, including safety-critical applications such as autonomous vehicles and medical imaging, it is critical to understand the fault-tolerance nature of the NPUs.

Autonomous Vehicles • Navigate

GPTVQ: The Blessing of Dimensionality for LLM Quantization

no code implementations • 23 Feb 2024 • Mart van Baalen, Andrey Kuzmin, Markus Nagel, Peter Couperus, Cedric Bastoul, Eric Mahurin, Tijmen Blankevoort, Paul Whatmough

In this work we show that the size versus accuracy trade-off of neural network quantization can be significantly improved by increasing the quantization dimensionality.

Quantization

Efficient Edge Inference by Selective Query

1 code implementation • International Conference on Learning Representations 2023 • Anil Kag, Igor Fedorov, Aditya Gangrade, Paul Whatmough, Venkatesh Saligrama

Training a hybrid learner is difficult since we lack annotations of hard edge-examples.

PerfSAGE: Generalized Inference Performance Predictor for Arbitrary Deep Learning Models on Edge Devices

no code implementations • 26 Jan 2023 • Yuji Chai, Devashree Tripathy, Chuteng Zhou, Dibakar Gope, Igor Fedorov, Ramon Matas, David Brooks, Gu-Yeon Wei, Paul Whatmough

The ability to accurately predict deep neural network (DNN) inference performance metrics, such as latency, power, and memory footprint, for an arbitrary DNN on a target hardware platform is essential to the design of DNN based models.

Thales: Formulating and Estimating Architectural Vulnerability Factors for DNN Accelerators

no code implementations • 5 Dec 2022 • Abhishek Tyagi, Yiming Gan, Shaoshan Liu, Bo Yu, Paul Whatmough, Yuhao Zhu

As Deep Neural Networks (DNNs) are increasingly deployed in safety critical and privacy sensitive applications such as autonomous driving and biometric authentication, it is critical to understand the fault-tolerance nature of DNNs.

Autonomous Driving

Restructurable Activation Networks

1 code implementation • 17 Aug 2022 • Kartikeya Bhardwaj, James Ward, Caleb Tung, Dibakar Gope, Lingchuan Meng, Igor Fedorov, Alex Chalfin, Paul Whatmough, Danny Loh

To address this question, we propose a new paradigm called Restructurable Activation Networks (RANs) that manipulate the amount of non-linearity in models to improve their hardware-awareness and efficiency.

Object Detection

UDC: Unified DNAS for Compressible TinyML Models

no code implementations • 15 Jan 2022 • Igor Fedorov, Ramon Matas, Hokchhay Tann, Chuteng Zhou, Matthew Mattina, Paul Whatmough

Deploying TinyML models on low-cost IoT hardware is very challenging, due to limited device memory capacity.

Model Compression • Neural Architecture Search • +2

Super-Efficient Super Resolution for Fast Adversarial Defense at the Edge

1 code implementation • 29 Dec 2021 • Kartikeya Bhardwaj, Dibakar Gope, James Ward, Paul Whatmough, Danny Loh

Autonomous systems are highly vulnerable to a variety of adversarial attacks on Deep Neural Networks (DNNs).

Adversarial Defense • Image Classification • +1

Hybrid Cloud-Edge Networks for Efficient Inference

1 code implementation • 29 Sep 2021 • Anil Kag, Igor Fedorov, Aditya Gangrade, Paul Whatmough, Venkatesh Saligrama

The first network is a low-capacity network that can be deployed on an edge device, whereas the second is a high-capacity network deployed in the cloud.

AutoPilot: Automating SoC Design Space Exploration for SWaP Constrained Autonomous UAVs

no code implementations • 5 Feb 2021 • Srivatsan Krishnan, Zishen Wan, Kshitij Bhardwaj, Paul Whatmough, Aleksandra Faust, Sabrina Neuman, Gu-Yeon Wei, David Brooks, Vijay Janapa Reddi

Balancing a computing system for a UAV requires considering both the cyber (e.g., sensor rate, compute performance) and physical (e.g., payload weight) characteristics that affect overall performance.

Bayesian Optimization • BIG-bench Machine Learning • +1

Mesorasi: Architecture Support for Point Cloud Analytics via Delayed-Aggregation

1 code implementation • 16 Aug 2020 • Yu Feng, Boyuan Tian, Tiancheng Xu, Paul Whatmough, Yuhao Zhu

Point cloud analytics is poised to become a key workload on battery-powered embedded and mobile platforms in a wide range of emerging application domains, such as autonomous driving, robotics, and augmented reality, where efficiency is paramount.

Autonomous Driving

CHIPKIT: An agile, reusable open-source framework for rapid test chip development

2 code implementations • 13 Jan 2020 • Paul Whatmough, Marco Donato, Glenn Ko, Sae-Kyu Lee, David Brooks, Gu-Yeon Wei

The current trend for domain-specific architectures (DSAs) has led to renewed interest in research test chips to demonstrate new specialized hardware.

Hardware Architecture

SMAUG: End-to-End Full-Stack Simulation Infrastructure for Deep Learning Workloads

no code implementations • 10 Dec 2019 • Sam Likun Xi, Yuan Yao, Kshitij Bhardwaj, Paul Whatmough, Gu-Yeon Wei, David Brooks

In recent years, there have been tremendous advances in hardware acceleration of deep neural networks.

ASV: Accelerated Stereo Vision System

2 code implementations • 15 Nov 2019 • Yu Feng, Paul Whatmough, Yuhao Zhu

The key to ASV is to exploit unique characteristics inherent to stereo vision, and apply stereo-specific optimizations, both algorithmically and computationally.

Stereo Matching

Energy Efficient Hardware for On-Device CNN Inference via Transfer Learning

no code implementations • 4 Dec 2018 • Paul Whatmough, Chuteng Zhou, Patrick Hansen, Matthew Mattina

On-device CNN inference for real-time computer vision applications can result in computational demands that far exceed the energy budgets of mobile devices.

Image Classification • Transfer Learning

SCALE-Sim: Systolic CNN Accelerator

8 code implementations • 16 Oct 2018 • Ananda Samajdar, Yuhao Zhu, Paul Whatmough, Matthew Mattina, Tushar Krishna

Systolic Arrays are one of the most popular compute substrates within Deep Learning accelerators today, as they provide extremely high efficiency for running dense matrix multiplications.

Distributed, Parallel, and Cluster Computing • Hardware Architecture

Euphrates: Algorithm-SoC Co-Design for Low-Power Mobile Continuous Vision

no code implementations • 29 Mar 2018 • Yuhao Zhu, Anand Samajdar, Matthew Mattina, Paul Whatmough

Specifically, we propose to expose the motion data that is naturally generated by the Image Signal Processor (ISP) early in the vision pipeline to the CNN engine.

Mobile Machine Learning Hardware at ARM: A Systems-on-Chip (SoC) Perspective

no code implementations • 19 Jan 2018 • Yuhao Zhu, Matthew Mattina, Paul Whatmough

Machine learning is playing an increasingly significant role in emerging mobile application domains such as AR/VR, ADAS, etc.

BIG-bench Machine Learning
