Search Results for author: Paul Whatmough

Found 18 papers, 8 papers with code

Characterizing Soft-Error Resiliency in Arm's Ethos-U55 Embedded Machine Learning Accelerator

no code implementations14 Apr 2024 Abhishek Tyagi, Reiley Jeyapaul, Chuteng Zhu, Paul Whatmough, Yuhao Zhu

As Neural Processing Units (NPU) or accelerators are increasingly deployed in a variety of applications including safety critical applications such as autonomous vehicle, and medical imaging, it is critical to understand the fault-tolerance nature of the NPUs.

Autonomous Vehicles Navigate

GPTVQ: The Blessing of Dimensionality for LLM Quantization

no code implementations23 Feb 2024 Mart van Baalen, Andrey Kuzmin, Markus Nagel, Peter Couperus, Cedric Bastoul, Eric Mahurin, Tijmen Blankevoort, Paul Whatmough

In this work we show that the size versus accuracy trade-off of neural network quantization can be significantly improved by increasing the quantization dimensionality.

Quantization

PerfSAGE: Generalized Inference Performance Predictor for Arbitrary Deep Learning Models on Edge Devices

no code implementations26 Jan 2023 Yuji Chai, Devashree Tripathy, Chuteng Zhou, Dibakar Gope, Igor Fedorov, Ramon Matas, David Brooks, Gu-Yeon Wei, Paul Whatmough

The ability to accurately predict deep neural network (DNN) inference performance metrics, such as latency, power, and memory footprint, for an arbitrary DNN on a target hardware platform is essential to the design of DNN based models.

Thales: Formulating and Estimating Architectural Vulnerability Factors for DNN Accelerators

no code implementations5 Dec 2022 Abhishek Tyagi, Yiming Gan, Shaoshan Liu, Bo Yu, Paul Whatmough, Yuhao Zhu

As Deep Neural Networks (DNNs) are increasingly deployed in safety critical and privacy sensitive applications such as autonomous driving and biometric authentication, it is critical to understand the fault-tolerance nature of DNNs.

Autonomous Driving

Restructurable Activation Networks

1 code implementation17 Aug 2022 Kartikeya Bhardwaj, James Ward, Caleb Tung, Dibakar Gope, Lingchuan Meng, Igor Fedorov, Alex Chalfin, Paul Whatmough, Danny Loh

To address this question, we propose a new paradigm called Restructurable Activation Networks (RANs) that manipulate the amount of non-linearity in models to improve their hardware-awareness and efficiency.

object-detection Object Detection

Hybrid Cloud-Edge Networks for Efficient Inference

1 code implementation29 Sep 2021 Anil Kag, Igor Fedorov, Aditya Gangrade, Paul Whatmough, Venkatesh Saligrama

The first network is a low-capacity network that can be deployed on an edge device, whereas the second is a high-capacity network deployed in the cloud.

AutoPilot: Automating SoC Design Space Exploration for SWaP Constrained Autonomous UAVs

no code implementations5 Feb 2021 Srivatsan Krishnan, Zishen Wan, Kshitij Bhardwaj, Paul Whatmough, Aleksandra Faust, Sabrina Neuman, Gu-Yeon Wei, David Brooks, Vijay Janapa Reddi

Balancing a computing system for a UAV requires considering both the cyber (e. g., sensor rate, compute performance) and physical (e. g., payload weight) characteristics that affect overall performance.

Bayesian Optimization BIG-bench Machine Learning +1

Mesorasi: Architecture Support for Point Cloud Analytics via Delayed-Aggregation

1 code implementation16 Aug 2020 Yu Feng, Boyuan Tian, Tiancheng Xu, Paul Whatmough, Yuhao Zhu

Point cloud analytics is poised to become a key workload on battery-powered embedded and mobile platforms in a wide range of emerging application domains, such as autonomous driving, robotics, and augmented reality, where efficiency is paramount.

Autonomous Driving

CHIPKIT: An agile, reusable open-source framework for rapid test chip development

2 code implementations13 Jan 2020 Paul Whatmough, Marco Donato, Glenn Ko, Sae-Kyu Lee, David Brooks, Gu-Yeon Wei

The current trend for domain-specific architectures (DSAs) has led to renewed interest in research test chips to demonstrate new specialized hardware.

Hardware Architecture

SMAUG: End-to-End Full-Stack Simulation Infrastructure for Deep Learning Workloads

no code implementations10 Dec 2019 Sam Likun Xi, Yuan YAO, Kshitij Bhardwaj, Paul Whatmough, Gu-Yeon Wei, David Brooks

In recent years, there has been tremendous advances in hardware acceleration of deep neural networks.

ASV: Accelerated Stereo Vision System

2 code implementations15 Nov 2019 Yu Feng, Paul Whatmough, Yuhao Zhu

The key to ASV is to exploit unique characteristics inherent to stereo vision, and apply stereo-specific optimizations, both algorithmically and computationally.

Stereo Matching

Energy Efficient Hardware for On-Device CNN Inference via Transfer Learning

no code implementations4 Dec 2018 Paul Whatmough, Chuteng Zhou, Patrick Hansen, Matthew Mattina

On-device CNN inference for real-time computer vision applications can result in computational demands that far exceed the energy budgets of mobile devices.

Image Classification Transfer Learning

SCALE-Sim: Systolic CNN Accelerator

8 code implementations16 Oct 2018 Ananda Samajdar, Yuhao Zhu, Paul Whatmough, Matthew Mattina, Tushar Krishna

Systolic Arrays are one of the most popular compute substrates within Deep Learning accelerators today, as they provide extremely high efficiency for running dense matrix multiplications.

Distributed, Parallel, and Cluster Computing Hardware Architecture

Euphrates: Algorithm-SoC Co-Design for Low-Power Mobile Continuous Vision

no code implementations29 Mar 2018 Yuhao Zhu, Anand Samajdar, Matthew Mattina, Paul Whatmough

Specifically, we propose to expose the motion data that is naturally generated by the Image Signal Processor (ISP) early in the vision pipeline to the CNN engine.

Mobile Machine Learning Hardware at ARM: A Systems-on-Chip (SoC) Perspective

no code implementations19 Jan 2018 Yuhao Zhu, Matthew Mattina, Paul Whatmough

Machine learning is playing an increasingly significant role in emerging mobile application domains such as AR/VR, ADAS, etc.

BIG-bench Machine Learning

Cannot find the paper you are looking for? You can Submit a new open access paper.