Search Results for author: Stephen W. Keckler

Found 12 papers, 1 paper with code

Abstracting Sparse DNN Acceleration via Structured Sparse Tensor Decomposition

no code implementations12 Mar 2024 Geonhwa Jeong, Po-An Tsai, Abhimanyu R. Bambhaniya, Stephen W. Keckler, Tushar Krishna

Next, we develop a software framework, TASDER, to accelerate DNNs by searching for layer-wise, high-quality structured decompositions of both weight and activation tensors, so that they can be accelerated by any system with structured sparse hardware support.

Tensor Decomposition
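The abstract above describes decomposing tensors so they match a hardware-supported structured-sparse pattern. A minimal sketch of one such pattern (2:4 sparsity, where each group of four values keeps its two largest-magnitude entries) is shown below; the function name and the choice of pattern are illustrative, not taken from the paper.

```python
import numpy as np

def to_2to4_sparse(w):
    """Project a matrix onto a 2:4 structured-sparse pattern: in every
    contiguous group of 4 values, keep the 2 largest-magnitude entries
    and zero the rest. Illustrative only; TASDER searches over richer
    layer-wise decompositions than this single fixed pattern."""
    flat = w.reshape(-1, 4).copy()
    # indices of the two smallest-magnitude entries in each group of 4
    drop = np.argsort(np.abs(flat), axis=1)[:, :2]
    np.put_along_axis(flat, drop, 0.0, axis=1)
    return flat.reshape(w.shape)

w = np.arange(1.0, 9.0).reshape(2, 4)  # [[1,2,3,4],[5,6,7,8]]
s = to_2to4_sparse(w)                  # each group of 4 keeps 2 values
```

Hardware with structured-sparse support can then skip the zeroed positions deterministically, which is what makes this pattern cheaper to exploit than unstructured sparsity.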

Vision Transformer Computation and Resilience for Dynamic Inference

no code implementations6 Dec 2022 Kavya Sreedhar, Jason Clemons, Rangharajan Venkatesan, Stephen W. Keckler, Mark Horowitz

To create dynamic models, we leverage the resilience of vision transformers to pruning and switch between different scaled versions of a model.

Semantic Segmentation
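The snippet above mentions switching between differently scaled versions of a model at run time. A hedged sketch of that selection policy is below; the variant names, costs, and accuracies are invented placeholders, not numbers from the paper.

```python
# Hypothetical dynamic-inference policy: pick the most accurate model
# variant that fits the current compute budget. All entries are
# illustrative (name, relative compute cost, relative accuracy).
MODEL_VARIANTS = [
    ("vit-tiny",  1.0, 0.85),
    ("vit-small", 2.5, 0.90),
    ("vit-base",  6.0, 0.93),
]

def pick_variant(compute_budget):
    """Return the name of the most accurate variant whose cost fits the
    budget; fall back to the cheapest variant when nothing fits."""
    feasible = [v for v in MODEL_VARIANTS if v[1] <= compute_budget]
    if not feasible:
        return MODEL_VARIANTS[0][0]
    return max(feasible, key=lambda v: v[2])[0]
```

Because the variants are scaled versions of one model, switching between them can reuse shared weights rather than loading independent networks.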

Zhuyi: Perception Processing Rate Estimation for Safety in Autonomous Vehicles

no code implementations6 May 2022 Yu-Shun Hsiao, Siva Kumar Sastry Hari, Michał Filipiuk, Timothy Tsai, Michael B. Sullivan, Vijay Janapa Reddi, Vasu Singh, Stephen W. Keckler

The processing requirement of autonomous vehicles (AVs) for high-accuracy perception in complex scenarios can exceed the resources offered by the in-vehicle computer, degrading safety and comfort.

Autonomous Vehicles

GPU Domain Specialization via Composable On-Package Architecture

no code implementations5 Apr 2021 Yaosheng Fu, Evgeny Bolotin, Niladrish Chatterjee, David Nellans, Stephen W. Keckler

As GPUs scale their low precision matrix math throughput to boost deep learning (DL) performance, they upset the balance between math throughput and memory system capabilities.


Generating and Characterizing Scenarios for Safety Testing of Autonomous Vehicles

no code implementations12 Mar 2021 Zahra Ghodsi, Siva Kumar Sastry Hari, Iuri Frosio, Timothy Tsai, Alejandro Troccoli, Stephen W. Keckler, Siddharth Garg, Anima Anandkumar

Extracting interesting scenarios from real-world data as well as generating failure cases is important for the development and testing of autonomous systems.

Autonomous Vehicles

Making Convolutions Resilient via Algorithm-Based Error Detection Techniques

no code implementations8 Jun 2020 Siva Kumar Sastry Hari, Michael B. Sullivan, Timothy Tsai, Stephen W. Keckler

The ability of Convolutional Neural Networks (CNNs) to accurately process real-time telemetry has boosted their use in safety-critical and high-performance computing systems.
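Algorithm-based error detection of the kind this paper applies to convolutions classically works by carrying checksums through a linear operation. A minimal sketch for a GEMM (the lowering commonly used for convolutions) is below; this is the generic ABFT checksum idea, not the paper's exact scheme.

```python
import numpy as np

def checksum_matmul(a, b):
    """Generic algorithm-based error detection for C = A @ B: append a
    column-sum checksum row to A, multiply, and verify that the output's
    column sums match the propagated checksum row. Illustrative sketch,
    not the paper's optimized convolution-specific technique."""
    a_chk = np.vstack([a, a.sum(axis=0, keepdims=True)])
    c_chk = a_chk @ b
    c, chk = c_chk[:-1], c_chk[-1]
    ok = np.allclose(c.sum(axis=0), chk)  # mismatch => hardware error
    return c, ok
```

Because matrix multiplication is linear, the checksum row costs one extra GEMM row rather than a full recomputation, which is what makes the detection low-overhead.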

HarDNN: Feature Map Vulnerability Evaluation in CNNs

no code implementations22 Feb 2020 Abdulrahman Mahmoud, Siva Kumar Sastry Hari, Christopher W. Fletcher, Sarita V. Adve, Charbel Sakr, Naresh Shanbhag, Pavlo Molchanov, Michael B. Sullivan, Timothy Tsai, Stephen W. Keckler

As Convolutional Neural Networks (CNNs) are increasingly being employed in safety-critical applications, it is important that they behave reliably in the face of hardware errors.

Decision Making

Structurally Sparsified Backward Propagation for Faster Long Short-Term Memory Training

no code implementations1 Jun 2018 Maohua Zhu, Jason Clemons, Jeff Pool, Minsoo Rhu, Stephen W. Keckler, Yuan Xie

Further, we can enforce structured sparsity in the gate gradients to make the LSTM backward pass up to 45% faster than the state-of-the-art dense approach and 168% faster than the state-of-the-art sparsifying method on modern GPUs.
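The snippet above describes enforcing structured sparsity in the LSTM gate gradients during the backward pass. A hedged sketch of one way to do that (per-row magnitude thresholding to a fixed keep fraction) is below; the thresholding rule is illustrative, not the paper's exact scheme.

```python
import numpy as np

def sparsify_gate_grads(dgates, keep_frac=0.25):
    """Zero all but the top `keep_frac` magnitudes in each row of the
    gate-gradient matrix so the backward GEMMs operate on sparse
    operands. Illustrative sketch of the general idea only."""
    k = max(1, int(dgates.shape[1] * keep_frac))
    # per-row threshold at the k-th largest magnitude
    thresh = np.partition(np.abs(dgates), -k, axis=1)[:, -k][:, None]
    return np.where(np.abs(dgates) >= thresh, dgates, 0.0)
```

Keeping the sparsity pattern structured (here, a fixed count per row) is what lets GPU kernels skip work predictably instead of paying for irregular indexing.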

Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks

no code implementations3 May 2017 Minsoo Rhu, Mike O'Connor, Niladrish Chatterjee, Jeff Pool, Stephen W. Keckler

Popular deep learning frameworks require users to fine-tune their memory usage so that the training data of a deep neural network (DNN) fits within the GPU physical memory.
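This paper's title points at compressing sparse activations as they move over DMA. A minimal sketch of a bitmask-style zero-value compression (in the spirit of the idea, not the engine's actual format) is below; both function names are hypothetical.

```python
import numpy as np

def compress_activations(act):
    """Bitmask compression for sparse activations (e.g. post-ReLU):
    store a nonzero mask plus the packed nonzero values. Illustrative
    of zero-value compression, not the paper's hardware format."""
    mask = act != 0
    return mask, act[mask]

def decompress_activations(mask, values):
    """Reconstruct the dense activation tensor from mask + values."""
    out = np.zeros(mask.shape, dtype=values.dtype)
    out[mask] = values
    return out
```

The payoff is that the DMA transfer moves only the mask and the nonzeros, which for highly sparse activations is far smaller than the dense tensor.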

vDNN: Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design

4 code implementations25 Feb 2016 Minsoo Rhu, Natalia Gimelshein, Jason Clemons, Arslan Zulfiqar, Stephen W. Keckler

The most widely used machine learning frameworks require users to carefully tune their memory usage so that the deep neural network (DNN) fits into the DRAM capacity of a GPU.

BIG-bench Machine Learning, Efficient Neural Network
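vDNN's core idea is virtualizing DNN memory: activations that don't fit in GPU DRAM are offloaded to host memory and brought back when the backward pass needs them. A toy sketch of that bookkeeping is below; the class and its dict-based "GPU" and "host" stores are illustrative stand-ins, not vDNN's runtime.

```python
# Hypothetical sketch of vDNN-style memory virtualization. Plain dicts
# simulate GPU and host memory; capacity is counted in activations.
class VirtualizedActivations:
    def __init__(self, gpu_capacity):
        self.gpu = {}        # layer -> activation (GPU-resident)
        self.host = {}       # layer -> activation (offloaded)
        self.capacity = gpu_capacity

    def save(self, layer, act):
        """Store a forward-pass activation, offloading the oldest
        resident one to host memory when GPU capacity is exceeded."""
        if len(self.gpu) >= self.capacity:
            victim = next(iter(self.gpu))
            self.host[victim] = self.gpu.pop(victim)
        self.gpu[layer] = act

    def load(self, layer):
        """Fetch an activation for the backward pass, prefetching it
        back from host memory if it was offloaded."""
        if layer not in self.gpu:
            self.gpu[layer] = self.host.pop(layer)
        return self.gpu[layer]
```

In a real system the offload and prefetch overlap with computation on separate streams, so early layers' activations migrate out during the forward pass and return just in time for the backward pass.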
