Search Results for author: Ardavan Pedram

Found 6 papers, 2 papers with code

Rethinking Floating Point Overheads for Mixed Precision DNN Accelerators

no code implementations • 27 Jan 2021 • Hamzah Abdel-Aziz, Ali Shafiee, Jong Hoon Shin, Ardavan Pedram, Joseph H. Hassoun

We present novel optimizations that reduce the hardware overhead of floating-point (FP) arithmetic in mixed-precision DNN accelerators.
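
The excerpt does not spell out the optimizations, so below is a minimal sketch of the general mixed-precision idea the title points at: multiply operands in a narrow floating-point format while accumulating in full fp32. The bfloat16-style truncation and all function names are illustrative assumptions, not the paper's hardware design.

```python
import numpy as np

def to_bfloat16(x):
    # Emulate bfloat16 by zeroing the low 16 bits of each fp32 value,
    # keeping the sign bit, the 8 exponent bits, and 7 mantissa bits.
    bits = np.ascontiguousarray(x, dtype=np.float32).view(np.uint32)
    return (bits & np.uint32(0xFFFF0000)).view(np.float32)

def mixed_precision_dot(a, b):
    # Multiply in reduced precision, accumulate in full fp32: the wide
    # accumulator is what keeps the cheap multipliers from hurting accuracy.
    a16, b16 = to_bfloat16(a), to_bfloat16(b)
    acc = np.float32(0.0)
    for x, y in zip(a16, b16):
        acc = np.float32(acc + x * y)
    return acc

print(mixed_precision_dot([0.1, 0.2, 0.3], [1.0, 2.0, 3.0]))
```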

Campfire: Compressible, Regularization-Free, Structured Sparse Training for Hardware Accelerators

no code implementations • 9 Jan 2020 • Noah Gamboa, Kais Kudrolli, Anand Dhoot, Ardavan Pedram

This paper studies structured sparse training of CNNs with a gradual pruning technique that leads to fixed, sparse weight matrices after a set number of epochs.
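
As a concrete illustration of gradual pruning in general (a generic sketch, not necessarily Campfire's exact schedule or structure constraints), the mask below ramps sparsity up over the early epochs and is then frozen, so the weight matrices stay fixed and sparse for the rest of training:

```python
import numpy as np

def sparsity_at(epoch, final_sparsity=0.9, ramp_epochs=30):
    # Linear ramp from dense to the target sparsity, then held constant;
    # the schedule shape and the constants here are assumptions.
    return final_sparsity * min(1.0, epoch / ramp_epochs)

def prune_mask(weights, sparsity):
    # Keep the largest-magnitude weights; zero out the smallest fraction.
    k = int(weights.size * sparsity)
    if k == 0:
        return np.ones(weights.shape, dtype=bool)
    threshold = np.partition(np.abs(weights).ravel(), k)[k]
    return np.abs(weights) >= threshold

# Per-epoch loop: recompute the mask while ramping, then freeze it.
W = np.random.randn(256, 256)
for epoch in range(40):
    if epoch <= 30:                       # ramp phase: mask still moving
        mask = prune_mask(W, sparsity_at(epoch))
    W *= mask                             # after the ramp, the mask is fixed
```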

Starfire: Regularization-Free Adversarially-Robust Structured Sparse Training

no code implementations • 25 Sep 2019 • Noah Gamboa, Kais Kudrolli, Anand Dhoot, Ardavan Pedram

We show that our method creates sparse versions of ResNet50 and ResNet50v1.5 on full ImageNet while remaining within a negligible (<1%) margin of accuracy loss.

CATERPILLAR: Coarse Grain Reconfigurable Architecture for Accelerating the Training of Deep Neural Networks

no code implementations • 1 Jun 2017 • Yuan-Fang Li, Ardavan Pedram

Our results suggest that smaller networks favor non-batched techniques, while larger networks achieve higher performance with batched operations.
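
To make the batched-versus-non-batched distinction concrete, here is a toy comparison (an illustration of the trade-off, not the CATERPILLAR evaluation): one GEMM over the whole batch versus a loop of per-sample GEMVs for the same layer.

```python
import time
import numpy as np

def forward_batched(W, X):
    # One GEMM over the whole batch: best arithmetic intensity on big layers.
    return X @ W.T

def forward_per_sample(W, X):
    # One GEMV per sample, as in non-batched training techniques.
    return np.stack([W @ x for x in X])

W = np.random.randn(1024, 1024).astype(np.float32)
X = np.random.randn(64, 1024).astype(np.float32)
for f in (forward_batched, forward_per_sample):
    start = time.perf_counter()
    f(W, X)
    print(f.__name__, time.perf_counter() - start)
```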

A Systematic Approach to Blocking Convolutional Neural Networks

1 code implementation • 14 Jun 2016 • Xuan Yang, Jing Pu, Blaine Burton Rister, Nikhil Bhagdikar, Stephen Richardson, Shahar Kvatinsky, Jonathan Ragan-Kelley, Ardavan Pedram, Mark Horowitz

Convolutional Neural Networks (CNNs) are the state-of-the-art solution for many computer vision problems, and many researchers have explored optimized implementations.
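
The paper's subject is choosing blockings for convolution loop nests; the sketch below shows what blocking means here (tile sizes and loop order are arbitrary placeholders, not the schedules the paper derives): output channels and rows are tiled so each tile's working set can fit in a given level of the memory hierarchy.

```python
import numpy as np

def conv2d_blocked(inp, weights, tile_k=16, tile_y=8):
    # inp: (C, H, W); weights: (K, C, R, S); valid convolution, stride 1.
    C, H, W = inp.shape
    K, _, R, S = weights.shape
    OH, OW = H - R + 1, W - S + 1
    out = np.zeros((K, OH, OW), dtype=inp.dtype)
    for k0 in range(0, K, tile_k):            # block over output channels
        for y0 in range(0, OH, tile_y):       # block over output rows
            # Within a tile, the same input rows and filter block are
            # reused, which is the cache locality that blocking buys.
            for k in range(k0, min(k0 + tile_k, K)):
                for y in range(y0, min(y0 + tile_y, OH)):
                    for x in range(OW):
                        out[k, y, x] = np.sum(inp[:, y:y+R, x:x+S] * weights[k])
    return out
```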

EIE: Efficient Inference Engine on Compressed Deep Neural Network

4 code implementations • 4 Feb 2016 • Song Han, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark A. Horowitz, William J. Dally

EIE has a processing power of 102 GOPS working directly on a compressed network, corresponding to 3 TOPS on an uncompressed network, and processes the FC layers of AlexNet at 1.88×10^4 frames/sec with a power dissipation of only 600 mW.
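
For a sense of the computation EIE accelerates, here is a software analogue (a sketch of compressed sparse inference in general, not the hardware): the layer is stored column-compressed with a small shared-weight codebook, and only nonzero activations are ever multiplied.

```python
import numpy as np

def sparse_layer(codebook, col_ptr, row_idx, weight_idx, a, out_dim):
    # y = W @ a, with W stored in CSC form: column j's nonzeros live at
    # positions col_ptr[j]..col_ptr[j+1]; each stores a row index and a
    # small index into a shared codebook of weight values.
    y = np.zeros(out_dim, dtype=np.float32)
    for j in np.flatnonzero(a):               # skip zero activations entirely
        aj = a[j]
        for p in range(col_ptr[j], col_ptr[j + 1]):
            y[row_idx[p]] += codebook[weight_idx[p]] * aj
    return y
```

Both the zero-activation skipping and the codebook lookup mirror the savings that compression enables; the exact encoding here (absolute row indices, array layout) is simplified relative to the paper's format.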
