Search Results for author: Jeff Pool

Found 13 papers, 6 papers with code

Channel Permutations for N:M Sparsity

1 code implementation NeurIPS 2021 Jeff Pool, Chong Yu

We introduce channel permutations as a method to maximize the accuracy of N:M sparse networks.
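
Permuting a layer's input channels leaves its function unchanged as long as the preceding layer's outputs are permuted to match, but it changes which four weights share an N:M group, so a good permutation lets pruning keep more total magnitude. A minimal NumPy sketch of that idea for the 2:4 case, using illustrative helper names (`retained_magnitude`, `random_search_permutation`) and a toy random search rather than the paper's actual search algorithm:

```python
import numpy as np

def retained_magnitude(w, perm):
    """Total |weight| kept if the input channels (columns) of w are
    permuted by `perm` and then pruned 2:4 (keep the 2 largest of
    every 4 consecutive channels)."""
    wp = np.abs(w[:, perm]).reshape(w.shape[0], -1, 4)   # (out, groups, 4)
    return np.sort(wp, axis=-1)[:, :, 2:].sum()          # top-2 per group

def random_search_permutation(w, iters=1000, seed=0):
    """Toy random search over input-channel permutations; purely
    illustrative -- the paper uses a far more efficient search."""
    rng = np.random.default_rng(seed)
    best = np.arange(w.shape[1])
    best_score = retained_magnitude(w, best)
    for _ in range(iters):
        cand = rng.permutation(w.shape[1])
        score = retained_magnitude(w, cand)
        if score > best_score:
            best, best_score = cand, score
    return best, best_score

w = np.random.randn(64, 128).astype(np.float32)          # (out_channels, in_channels)
perm, score = random_search_permutation(w)
print("kept magnitude:", score, "vs. identity:", retained_magnitude(w, np.arange(128)))
```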

Learning both Weights and Connections for Efficient Neural Networks

7 code implementations NeurIPS 2015 Song Han, Jeff Pool, John Tran, William J. Dally

On the ImageNet dataset, our method reduced the number of parameters of AlexNet by a factor of 9x, from 61 million to 6.7 million, without incurring accuracy loss.
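
As a rough illustration of the prune-and-retrain recipe (not the paper's exact layer-wise thresholds or training schedule), a short PyTorch sketch on a stand-in network, pruning an illustrative 90% of each weight matrix by magnitude and reporting the resulting compression ratio:

```python
import torch
import torch.nn as nn

# Stand-in model; the paper applies this to a trained AlexNet/VGG.
model = nn.Sequential(nn.Linear(256, 512), nn.ReLU(), nn.Linear(512, 10))
total = sum(p.numel() for p in model.parameters())

# Prune: zero every weight below a per-layer magnitude threshold
# (90% here, purely illustrative); the surviving connections are
# then retrained to recover accuracy.
for p in model.parameters():
    if p.dim() > 1:                                        # weight matrices only
        k = int(0.9 * p.numel())
        thresh = p.abs().flatten().kthvalue(k).values
        p.data.mul_((p.abs() > thresh).float())

remaining = sum((p != 0).sum().item() for p in model.parameters())
print(f"compression: {total / remaining:.1f}x ({total} -> {remaining} parameters)")
```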

Efficient Sparse-Winograd Convolutional Neural Networks

1 code implementation ICLR 2018 Xingyu Liu, Jeff Pool, Song Han, William J. Dally

First, we move the ReLU operation into the Winograd domain to increase the sparsity of the transformed activations.

Network Pruning
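
The placement change described above can be shown on a single F(2x2, 3x3) Winograd tile. A minimal NumPy sketch, assuming the standard Lavin-Gray transform matrices; note that applying ReLU to the transformed activation is not equivalent to the usual spatial ReLU, so the network is trained (and pruned) with the activation in this position:

```python
import numpy as np

# Standard F(2x2, 3x3) Winograd transform matrices.
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=np.float32)
G  = np.array([[1.0,  0.0, 0.0],
               [0.5,  0.5, 0.5],
               [0.5, -0.5, 0.5],
               [0.0,  0.0, 1.0]], dtype=np.float32)
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=np.float32)

def winograd_tile(d, g):
    """One 4x4 input tile `d` convolved with a 3x3 kernel `g`,
    yielding a 2x2 output tile, with ReLU moved into the Winograd
    domain so the transformed activations themselves are sparse."""
    V = BT @ d @ BT.T            # transform activations
    V = np.maximum(V, 0.0)       # ReLU applied in the Winograd domain
    U = G @ g @ G.T              # transform (possibly pruned) weights
    return AT @ (U * V) @ AT.T   # element-wise product, inverse transform

d = np.random.randn(4, 4).astype(np.float32)
g = np.random.randn(3, 3).astype(np.float32)
print(winograd_tile(d, g))
```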

Accelerating Sparse Deep Neural Networks

2 code implementations 16 Apr 2021 Asit Mishra, Jorge Albericio Latorre, Jeff Pool, Darko Stosic, Dusan Stosic, Ganesh Venkatesh, Chong Yu, Paulius Micikevicius

We present the design and behavior of Sparse Tensor Cores, which exploit a 2:4 (50%) sparsity pattern that leads to twice the math throughput of dense matrix units.

Math
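
A NumPy illustration of the 2:4 idea, keeping the two largest-magnitude weights per group of four and packing them as values plus in-group indices, roughly the values-plus-metadata layout that Sparse Tensor Cores consume (the actual hardware data path and index encoding are not reproduced here):

```python
import numpy as np

def prune_2_to_4(w):
    """Keep the two largest-magnitude weights in every group of four
    consecutive columns, returning kept values and their in-group
    column indices (the 'metadata')."""
    groups = w.reshape(-1, 4)                                # (n_groups, 4)
    idx = np.sort(np.argsort(np.abs(groups), axis=1)[:, 2:], axis=1)
    vals = np.take_along_axis(groups, idx, axis=1)           # (n_groups, 2)
    return vals, idx

def sparse_matvec(vals, idx, shape, x):
    """y = W_sparse @ x using only the stored nonzeros; the metadata
    selects which two entries of x each group needs."""
    rows, cols = shape
    gpr = cols // 4                                          # groups per row
    tiled = np.tile(x.reshape(gpr, 4), (rows, 1))            # align x with weight groups
    gathered = np.take_along_axis(tiled, idx, axis=1)
    return (vals * gathered).sum(axis=1).reshape(rows, gpr).sum(axis=1)

w = np.random.randn(8, 16).astype(np.float32)
x = np.random.randn(16).astype(np.float32)
vals, idx = prune_2_to_4(w)

# Check against a dense reference rebuilt from the same nonzeros.
dense = np.zeros((8 * 4, 4), dtype=np.float32)
np.put_along_axis(dense, idx, vals, axis=1)
print(np.allclose(sparse_matvec(vals, idx, (8, 16), x), dense.reshape(8, 16) @ x))
```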

DSD: Dense-Sparse-Dense Training for Deep Neural Networks

2 code implementations 15 Jul 2016 Song Han, Jeff Pool, Sharan Narang, Huizi Mao, Enhao Gong, Shijian Tang, Erich Elsen, Peter Vajda, Manohar Paluri, John Tran, Bryan Catanzaro, William J. Dally

We propose DSD, a dense-sparse-dense training flow, for regularizing deep neural networks and achieving better optimization performance.

Caption Generation +3
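
A minimal PyTorch-style skeleton of the dense-sparse-dense flow, with an illustrative 50% sparsity and a toy stand-in loader; the paper's per-network pruning ratios, epoch counts, and learning-rate schedules differ:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 10))
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
data = [(torch.randn(32, 128), torch.randint(0, 10, (32,)))]   # stand-in loader

def train(epochs, masks=None):
    for _ in range(epochs):
        for x, y in data:
            loss = nn.functional.cross_entropy(model(x), y)
            opt.zero_grad(); loss.backward(); opt.step()
            if masks:                              # sparse phase: keep pruned weights at zero
                for name, p in model.named_parameters():
                    if name in masks:
                        p.data.mul_(masks[name])

train(epochs=2)                                    # D: dense training

masks = {}                                         # S: prune ~50% by magnitude, train under the mask
for name, p in model.named_parameters():
    if p.dim() > 1:
        masks[name] = (p.abs() > p.abs().median()).float()
        p.data.mul_(masks[name])
train(epochs=2, masks=masks)

train(epochs=2)                                    # D: drop the mask, retrain dense (typically lower LR)
```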

Self-Supervised GAN Compression

1 code implementation 3 Jul 2020 Chong Yu, Jeff Pool

Deep learning's success has led to larger and larger models to handle more and more complex tasks; trained models can contain millions of parameters.

Image Classification Model Compression

Structurally Sparsified Backward Propagation for Faster Long Short-Term Memory Training

no code implementations 1 Jun 2018 Maohua Zhu, Jason Clemons, Jeff Pool, Minsoo Rhu, Stephen W. Keckler, Yuan Xie

Further, we can enforce structured sparsity in the gate gradients to make the LSTM backward pass up to 45% faster than the state-of-the-art dense approach and 168% faster than the state-of-the-art sparsifying method on modern GPUs.
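
A NumPy sketch of sparsifying the gate gradients before the backward GEMMs; row-wise top-k is used here as one simple structured pattern, not necessarily the exact structure the paper enforces for its GPU kernels:

```python
import numpy as np

def sparsify_gate_grads(dgates, keep_ratio=0.2):
    """Zero all but the largest-magnitude entries in each row of the
    LSTM gate-gradient matrix, so the backward weight-gradient GEMM
    (dW = x.T @ dgates) touches far fewer nonzeros."""
    k = max(1, int(keep_ratio * dgates.shape[1]))
    thresh = -np.sort(-np.abs(dgates), axis=1)[:, k - 1:k]   # k-th largest per row
    return np.where(np.abs(dgates) >= thresh, dgates, 0.0)

# dgates: (batch, 4*hidden) gradients of the i, f, g, o pre-activations
dgates = np.random.randn(32, 4 * 256).astype(np.float32)
sparse = sparsify_gate_grads(dgates)
print("kept fraction:", float((sparse != 0).mean()))          # roughly keep_ratio
```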

Sparse Persistent RNNs: Squeezing Large Recurrent Networks On-Chip

no code implementations ICLR 2018 Feiwen Zhu, Jeff Pool, Michael Andersch, Jeremy Appleyard, Fung Xie

Recurrent Neural Networks (RNNs) are powerful tools for solving sequence-based problems, but their efficacy and execution time are dependent on the size of the network.

NMT Speech Recognition +1

Exploring the Regularity of Sparse Structure in Convolutional Neural Networks

no code implementations 24 May 2017 Huizi Mao, Song Han, Jeff Pool, Wenshuo Li, Xingyu Liu, Yu Wang, William J. Dally

Since memory reference is more than two orders of magnitude more expensive than arithmetic operations, the regularity of sparse structure leads to more efficient hardware design.

Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks

no code implementations 3 May 2017 Minsoo Rhu, Mike O'Connor, Niladrish Chatterjee, Jeff Pool, Stephen W. Keckler

Popular deep learning frameworks require users to fine-tune their memory usage so that the training data of a deep neural network (DNN) fits within the GPU physical memory.

Learning both Weights and Connections for Efficient Neural Network

no code implementations NeurIPS 2015 Song Han, Jeff Pool, John Tran, William Dally

On the ImageNet dataset, our method reduced the number of parameters of AlexNet by a factor of 9×, from 61 million to 6.7 million, without incurring accuracy loss.

Efficient Neural Network

Buddy Compression: Enabling Larger Memory for Deep Learning and HPC Workloads on GPUs

no code implementations 6 Mar 2019 Esha Choukse, Michael Sullivan, Mike O'Connor, Mattan Erez, Jeff Pool, David Nellans, Steve Keckler

However, GPU device memory tends to be relatively small, and its capacity cannot be increased by the user.

Hardware Architecture

Self-Supervised Generative Adversarial Compression

no code implementations NeurIPS 2020 Chong Yu, Jeff Pool

Deep learning’s success has led to larger and larger models to handle more and more complex tasks; trained models often contain millions of parameters.

Image Classification Knowledge Distillation +1
