GPU Kernels for Block-Sparse Weights

We’re releasing highly optimized GPU kernels for an underexplored class of neural network architectures: networks with block-sparse weights. The kernels allow for efficient evaluation and differentiation of linear layers, including convolutional layers, with flexibly configurable block-sparsity patterns in the weight matrix.
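
To make the idea of a block-sparse linear layer concrete, here is a minimal NumPy sketch of the forward pass. It is an illustration only and not the released GPU kernels or their API: the function names `blocksparse_matmul` and `to_dense`, the `layout` mask convention, and the storage order of `blocks` are all assumptions chosen for this example.

```python
import numpy as np


def blocksparse_matmul(x, blocks, layout, block_size):
    """Compute y = x @ W, where W is stored only as its nonzero blocks.

    Hypothetical conventions (assumed for this sketch):
      x:      (batch, n_in) dense activations
      layout: (n_in // block_size, n_out // block_size) 0/1 mask of nonzero blocks
      blocks: (layout.sum(), block_size, block_size) nonzero blocks,
              stored in row-major order of the mask
    """
    n_col_blocks = layout.shape[1]
    y = np.zeros((x.shape[0], n_col_blocks * block_size), dtype=x.dtype)
    # np.argwhere yields (row, col) indices in row-major order,
    # which matches the assumed storage order of `blocks`.
    for k, (i, j) in enumerate(np.argwhere(layout)):
        rows = slice(i * block_size, (i + 1) * block_size)
        cols = slice(j * block_size, (j + 1) * block_size)
        # Only nonzero blocks contribute; zero blocks are skipped entirely.
        y[:, cols] += x[:, rows] @ blocks[k]
    return y


def to_dense(blocks, layout, block_size):
    """Expand the block-sparse weights into a dense matrix (for checking only)."""
    n_row_blocks, n_col_blocks = layout.shape
    w = np.zeros((n_row_blocks * block_size, n_col_blocks * block_size),
                 dtype=blocks.dtype)
    for k, (i, j) in enumerate(np.argwhere(layout)):
        w[i * block_size:(i + 1) * block_size,
          j * block_size:(j + 1) * block_size] = blocks[k]
    return w


# Example: an 8 x 12 weight matrix with 4 x 4 blocks, 3 of 6 blocks nonzero.
rng = np.random.default_rng(0)
layout = np.array([[1, 0, 1],
                   [0, 1, 0]])
blocks = rng.standard_normal((int(layout.sum()), 4, 4))
x = rng.standard_normal((5, 8))
y = blocksparse_matmul(x, blocks, layout, 4)
```

The work done scales with the number of nonzero blocks rather than with the full size of the weight matrix, which is the property a GPU kernel for this layout would exploit. The dense expansion in `to_dense` exists only to check the result against an ordinary matrix product.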


Results from the Paper


Task                Dataset                      Model              Metric    Value  Global rank
Sentiment Analysis  CR                           Block-sparse LSTM  Accuracy  92.2   #1
Sentiment Analysis  IMDb                         Block-sparse LSTM  Accuracy  94.99  #10
Sentiment Analysis  SST-2 Binary classification  Block-sparse LSTM  Accuracy  93.2   #23
Sentiment Analysis  Yelp Binary classification   Block-sparse LSTM  Error     3.27   #10
