GPU Kernels for Block-Sparse Weights

OpenAI 2017 • Scott Gray • Alec Radford • Diederik P. Kingma

We're releasing highly optimized GPU kernels for an underexplored class of neural network architectures: networks with block-sparse weights. The kernels allow for efficient evaluation and differentiation of linear layers, including convolutional layers, with flexibly configurable block-sparsity patterns in the weight matrix...
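To make the idea of a block-sparse weight matrix concrete, here is a minimal NumPy sketch of the forward pass of a linear layer whose weights are stored only for the nonzero blocks of a binary layout mask. It is an illustration under stated assumptions, not the released GPU kernels or their API: the names `block_sparse_matmul`, `to_dense`, `layout`, and `blocks` are hypothetical, and the loop runs on the CPU.

```python
import numpy as np

def block_sparse_matmul(x, blocks, layout, block_size):
    """Compute y = x @ W for a block-sparse W (illustrative CPU sketch).

    x:      (batch, in_blocks * block_size) dense input
    layout: (in_blocks, out_blocks) 0/1 mask; 1 marks a stored block
    blocks: (nnz, block_size, block_size) stored blocks, in the
            row-major order of the nonzero entries of `layout`
    """
    out_features = layout.shape[1] * block_size
    y = np.zeros((x.shape[0], out_features), dtype=x.dtype)
    # Only the stored blocks contribute; zero blocks are skipped entirely.
    for w, (i, j) in zip(blocks, np.argwhere(layout)):
        rows = slice(i * block_size, (i + 1) * block_size)
        cols = slice(j * block_size, (j + 1) * block_size)
        y[:, cols] += x[:, rows] @ w
    return y

def to_dense(blocks, layout, block_size):
    """Expand the stored blocks into the equivalent dense weight matrix."""
    dense = np.zeros((layout.shape[0] * block_size,
                      layout.shape[1] * block_size))
    for w, (i, j) in zip(blocks, np.argwhere(layout)):
        dense[i * block_size:(i + 1) * block_size,
              j * block_size:(j + 1) * block_size] = w
    return dense

# Hypothetical example: a 2 x 3 grid of 4 x 4 blocks, 3 of them nonzero.
rng = np.random.default_rng(0)
layout = np.array([[1, 0, 1],
                   [0, 1, 0]])
block_size = 4
blocks = rng.standard_normal((int(layout.sum()), block_size, block_size))
x = rng.standard_normal((5, layout.shape[0] * block_size))

y = block_sparse_matmul(x, blocks, layout, block_size)
dense_w = to_dense(blocks, layout, block_size)
```

The result matches a dense matrix product against `dense_w`, but the work and storage scale with the number of nonzero blocks rather than with the full matrix size.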


Evaluation results from the paper


Task                Dataset                      Model              Metric name  Metric value  Global rank
Sentiment Analysis  CR                           Block-sparse LSTM  Accuracy     92.2          #1
Sentiment Analysis  IMDb                         Block-sparse LSTM  Accuracy     94.99         #7
Sentiment Analysis  SST-2 Binary classification  Block-sparse LSTM  Accuracy     93.2          #8
Sentiment Analysis  Yelp Binary classification   Block-sparse LSTM  Error        3.27          #7