Efficient Winning Tickets Drawing over Fine-Grained Structured Sparsity

29 Sep 2021 · Sai Qian Zhang, Bradley McDanel

Fine-grained structured sparsity has been proposed as a middle ground between unstructured sparsity, where weights are pruned independently, and coarse-grained structured sparsity, where entire blocks of weights are pruned. Specifically, N:M fine-grained structured sparsity allows at most N nonzero weights in each group of M consecutive weights. A recent implementation of 2:4 sparsity (N=2 and M=4) in the Sparse Tensor Cores of Nvidia A100 GPUs shows a significant improvement in throughput over unstructured sparsity while maintaining similar performance (e.g., accuracy). However, despite this potential for superior computational performance, how to efficiently train DNNs with N:M fine-grained structured sparsity remains a challenging problem. In this work, we leverage the recent advances around the Lottery Ticket Hypothesis (LTH) and propose an iterative pruning algorithm for N:M fine-grained structured sparsity. By exploiting the N:M sparsity constraint, we can identify the unimportant weights within each group of M weights at earlier stages of iterative pruning, which significantly lowers the cost of iterative training compared to conventional unstructured pruning.
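
For intuition, here is a minimal PyTorch sketch (not the authors' implementation) of projecting a dense weight tensor onto an N:M pattern: within every group of M consecutive weights, only the N largest-magnitude entries are kept. The function name nm_sparsity_mask is hypothetical, and the sketch assumes the number of weights is divisible by M.

```python
import torch

def nm_sparsity_mask(weight: torch.Tensor, n: int = 2, m: int = 4) -> torch.Tensor:
    """Binary mask keeping the n largest-magnitude weights in each group of m
    consecutive weights (illustrative sketch; assumes weight.numel() % m == 0)."""
    flat = weight.reshape(-1, m)                  # group consecutive weights
    idx = flat.abs().topk(n, dim=1).indices       # indices of the n largest magnitudes per group
    mask = torch.zeros_like(flat)
    mask.scatter_(1, idx, 1.0)                    # keep only the top-n entries in each group
    return mask.reshape(weight.shape)

# Example: project a dense weight matrix onto 2:4 sparsity (as used by A100 Sparse Tensor Cores)
w = torch.randn(8, 16)
w_sparse = w * nm_sparsity_mask(w, n=2, m=4)
```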
