Spartan: Differentiable Sparsity via Regularized Transportation

27 May 2022 · Kai Sheng Tai, Taipeng Tian, Ser-Nam Lim

We present Spartan, a method for training sparse neural network models with a predetermined level of sparsity. Spartan is based on a combination of two techniques: (1) soft top-k masking of low-magnitude parameters via a regularized optimal transportation problem and (2) dual averaging-based parameter updates with hard sparsification in the forward pass. This scheme realizes an exploration-exploitation tradeoff: early in training, the learner is able to explore various sparsity patterns, and as the soft top-k approximation is gradually sharpened over the course of training, the balance shifts towards parameter optimization with respect to a fixed sparsity mask. Spartan is sufficiently flexible to accommodate a variety of sparsity allocation policies, including both unstructured and block structured sparsity, as well as general cost-sensitive sparsity allocation mediated by linear models of per-parameter costs. On ImageNet-1K classification, Spartan yields 95% sparse ResNet-50 models and 90% block sparse ViT-B/16 models while incurring absolute top-1 accuracy losses of less than 1% compared to fully dense training.
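
The two components above can be made concrete with a short sketch. The snippet below is a minimal illustration under our own assumptions, not the paper's reference implementation: it realizes soft top-k masking through a standard entropy-regularized relaxation (whose solution takes a sigmoid form with a scalar dual variable found by bisection, a simple special case of the regularized-transportation view), plus a forward pass that applies the hard top-k mask while letting gradients flow through the soft mask. The function names (`soft_topk_mask`, `sparse_forward`) and hyperparameter values are illustrative only.

```python
import torch

def soft_topk_mask(scores: torch.Tensor, k: int, temp: float = 1e-3,
                   iters: int = 50) -> torch.Tensor:
    """Soft top-k mask m in [0, 1]^n with sum(m) ~= k.

    Solution of an entropy-regularized relaxation of top-k selection:
    m_i = sigmoid((scores_i - nu) / temp), where the scalar dual variable
    nu is found by bisection (and treated as a constant for backprop).
    As temp -> 0, the mask sharpens toward the hard top-k indicator.
    """
    with torch.no_grad():
        lo, hi = scores.min() - 1.0, scores.max() + 1.0
        for _ in range(iters):
            nu = 0.5 * (lo + hi)
            if torch.sigmoid((scores - nu) / temp).sum() > k:
                lo = nu  # mask keeps too much mass: raise the threshold
            else:
                hi = nu
    return torch.sigmoid((scores - nu) / temp)

def sparse_forward(weights: torch.Tensor, k: int, temp: float) -> torch.Tensor:
    """Hard top-k sparsity in the forward pass, soft mask in the backward."""
    scores = weights.abs().flatten()
    soft = soft_topk_mask(scores, k, temp)
    hard = torch.zeros_like(soft)
    hard[scores.topk(k).indices] = 1.0    # exactly k-sparse mask
    mask = (hard - soft).detach() + soft  # forward: hard; gradient: soft
    return weights * mask.view_as(weights)

w = torch.randn(20, 50, requires_grad=True)
out = sparse_forward(w, k=50, temp=1e-3)  # keep 5% of 1000 weights
print((out != 0).float().mean())          # ~0.05, i.e. 95% sparse
```

Annealing `temp` toward zero over training mirrors the exploration-exploitation schedule described in the abstract: a large temperature lets gradient signal reach parameters outside the current mask, while a sharp one commits optimization to a fixed sparsity pattern.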


Datasets


ImageNet
| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|------|---------|-------|-------------|--------------|-------------|
| Network Pruning | ImageNet - ResNet 50 - 90% sparsity | Spartan | Top-1 Accuracy | 76.17 | #2 |

Methods


Soft top-k masking via regularized optimal transportation
Dual averaging with hard sparsification in the forward pass