40 papers with code • 3 benchmarks • 3 datasets
A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems
A commonly used approach is the Multi-Stage (MS) convex relaxation (or DC programming), which relaxes the original non-convex problem to a sequence of convex problems.
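For context, the shrinkage-and-thresholding template named in the title alternates a gradient step on the smooth loss with a proximal (thresholding) step on the regularizer. The sketch below shows the convex $\ell_1$ special case, where the proximal step is soft-thresholding; GIST generalizes this loop to non-convex regularizers whose proximal maps admit closed forms. All function names and parameters here are illustrative, not taken from the paper.

```python
import numpy as np

def soft_threshold(z, tau):
    # Proximal operator of tau * ||.||_1 (soft-thresholding).
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def ista(X, y, lam, step=None, n_iter=200):
    """Minimal shrinkage-and-thresholding loop for
    min_w 0.5*||Xw - y||^2 + lam*||w||_1 (illustrative convex case)."""
    p = X.shape[1]
    if step is None:
        # Inverse Lipschitz constant of the gradient of the quadratic term.
        step = 1.0 / np.linalg.norm(X, 2) ** 2
    w = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)                      # gradient of the smooth part
        w = soft_threshold(w - step * grad, step * lam)  # shrinkage/thresholding step
    return w

# Usage: recover a sparse vector from noisy linear measurements.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 300))
w_true = np.zeros(300); w_true[:5] = 1.0
y = X @ w_true + 0.01 * rng.standard_normal(100)
w_hat = ista(X, y, lam=0.1)
print(np.count_nonzero(np.abs(w_hat) > 1e-3))
```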
By starting from a random sparse network and continuously exploring sparse connectivities during training, we can perform an Over-Parameterization in the space-time manifold, closing the gap in the expressibility between sparse training and dense training.
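To make "continuously exploring sparse connectivities" concrete, here is a hedged sketch of a prune-and-regrow mask update in the general style of dynamic sparse training. The function name, drop fraction, and random regrowth rule are assumptions chosen for illustration; they are not the paper's exact procedure.

```python
import torch

def prune_and_regrow(weight, mask, drop_fraction=0.3):
    """One connectivity-exploration step: drop the smallest-magnitude active
    weights, then regrow the same number of connections at random inactive
    positions, keeping the total sparsity fixed (illustrative sketch)."""
    flat_mask = mask.view(-1)
    flat_mag = weight.detach().abs().view(-1)
    n_drop = int(drop_fraction * int(flat_mask.sum().item()))
    if n_drop == 0:
        return mask

    # Drop: deactivate the n_drop weakest active connections.
    active_mag = flat_mag.masked_fill(flat_mask == 0, float("inf"))
    drop_idx = torch.topk(active_mag, n_drop, largest=False).indices
    flat_mask[drop_idx] = 0.0

    # Grow: activate n_drop inactive positions chosen uniformly at random.
    inactive_idx = (flat_mask == 0).nonzero(as_tuple=True)[0]
    grow_idx = inactive_idx[torch.randperm(inactive_idx.numel())[:n_drop]]
    flat_mask[grow_idx] = 1.0

    weight.data.mul_(mask)  # keep pruned weights at exactly zero
    return mask

# Usage: the number of active connections stays constant across updates.
w = torch.randn(64, 128)
m = (torch.rand_like(w) < 0.1).float()   # ~10%-dense initial mask
w.data.mul_(m)
m = prune_and_regrow(w, m)
print(int(m.sum().item()))
```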
We demonstrate the possibility of what we call sparse learning: accelerated training of deep neural networks that maintain sparse weights throughout training while achieving dense performance levels.
In this work, we present a new exact MIP framework for $\ell_0\ell_2$-regularized regression that can scale to $p \sim 10^7$, achieving speedups of at least $5000$x over state-of-the-art exact methods.
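For reference, $\ell_0\ell_2$-regularized regression is commonly posed as the following objective (the notation below is assumed for illustration, not copied from the paper):

$$
\min_{\beta \in \mathbb{R}^p} \ \tfrac{1}{2}\lVert y - X\beta\rVert_2^2 \;+\; \lambda_0 \lVert \beta \rVert_0 \;+\; \lambda_2 \lVert \beta \rVert_2^2,
$$

where $\lVert \beta \rVert_0$ counts the nonzero coefficients, $\lambda_0$ controls the sparsity level, and the ridge term $\lambda_2 \lVert \beta \rVert_2^2$ regularizes the selected coefficients.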
Work on the lottery ticket hypothesis (LTH) and single-shot network pruning (SNIP) has recently drawn considerable attention to post-training pruning (iterative magnitude pruning) and before-training pruning (pruning at initialization); a minimal magnitude-pruning sketch follows.
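As a concrete point of reference for these pruning families, here is a minimal sketch of one-shot global magnitude pruning, the masking step that iterative magnitude pruning repeats with retraining in between. The function name, sparsity level, and toy model are illustrative assumptions, not taken from any of the papers above.

```python
import torch

def global_magnitude_prune(model, sparsity=0.9):
    """Zero out the smallest-magnitude weights across all weight matrices so
    that `sparsity` fraction of those parameters is removed (sketch only)."""
    weights = [p for p in model.parameters() if p.dim() > 1]
    all_mags = torch.cat([w.detach().abs().flatten() for w in weights])
    k = int(sparsity * all_mags.numel())
    threshold = torch.kthvalue(all_mags, k).values if k > 0 else torch.tensor(0.0)

    masks = []
    for w in weights:
        mask = (w.detach().abs() > threshold).float()
        w.data.mul_(mask)   # remove weights below the global threshold
        masks.append(mask)
    return masks

# Usage on a small toy MLP.
model = torch.nn.Sequential(torch.nn.Linear(784, 256), torch.nn.ReLU(),
                            torch.nn.Linear(256, 10))
masks = global_magnitude_prune(model, sparsity=0.9)
total = sum(m.numel() for m in masks)
kept = sum(int(m.sum().item()) for m in masks)
print(f"kept {kept}/{total} weights")
```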