# Sparse Learning

33 papers with code • 3 benchmarks • 3 datasets

## Libraries

Use these libraries to find Sparse Learning models and implementations.

## Most implemented papers

# Variational Dropout Sparsifies Deep Neural Networks

We explore a recently proposed Variational Dropout technique that provided an elegant Bayesian interpretation to Gaussian Dropout.
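As a point of reference for the snippet above, Gaussian Dropout multiplies each activation by noise drawn from a normal distribution with mean 1 and variance `alpha`. The following is a minimal NumPy sketch of that noise model only. It is not the variational training procedure from the paper, and the function name `gaussian_dropout` is a hypothetical choice made for illustration.

```python
import numpy as np

def gaussian_dropout(x, alpha, rng):
    """Multiply activations by noise drawn from N(1, alpha).

    Illustrative helper (hypothetical name): `alpha` is the noise variance,
    so alpha = 0 leaves the input unchanged.
    """
    noise = rng.normal(loc=1.0, scale=np.sqrt(alpha), size=x.shape)
    return x * noise
```

Because the noise has mean 1, the expected output equals the input, so no rescaling is needed at inference time.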

# The State of Sparsity in Deep Neural Networks

We rigorously evaluate three state-of-the-art techniques for inducing sparsity in deep neural networks on two large-scale learning tasks: Transformer trained on WMT 2014 English-to-German, and ResNet-50 trained on ImageNet.

# A General Iterative Shrinkage and Thresholding Algorithm for Non-convex Regularized Optimization Problems

A commonly used approach to non-convex regularized problems is the Multi-Stage (MS) convex relaxation (or DC programming), which relaxes the original non-convex problem to a sequence of convex problems.
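The title refers to iterative shrinkage and thresholding. As background, here is a minimal sketch of the classical convex variant (ISTA for the ℓ1-penalized least-squares problem) with a fixed step size. This is a deliberately simpler stand-in, not the paper's general algorithm for non-convex penalties. The function names and the fixed iteration count are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(X, y, lam, n_iter=500):
    """Minimize 0.5 * ||y - X w||^2 + lam * ||w||_1 by proximal gradient steps."""
    # Step size 1/L, where L is the largest eigenvalue of X^T X
    # (the Lipschitz constant of the gradient of the smooth term).
    L = np.linalg.norm(X, 2) ** 2
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)                   # gradient of the least-squares term
        w = soft_threshold(w - grad / L, lam / L)  # gradient step, then shrinkage
    return w
```

With an identity design matrix the solution is simply the soft-thresholded response, which makes a convenient sanity check. A non-convex penalty would replace `soft_threshold` with the proximal operator of that penalty.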

# Rigging the Lottery: Making All Tickets Winners

There is a large body of work on training dense networks to yield sparse networks for inference, but this limits the size of the largest trainable sparse model to that of the largest trainable dense model.

# Sparse Networks from Scratch: Faster Training without Losing Performance

We demonstrate the possibility of what we call sparse learning: accelerated training of deep neural networks that maintain sparse weights throughout training while achieving dense performance levels.

# Sparse Regression at Scale: Branch-and-Bound rooted in First-Order Optimization

In this work, we present a new exact MIP framework for $\ell_0\ell_2$-regularized regression that can scale to $p \sim 10^7$, achieving speedups of at least $5000\times$ compared to state-of-the-art exact methods.
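For orientation, $\ell_0\ell_2$-regularized regression is usually written in the following form. The symbols $X$, $y$, $\beta$, $\lambda_0$ and $\lambda_2$ are not given in the snippet and are assumed here, as is the factor $\tfrac{1}{2}$ on the loss.

$$
\min_{\beta \in \mathbb{R}^p} \; \frac{1}{2}\lVert y - X\beta \rVert_2^2 + \lambda_0 \lVert \beta \rVert_0 + \lambda_2 \lVert \beta \rVert_2^2
$$

Here $\lVert \beta \rVert_0$ counts the nonzero entries of $\beta$, which is what makes the problem combinatorial and motivates a mixed-integer programming (MIP) formulation.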

# Do We Actually Need Dense Over-Parameterization? In-Time Over-Parameterization in Sparse Training

By starting from a random sparse network and continuously exploring sparse connectivities during training, we can perform an Over-Parameterization in the space-time manifold, closing the gap in the expressibility between sparse training and dense training.
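One way to picture "continuously exploring sparse connectivities" is a periodic prune-and-regrow update on a binary mask. The sketch below drops the smallest-magnitude active weights and reactivates the same number of inactive positions at random. Random regrowth is an assumption chosen for simplicity (other criteria, such as gradient magnitude, are also used in the sparse-training literature), and the function name `prune_and_regrow` is hypothetical.

```python
import numpy as np

def prune_and_regrow(weights, mask, drop_fraction, rng):
    """One connectivity update that keeps the number of active weights constant.

    `mask` is a boolean array with the same shape as `weights`.
    Returns a new mask; the inputs are not modified.
    """
    new_mask = mask.copy()
    active = np.flatnonzero(mask)      # flat indices of active connections
    inactive = np.flatnonzero(~mask)   # flat indices available for regrowth
    n_update = min(int(drop_fraction * active.size), inactive.size)
    if n_update == 0:
        return new_mask
    # Drop the active connections with the smallest magnitude.
    magnitudes = np.abs(weights.ravel()[active])
    weakest = active[np.argsort(magnitudes)[:n_update]]
    # Regrow the same number of connections at random previously inactive positions.
    regrown = rng.choice(inactive, size=n_update, replace=False)
    new_mask.flat[weakest] = False
    new_mask.flat[regrown] = True
    return new_mask
```

In a training loop, such an update would typically run every few hundred steps, with the weights at newly activated positions initialized to zero.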

# Feature Selection: A Data Perspective

To facilitate and promote the research in this community, we also present an open-source feature selection repository that consists of most of the popular feature selection algorithms (http://featureselection.asu.edu/).

# SparseStep: Approximating the Counting Norm for Sparse Regularization

The SparseStep algorithm is presented for the estimation of a sparse parameter vector in the linear regression problem.

# From safe screening rules to working sets for faster Lasso-type solvers

For the Lasso estimator, a working set (WS) is a set of features, while for a Group Lasso it refers to a set of groups.
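To make the idea of a Lasso working set concrete, here is a minimal sketch that selects features by their absolute correlation with the current residual and always keeps features that are already nonzero. This ranking is a simple heuristic assumed for illustration, not the selection rule proposed in the paper, and `build_working_set` is a hypothetical name.

```python
import numpy as np

def build_working_set(X, y, w, size):
    """Return sorted indices of `size` features to hand to an inner Lasso solver."""
    residual = y - X @ w
    # Features whose correlation with the residual is large are the most
    # likely to violate the Lasso optimality condition |x_j^T r| <= lam.
    scores = np.abs(X.T @ residual)
    # Features already in the support always stay in the working set.
    scores[w != 0] = np.inf
    return np.sort(np.argsort(scores)[-size:])
```

A solver would then optimize only over the returned columns and rebuild the set, typically with a larger `size`, until no excluded feature violates the optimality condition.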