Introduced by Wan et al. in Regularization of Neural Networks using DropConnect

DropConnect generalizes Dropout by randomly dropping weights rather than activations, each with probability $1-p$. Like Dropout, it introduces dynamic sparsity within the model, but the sparsity is on the weights $W$ rather than on the output vectors of a layer. In other words, a fully connected layer with DropConnect becomes a sparsely connected layer whose connections are chosen at random during the training stage. Note that this is not equivalent to setting $W$ to be a fixed sparse matrix during training, since a new mask is sampled for every example.

For a DropConnect layer, the output is given as:

$$ r = a\left(\left(M * W\right)v\right) $$

Here $r$ is the output of the layer, $v$ is the input to the layer, $W$ are the weight parameters, and $M$ is a binary matrix encoding the connection information, with $M_{ij} \sim \text{Bernoulli}\left(p\right)$. Each element of the mask $M$ is drawn independently for each example during training, essentially instantiating a different connectivity for each example seen. The biases are also masked out during training.
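
As a concrete illustration, here is a minimal NumPy sketch of this forward pass for a single training example. It is not from the paper: the function name, signature, and the choice of $\tanh$ as the activation $a$ are assumptions made for the example.

```python
import numpy as np

def dropconnect_forward(v, W, b, p, rng, activation=np.tanh):
    """Sketch of a DropConnect forward pass for one training example.

    v : input vector, shape (n_in,)
    W : weight matrix, shape (n_out, n_in)
    b : bias vector, shape (n_out,)
    p : probability of *keeping* each weight (dropped with prob. 1 - p)
    """
    # Fresh Bernoulli(p) mask per example: M_ij ~ Bernoulli(p), so each
    # example sees a different randomly chosen connectivity.
    M = rng.binomial(1, p, size=W.shape)
    m_b = rng.binomial(1, p, size=b.shape)  # biases are masked as well
    # r = a((M * W) v), with * the element-wise (Hadamard) product
    return activation((M * W) @ v + m_b * b)

# Example usage (shapes are arbitrary):
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))
b = rng.standard_normal(4)
v = rng.standard_normal(8)
r = dropconnect_forward(v, W, b, p=0.5, rng=rng)
```

Because a fresh mask is sampled on every call, repeated forward passes use different sparse versions of $W$, which is what distinguishes DropConnect from training with a fixed sparse weight matrix.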

Source: Regularization of Neural Networks using DropConnect


| Task | Papers | Share |
| --- | ---: | ---: |
| Language Modelling | 20 | 13.16% |
| General Classification | 15 | 9.87% |
| Text Classification | 14 | 9.21% |
| Classification | 10 | 6.58% |
| Sentiment Analysis | 8 | 5.26% |
| Image Classification | 7 | 4.61% |
| Test | 7 | 4.61% |
| Translation | 6 | 3.95% |
| Language Identification | 4 | 2.63% |

