Introduced by Wan et al. in Regularization of Neural Networks using DropConnect

DropConnect generalizes Dropout by randomly dropping weights rather than activations, each with probability $1-p$. Like Dropout, it introduces dynamic sparsity within the model, but the sparsity is on the weights $W$ rather than on the output vectors of a layer. In other words, a fully connected layer with DropConnect becomes a sparsely connected layer whose connections are chosen at random during the training stage. Note that this is not equivalent to setting $W$ to be a fixed sparse matrix during training, since a new mask is sampled for every example.

For a DropConnect layer, the output is given as:

$$ r = a\left(\left(M * W\right)v\right) $$

Here $r$ is the output of the layer, $v$ is the input to the layer, $W$ are the weight parameters, and $M$ is a binary matrix encoding the connection information, with $M_{ij} \sim \text{Bernoulli}\left(p\right)$. Each element of the mask $M$ is drawn independently for each example during training, essentially instantiating a different connectivity for each example seen. The biases are also masked out during training.
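
As a concrete illustration, here is a minimal NumPy sketch of this forward pass for a single training example. It is not from the paper: the function name, signature, and the choice of $\tanh$ as the activation $a$ are assumptions made for the example.

```python
import numpy as np

def dropconnect_forward(v, W, b, p, rng, activation=np.tanh):
    """Sketch of a DropConnect forward pass for one training example.

    v : input vector, shape (n_in,)
    W : weight matrix, shape (n_out, n_in)
    b : bias vector, shape (n_out,)
    p : probability of *keeping* each weight (dropped with prob. 1 - p)
    """
    # Fresh Bernoulli(p) mask per example: M_ij ~ Bernoulli(p), so each
    # example sees a different randomly chosen connectivity.
    M = rng.binomial(1, p, size=W.shape)
    m_b = rng.binomial(1, p, size=b.shape)  # biases are masked as well
    # r = a((M * W) v), with * the element-wise (Hadamard) product
    return activation((M * W) @ v + m_b * b)

# Example usage (shapes are arbitrary):
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8))
b = rng.standard_normal(4)
v = rng.standard_normal(8)
r = dropconnect_forward(v, W, b, p=0.5, rng=rng)
```

Because a fresh mask is sampled on every call, repeated forward passes use different sparse versions of $W$, which is what distinguishes DropConnect from training with a fixed sparse weight matrix.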

Source: Regularization of Neural Networks using DropConnect


| Task | Papers | Share |
| --- | ---: | ---: |
| Language Modelling | 20 | 13.16% |
| General Classification | 15 | 9.87% |
| Text Classification | 14 | 9.21% |
| Classification | 10 | 6.58% |
| Sentiment Analysis | 8 | 5.26% |
| Image Classification | 7 | 4.61% |
| Test | 7 | 4.61% |
| Translation | 6 | 3.95% |
| Language Identification | 4 | 2.63% |

