L2 Regularization

28 papers with code • 0 benchmarks • 0 datasets

See Weight Decay.

$L_{2}$ Regularization, or Weight Decay, is a regularization technique applied to the weights of a neural network. We minimize a loss function comprising both the primary loss function and a penalty on the $L_{2}$ norm of the weights:

$$L_{new}\left(w\right) = L_{original}\left(w\right) + \lambda{w^{T}w}$$

where $\lambda$ is a hyperparameter controlling the strength of the penalty (larger values encourage smaller weights).
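
As a minimal sketch of the objective-based implementation, the snippet below adds the $\lambda w^{T}w$ penalty to a primary loss in PyTorch. The model, data, and the value of `lam` are illustrative placeholders, not code from any of the papers listed here.

```python
import torch
import torch.nn as nn

# Illustrative model and data (placeholders, not from any specific paper).
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
inputs, targets = torch.randn(64, 10), torch.randn(64, 1)
lam = 1e-4  # penalty strength (lambda), an assumed value

primary_loss = nn.functional.mse_loss(model(inputs), targets)

# L2 penalty: lambda * w^T w, summed over the weight matrices
# (biases are typically excluded from the penalty).
l2_penalty = sum((w ** 2).sum()
                 for name, w in model.named_parameters() if "weight" in name)

loss = primary_loss + lam * l2_penalty
loss.backward()  # each weight's gradient now includes the 2 * lambda * w term
```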

Weight decay can also be incorporated directly into the weight update rule, rather than implicitly through the objective function. In common usage, weight decay refers to the implementation specified directly in the weight update rule, whereas L2 regularization usually refers to the implementation specified in the objective function; both variants are sketched below.
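
The distinction can be sketched for a single SGD step as follows; `w`, `grad`, `lr`, and `lam` are assumed stand-ins rather than code from any referenced repository.

```python
import torch

# Stand-in weight tensor and gradient of the primary loss.
w = torch.randn(32, 10)
grad = torch.randn_like(w)   # dL_original/dw
lr, lam = 0.1, 1e-4          # assumed learning rate and penalty strength

# (a) L2 regularization via the objective: the penalty's gradient 2*lam*w
#     is added to the loss gradient before the usual SGD step.
w_l2 = w - lr * (grad + 2 * lam * w)

# (b) Weight decay specified in the update rule: shrink the weights by a
#     factor (1 - lr*lam), then apply the plain gradient step.
w_decay = (1 - lr * lam) * w - lr * grad
```

For plain SGD the two coincide up to a rescaling of $\lambda$, but for adaptive optimizers such as Adam they differ, which is why decoupled weight decay (as in AdamW) is often preferred in that setting.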

Most implemented papers

Understanding and Stabilizing GANs' Training Dynamics with Control Theory

taufikxu/GAN_PID 29 Sep 2019

There are existing efforts that model the training dynamics of GANs in the parameter space, but the analysis cannot directly motivate practically effective stabilizing methods.

Data and Model Dependencies of Membership Inference Attack

SJabin/Data_Model_Dependencies_MIA 17 Feb 2020

Our results reveal the relationship between MIA accuracy and properties of the dataset and training model in use.

Distributionally Robust Neural Networks

kohpangwei/group_DRO ICLR 2020

Distributionally robust optimization (DRO) allows us to learn models that instead minimize the worst-case training loss over a set of pre-defined groups.

Label-Only Membership Inference Attacks

cchoquette/membership-inference 28 Jul 2020

We empirically show that our label-only membership inference attacks perform on par with prior attacks that required access to model confidences.

Neural Pruning via Growing Regularization

mingsun-tse/regularization-pruning ICLR 2021

Regularization has long been utilized to learn sparsity in deep neural network pruning.

Towards Unsupervised Deep Image Enhancement with Generative Adversarial Network

eezkni/UEGAN 30 Dec 2020

In this paper, we present an unsupervised image enhancement generative adversarial network (UEGAN), which learns the corresponding image-to-image mapping from a set of images with desired characteristics in an unsupervised manner, rather than learning on a large number of paired images.

Learning with Hyperspherical Uniformity

wy1iu/Sphere-Uniformity 2 Mar 2021

Due to their over-parameterized nature, neural networks are a powerful tool for nonlinear function approximation.

The Limitations of Large Width in Neural Networks: A Deep Gaussian Process Perspective

gpleiss/limits_of_large_width NeurIPS 2021

Our analysis in this paper decouples capacity and width via the generalization of neural networks to Deep Gaussian Processes (Deep GP), a class of nonparametric hierarchical models that subsume neural nets.

Sequence Length is a Domain: Length-based Overfitting in Transformer Models

LiamMaclean216/Pytorch-Transfomer EMNLP 2021

We demonstrate on a simple string editing task and a machine translation task that the Transformer model performance drops significantly when facing sequences of length diverging from the length distribution in the training data.

Disturbing Target Values for Neural Network Regularization

kimy-de/DisturbMethods 11 Oct 2021

This active regularization makes use of the model behavior during training to regularize it in a more directed manner.