L2 Regularization
29 papers with code • 0 benchmarks • 0 datasets
See Weight Decay.
$L_{2}$ Regularization, or Weight Decay, is a regularization technique applied to the weights of a neural network. We minimize a loss function comprising both the primary loss function and a penalty on the $L_{2}$ norm of the weights:
$$L_{new}\left(w\right) = L_{original}\left(w\right) + \lambda{w^{T}w}$$
where $\lambda$ is a value determining the strength of the penalty (encouraging smaller weights).
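As a concrete illustration, below is a minimal NumPy sketch of the penalized objective above. The helper name `l2_regularized_loss`, the weight vector, and the precomputed primary-loss value are illustrative assumptions, not part of any particular library.

```python
import numpy as np

def l2_regularized_loss(original_loss, w, lam):
    """Add an L2 penalty (lambda * w^T w) to a primary loss value.

    original_loss: value of L_original(w), assumed already evaluated.
    lam: regularization strength (the lambda in the formula above).
    """
    return original_loss + lam * np.dot(w, w)

# Example: a small weight vector and a dummy primary-loss value.
w = np.array([0.5, -1.2, 0.3])
primary = 0.42  # hypothetical value of L_original(w)
print(l2_regularized_loss(primary, w, lam=1e-3))
```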
Weight decay can also be incorporated directly into the weight update rule, rather than implicitly through the objective function. Often, "weight decay" refers to the implementation in which the decay is specified directly in the weight update rule, whereas "L2 regularization" usually refers to the implementation specified in the objective function.
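The distinction can be made concrete with a pair of plain-SGD update sketches; the function names and arguments below are hypothetical. For vanilla SGD the two updates coincide up to a rescaling of $\lambda$, but for adaptive optimizers (e.g. Adam with an L2 penalty versus AdamW's decoupled decay) they generally do not.

```python
import numpy as np

def sgd_step_l2(w, grad_original, lr, lam):
    # L2 regularization: the penalty's gradient (2 * lam * w) is added to
    # the gradient of the primary loss, and the combined gradient is used.
    return w - lr * (grad_original + 2 * lam * w)

def sgd_step_weight_decay(w, grad_original, lr, decay):
    # Weight decay: the weights are shrunk directly in the update rule,
    # separately from the gradient of the primary loss.
    return (1 - lr * decay) * w - lr * grad_original
```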
Benchmarks
These leaderboards are used to track progress in L2 Regularization
Most implemented papers
Re-evaluating Continual Learning Scenarios: A Categorization and Case for Strong Baselines
Continual learning has received a great deal of attention recently with several approaches being proposed.
Convolutional Neural Networks for Facial Expression Recognition
We have developed convolutional neural networks (CNN) for a facial expression recognition task.
Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks
This study investigates how weight decay affects the update behavior of individual neurons in deep neural networks through a combination of applied analysis and experimentation.
The Transient Nature of Emergent In-Context Learning in Transformers
The transient nature of ICL is observed in transformers across a range of model sizes and datasets, raising the question of how much to "overtrain" transformers when seeking compact, cheaper-to-run models.
On Regularization Parameter Estimation under Covariate Shift
This paper identifies a problem with the usual procedure for L2-regularization parameter estimation in a domain adaptation setting.
Neurogenesis-Inspired Dictionary Learning: Online Model Adaption in a Changing World
In this paper, we focus on online representation learning in non-stationary environments which may require continuous adaptation of model architecture.
Collaboratively Weighting Deep and Classic Representation via L2 Regularization for Image Classification
We propose a deep collaborative weight-based classification (DeepCWC) method to resolve this problem, by providing a novel option to fully take advantage of deep features in classic machine learning.
Quantifying Generalization in Reinforcement Learning
In this paper, we investigate the problem of overfitting in deep reinforcement learning.
What is the Effect of Importance Weighting in Deep Learning?
Importance-weighted risk minimization is a key ingredient in many machine learning algorithms for causal inference, domain adaptation, class imbalance, and off-policy reinforcement learning.
Learning a smooth kernel regularizer for convolutional neural networks
We propose a smooth kernel regularizer that encourages spatial correlations in convolution kernel weights.