L2 Regularization

28 papers with code • 0 benchmarks • 0 datasets

See Weight Decay.

$L_{2}$ Regularization, or Weight Decay, is a regularization technique applied to the weights of a neural network. We minimize a loss function comprising both the primary loss function and a penalty on the $L_{2}$ norm of the weights:

$$L_{new}\left(w\right) = L_{original}\left(w\right) + \lambda{w^{T}w}$$

where $\lambda$ is a value determining the strength of the penalty (encouraging smaller weights).
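As a minimal sketch of this objective (the function name and the use of NumPy here are illustrative assumptions, not tied to any particular framework), the penalty is simply added to the primary loss:

```python
import numpy as np

def l2_penalized_loss(original_loss: float, w: np.ndarray, lam: float) -> float:
    """Return L_original(w) + lambda * w^T w for a flat weight vector w."""
    return original_loss + lam * np.dot(w, w)

# Example usage with made-up values:
w = np.array([0.5, -1.0, 2.0])
loss = l2_penalized_loss(original_loss=0.8, w=w, lam=0.01)  # 0.8 + 0.01 * 5.25 = 0.8525
```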

Weight decay can be incorporated directly into the weight update rule, rather than implicitly through the objective function. The term weight decay often refers to the implementation specified directly in the weight update rule, whereas L2 regularization usually refers to the implementation specified in the objective function.
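As a rough sketch of the distinction (plain SGD, NumPy arrays, and the variable names below are illustrative assumptions), the two update rules can be written as:

```python
import numpy as np

lr, lam, wd = 0.1, 0.01, 0.01       # learning rate, L2 strength, decay rate (illustrative values)
w = np.array([1.0, -2.0])           # current weights
grad_loss = np.array([0.5, 0.3])    # gradient of the primary loss at w (placeholder values)

# L2 regularization: the penalty lambda * w^T w is part of the objective,
# so its gradient 2 * lambda * w is added to the loss gradient before the step.
w_l2 = w - lr * (grad_loss + 2 * lam * w)

# Weight decay: the weights are shrunk directly in the update rule,
# independently of how the loss gradient is computed (decoupled form).
w_decay = w * (1 - lr * wd) - lr * grad_loss
```

For plain SGD the two forms coincide up to a rescaling of the coefficient, but they diverge under adaptive optimizers such as Adam, which is why decoupled weight decay is usually distinguished from L2 regularization.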

Latest papers with no code

On the Convergence Rate of the Stochastic Gradient Descent (SGD) and application to a modified policy gradient for the Multi Armed Bandit

no code yet • 9 Feb 2024

We present a self-contained proof of the convergence rate of Stochastic Gradient Descent (SGD) when the learning rate follows an inverse time decay schedule; we next apply the results to the convergence of a modified form of policy gradient Multi-Armed Bandit (MAB) with $L2$ regularization.

An Experiment on Feature Selection using Logistic Regression

no code yet • 31 Jan 2024

We ranked features first with L1 and then with L2, and then compared logistic regression with L1 (LR+L1) against that with L2 (LR+L2) by varying the sizes of the feature sets for each of the two rankings.

Reverse Engineering Deep ReLU Networks: An Optimization-based Algorithm

no code yet • 7 Dec 2023

In this research, we present a novel method for reconstructing deep ReLU networks by leveraging convex optimization techniques and a sampling-based approach.

On sparse regression, Lp-regularization, and automated model discovery

no code yet • 9 Oct 2023

With these insights, we demonstrate that Lp regularized constitutive neural networks can simultaneously discover both interpretable models and physically meaningful parameters.

Maintaining Plasticity in Continual Learning via Regenerative Regularization

no code yet • 23 Aug 2023

In this paper, we propose L2 Init, a simple approach for maintaining plasticity by incorporating in the loss function L2 regularization toward initial parameters.

Electromyography Signal Classification Using Deep Learning

no code yet • 6 May 2023

Having implemented this model, an accuracy of 99 percent is achieved on the test data set.

Maximum margin learning of t-SPNs for cell classification with filtered input

no code yet • 16 Mar 2023

The t-SPN is constructed such that the unnormalized probability is represented as conditional probabilities of a subset of most similar cell classes.

Emphasizing Unseen Words: New Vocabulary Acquisition for End-to-End Speech Recognition

no code yet • 20 Feb 2023

Furthermore, our proposed combined loss rescaling and weight consolidation methods can support continual learning of an ASR system.

Globally Gated Deep Linear Networks

no code yet • 31 Oct 2022

The rich and diverse behavior of the GGDLNs suggests that they are useful, analytically tractable models of learning single and multiple tasks in finite-width nonlinear deep networks.

Linking Neural Collapse and L2 Normalization with Improved Out-of-Distribution Detection in Deep Neural Networks

no code yet • 17 Sep 2022

We propose a simple modification to standard ResNet architectures--L2 normalization over feature space--that substantially improves out-of-distribution (OoD) performance on the previously proposed Deep Deterministic Uncertainty (DDU) benchmark.