Entropy Regularization

Introduced by Mnih et al. in Asynchronous Methods for Deep Reinforcement Learning

Entropy Regularization is a type of regularization used in reinforcement learning. In on-policy policy-gradient methods such as A3C, the actor and critic mutually reinforce each other's estimates, which can leave $\pi\left(a\mid{s}\right)$ highly peaked on a few actions or action sequences, since it is easier for the actor and critic to over-optimise to a small portion of the environment. To reduce this problem, entropy regularization adds the entropy of the policy to the objective (equivalently, subtracts it from the loss) to promote action diversity:

$$H\left(\pi\left(\cdot\mid{s}\right)\right) = -\sum_{a}\pi\left(a\mid{s}\right)\log\pi\left(a\mid{s}\right) $$
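
As a rough illustration of the entropy term above, the following minimal sketch computes the entropy of a discrete policy and subtracts it, scaled by a coefficient, from a per-step policy-gradient loss. The helper names (`softmax`, `entropy`, `policy_loss_with_entropy`), the `beta` default, and the single-sample loss form are assumptions made for this example, not code from the paper.

```python
import math

def softmax(logits):
    # Numerically stable softmax: turns raw action scores into probabilities.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    # H = -sum_a p(a) * log p(a), using the convention 0 * log 0 = 0.
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def policy_loss_with_entropy(logits, action, advantage, beta=0.01):
    # Hypothetical per-step loss: policy-gradient term minus an entropy bonus.
    # beta is an illustrative coefficient; larger values favour more diverse actions.
    probs = softmax(logits)
    pg_loss = -math.log(probs[action]) * advantage
    return pg_loss - beta * entropy(probs)

uniform = softmax([0.0, 0.0, 0.0, 0.0])  # every action equally likely
peaked = softmax([10.0, 0.0, 0.0, 0.0])  # almost all mass on one action

print(entropy(uniform))  # equals log(4), the maximum for four actions
print(entropy(peaked))   # much smaller: the policy is nearly deterministic
```

Because the entropy is subtracted from the loss, minimising the loss pushes the policy away from the peaked case and toward the uniform one unless the advantage signal justifies concentrating probability on particular actions.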

Source: Asynchronous Methods for Deep Reinforcement Learning

Tasks

Task	Papers	Share
Reinforcement Learning (RL)	143	17.31%
Autonomous Driving	114	13.80%
Autonomous Vehicles	41	4.96%
Imitation Learning	32	3.87%
Decision Making	27	3.27%
Object Detection	24	2.91%
Language Modelling	18	2.18%
Semantic Segmentation	18	2.18%
Continuous Control	15	1.82%

Categories

Regularization