DDPG, or Deep Deterministic Policy Gradient, is a model-free, actor-critic algorithm based on the deterministic policy gradient that operates over continuous action spaces. It combines the actor-critic approach with two key insights from DQN: 1) the network is trained off-policy with samples from a replay buffer to minimize correlations between samples, and 2) a separate target Q network provides consistent targets during temporal-difference backups. DDPG adopts these ideas along with batch normalization.
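The two DQN-derived ingredients above can be sketched in isolation. This is a minimal illustration, not the paper's implementation: a uniform replay buffer that breaks correlations between consecutive transitions, and a Polyak soft update for the target network parameters (the `tau=0.005` default and the parameter-list representation are assumptions for illustration).

```python
import random
from collections import deque

import numpy as np


class ReplayBuffer:
    """Uniform replay buffer: sampling random past transitions breaks the
    temporal correlation between consecutive environment steps."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        batch = random.sample(self.buffer, batch_size)
        states, actions, rewards, next_states, dones = map(np.array, zip(*batch))
        return states, actions, rewards, next_states, dones

    def __len__(self):
        return len(self.buffer)


def soft_update(target_params, online_params, tau=0.005):
    """Polyak-averaged target update: theta_target <- tau * theta + (1 - tau) * theta_target.
    A slowly tracking target network keeps the TD targets consistent during training."""
    return [tau * w + (1.0 - tau) * wt for wt, w in zip(target_params, online_params)]
```

At each training step, DDPG samples a minibatch from the buffer to compute TD targets with the target networks, then applies `soft_update` to both the target actor and target critic.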
Source: Continuous control with deep reinforcement learning
Task | Papers | Share |
---|---|---|
Reinforcement Learning (RL) | 125 | 44.17% |
Continuous Control | 33 | 11.66% |
OpenAI Gym | 14 | 4.95% |
Decision Making | 12 | 4.24% |
Management | 12 | 4.24% |
Energy Management | 6 | 2.12% |
Autonomous Driving | 6 | 2.12% |
Multi-agent Reinforcement Learning | 5 | 1.77% |
Meta-Learning | 4 | 1.41% |