Policy Gradient Methods

Deep Deterministic Policy Gradient

Introduced by Lillicrap et al. in Continuous control with deep reinforcement learning

DDPG, or Deep Deterministic Policy Gradient, is an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces. It combines the actor-critic approach with insights from DQNs: in particular, the insights that 1) the network is trained off-policy with samples from a replay buffer to minimize correlations between samples, and 2) the network is trained with a target Q network to give consistent targets during temporal difference backups. DDPG makes use of the same ideas along with batch normalization.

Source: Continuous control with deep reinforcement learning

Papers


Paper Code Results Date Stars

Tasks


Task Papers Share
Reinforcement Learning (RL) 125 44.01%
Continuous Control 33 11.62%
OpenAI Gym 15 5.28%
Decision Making 12 4.23%
Management 12 4.23%
energy management 6 2.11%
Autonomous Driving 6 2.11%
Multi-agent Reinforcement Learning 5 1.76%
Meta-Learning 4 1.41%

Categories