DDPG, or Deep Deterministic Policy Gradient, is a model-free actor-critic algorithm based on the deterministic policy gradient that operates over continuous action spaces. It combines the actor-critic approach with two insights from DQN: 1) the network is trained off-policy with samples from a replay buffer to minimize correlations between samples, and 2) the network is trained with a target Q network to give consistent targets during temporal-difference backups. DDPG adopts both ideas and additionally uses batch normalization.
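The two DQN-derived ingredients named above can be sketched in isolation. This is a minimal illustration (class and function names are hypothetical, not from the paper's code): a uniform-sampling replay buffer, and the Polyak-averaged "soft" target-network update that DDPG uses in place of DQN's periodic hard copy.

```python
import random
from collections import deque

class ReplayBuffer:
    """Stores transitions; uniform random sampling breaks the temporal
    correlations present in sequentially collected experience."""
    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # old transitions are evicted

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        return random.sample(self.buffer, batch_size)

def soft_update(target_params, online_params, tau=0.005):
    """Polyak averaging: target <- tau * online + (1 - tau) * target.
    The slowly moving target network gives consistent TD targets."""
    return [tau * o + (1.0 - tau) * t
            for t, o in zip(target_params, online_params)]

# Usage sketch with scalar stand-ins for states and parameters.
buf = ReplayBuffer(capacity=1000)
for i in range(10):
    buf.add(i, 0.0, 1.0, i + 1, False)
batch = buf.sample(4)              # minibatch of 4 decorrelated transitions
target = soft_update([0.0, 0.0], [1.0, 1.0], tau=0.1)
print(target)  # [0.1, 0.1]
```

In practice the parameters are network weight tensors rather than scalars, and the soft update is applied after every gradient step to both the target actor and the target critic.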
Source: Continuous control with deep reinforcement learning
Task | Papers | Share |
---|---|---|
Reinforcement Learning (RL) | 128 | 25.40% |
Deep Reinforcement Learning | 99 | 19.64% |
Reinforcement Learning | 99 | 19.64% |
Continuous Control | 36 | 7.14% |
OpenAI Gym | 15 | 2.98% |
Decision Making | 13 | 2.58% |
Management | 12 | 2.38% |
Autonomous Driving | 7 | 1.39% |
Multi-agent Reinforcement Learning | 6 | 1.19% |