D4PG, or Distributed Distributional DDPG, is a policy gradient algorithm that extends DDPG. The improvements include a distributional critic update, combined with the use of multiple distributed workers all writing into the same replay table. Among the other, simpler changes, the biggest performance gain came from the use of $N$-step returns. The authors found that prioritized experience replay was less crucial to the overall D4PG algorithm, especially on harder problems.
Source: Distributed Distributional Deterministic Policy Gradients
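To make the distributional update concrete, below is a minimal sketch of the categorical projection step a D4PG-style critic performs on an $N$-step Bellman target before training with a cross-entropy loss. The atom count, support range, discount, and $N$ used here are illustrative assumptions, not the paper's tuned values, and the helper name `project_target` is hypothetical.

```python
import numpy as np

# Assumed hyperparameters (illustrative, not the paper's values):
# 51 atoms on [-150, 150], discount 0.99, 5-step returns.
N_ATOMS, V_MIN, V_MAX = 51, -150.0, 150.0
GAMMA, N_STEP = 0.99, 5
support = np.linspace(V_MIN, V_MAX, N_ATOMS)   # fixed atom locations z_i
delta_z = (V_MAX - V_MIN) / (N_ATOMS - 1)

def project_target(rewards, done, target_probs):
    """Project the N-step Bellman target distribution back onto the support.

    rewards      : the N intermediate rewards r_t ... r_{t+N-1}
    done         : whether the episode terminated within the N steps
    target_probs : target critic's predicted atom probabilities at
                   (s_{t+N}, a' = pi(s_{t+N}))
    Returns the projected probabilities m_i used as the cross-entropy target.
    """
    # N-step discounted reward sum and the discount applied to the bootstrap term.
    n_step_return = sum(GAMMA ** k * r for k, r in enumerate(rewards))
    bootstrap_discount = 0.0 if done else GAMMA ** len(rewards)

    # Shift/scale every atom by the N-step return, then clip to the support range.
    tz = np.clip(n_step_return + bootstrap_discount * support, V_MIN, V_MAX)

    # Split each target atom's probability between its two nearest support atoms.
    b = (tz - V_MIN) / delta_z
    lower, upper = np.floor(b).astype(int), np.ceil(b).astype(int)
    m = np.zeros(N_ATOMS)
    np.add.at(m, lower, target_probs * (upper - b))
    np.add.at(m, upper, target_probs * (b - lower))
    # If b lands exactly on an atom, lower == upper and both weights above are
    # zero, so give that atom the full probability mass instead.
    np.add.at(m, lower, target_probs * (lower == upper))
    return m

# Example: a dummy target distribution and a 5-step transition.
probs = np.full(N_ATOMS, 1.0 / N_ATOMS)
m = project_target(rewards=[1.0, 0.5, 0.0, 0.2, 1.0], done=False, target_probs=probs)
# The critic is then trained with cross-entropy: loss = -(m * log p_theta).sum()
```

The cross-entropy against the projected target `m` plays the role of the TD error in DDPG; the actor is still updated by following the gradient of the critic's expected value under the deterministic policy.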
Task | Papers | Share |
---|---|---|
Reinforcement Learning (RL) | 7 | 25.93% |
Reinforcement Learning | 4 | 14.81% |
Continuous Control | 4 | 14.81% |
Deep Reinforcement Learning | 3 | 11.11% |
OpenAI Gym | 3 | 11.11% |
Distributional Reinforcement Learning | 2 | 7.41% |
quantile regression | 1 | 3.70% |
Benchmarking | 1 | 3.70% |
BIG-bench Machine Learning | 1 | 3.70% |
Component | Type |
---|---|
 | Stochastic Optimization |
 | Normalization |
 | Value Function Estimation |
 | Replay Memory |