Policy Gradient Methods

Reinforcement Learning • 23 methods

Policy gradient methods optimize the policy function directly in reinforcement learning. This contrasts with value-based methods such as Q-learning, where the policy arises implicitly from maximizing a learned value function. Below is a continuously updated catalogue of policy gradient methods.
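To make "optimizing the policy directly" concrete, here is a minimal sketch of a REINFORCE-style update on a toy multi-armed bandit. It is an illustration under stated assumptions, not any specific method from the catalogue: the function names (`softmax`, `train_bandit_policy`), the Gaussian reward model, the running-average baseline, and all hyperparameters are hypothetical choices made for this example. The policy is a softmax over one learnable preference per action, and each update moves the parameters along the reward-weighted gradient of the log-probability of the sampled action. No value function is maximized to pick actions.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences (logits) into a probability distribution."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_bandit_policy(mean_rewards, steps=5000, lr=0.05, seed=0):
    """REINFORCE-style updates on a toy bandit (assumed setup, illustrative only).

    mean_rewards: hypothetical mean reward of each arm; rewards are sampled
    as Gaussian noise (std 1.0) around these means.
    Returns the final action probabilities of the softmax policy.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(mean_rewards)  # policy parameters: one logit per action
    baseline = 0.0                     # running average reward, reduces variance
    for t in range(1, steps + 1):
        probs = softmax(theta)
        action = rng.choices(range(len(theta)), weights=probs)[0]
        reward = rng.gauss(mean_rewards[action], 1.0)
        advantage = reward - baseline
        # For a softmax policy, d log pi(action) / d theta[k]
        # equals 1[k == action] - probs[k].
        for k in range(len(theta)):
            grad_log_pi = (1.0 if k == action else 0.0) - probs[k]
            theta[k] += lr * advantage * grad_log_pi
        # Update the running average of observed rewards.
        baseline += (reward - baseline) / t
    return softmax(theta)


if __name__ == "__main__":
    # The third arm has the highest assumed mean reward, so the learned
    # policy should place most of its probability on it.
    print(train_bandit_policy([0.0, 0.5, 2.0]))
```

A Q-learning agent in the same setting would instead estimate a value per arm and act greedily (or epsilon-greedily) with respect to those estimates. Here the action probabilities themselves are the learned object.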

Method Year Papers
2017 157
2015 114
1999 112
2018 46
2015 45
2016 45
2016 38
2018 26
2017 19
2014 11
2018 8
2018 5
2016 4
2018 4
2020 2
2017 1
2017 1
2017 1
2018 1
2020 1
2021 1
2021 1