Policy Gradient Methods

Reinforcement Learning • 24 methods

Policy gradient methods optimize a parameterized policy directly, typically by gradient ascent on the expected return. This contrasts with value-based methods such as Q-learning, where the policy is implicit: the agent acts by choosing the action that maximizes a learned value function. Below you can find a continuously updated catalog of policy gradient methods.
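To make "optimizing the policy directly" concrete, here is a minimal sketch of a REINFORCE-style update on a toy two-armed bandit. Everything in it is an illustrative assumption rather than something taken from this catalog: the function names (`softmax`, `reinforce_bandit`), the deterministic rewards, the learning rate, and the step count. The policy is a softmax over one preference per arm, and each step nudges the preferences along the gradient of log pi(action), scaled by the reward received.

```python
import math
import random


def softmax(prefs):
    """Turn a list of preferences into action probabilities."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(rewards, steps=2000, lr=0.1, seed=0):
    """REINFORCE-style updates on a toy bandit (hypothetical setup).

    rewards[a] is the fixed reward for pulling arm a. Real problems
    have stochastic rewards and multi-step returns; this toy case
    keeps only the core update.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(rewards)  # policy parameters: one preference per arm
    for _ in range(steps):
        probs = softmax(theta)
        # Sample an action from the current policy.
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        ret = rewards[action]
        # For a softmax policy, d/d theta_k of log pi(action)
        # equals 1[k == action] - pi_k.
        for k in range(len(theta)):
            grad_log_pi = (1.0 if k == action else 0.0) - probs[k]
            theta[k] += lr * ret * grad_log_pi  # gradient ascent step
    return softmax(theta)


final_probs = reinforce_bandit(rewards=[0.0, 1.0])
print(final_probs)
```

No value function is learned anywhere in this sketch: the parameters `theta` define the policy itself, and probability mass shifts toward the rewarding arm purely through the sampled gradient steps. Many practical methods additionally subtract a baseline from the return to reduce variance or constrain the step size; the sketch omits both.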

Method  Year  Papers
        2017  598
        2015  186
        1999  157
        2018  85
        2016  70
        2015  69
        2016  48
        2018  43
        2017  33
        2018  15
        2014  13
        2018  10
        2016  10
        2018  6
        2017  2
        2017  2
        2020  2
        2017  1
        2018  1
        2020  1
        2021  1
        2021  1
        2000  1