Policy Gradient Methods

Reinforcement Learning • 24 methods

Policy gradient methods optimize a parameterized policy directly, typically by gradient ascent on the expected return. This contrasts with value-based methods such as Q-learning, where the policy is implicit: the agent acts by choosing the action that maximizes a learned value function. Below you can find a continuously updating catalog of policy gradient methods.

Method Year Papers
2017 766
2015 201
1999 170
2018 104
2016 77
2015 74
2016 53
2018 52
2017 34
2014 17
2018 15
2018 11
2016 11
2018 6
2020 3
2017 2
2017 2
2017 1
2018 1
2020 1
2021 1
2021 1
2000 1
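
As a rough illustration of the contrast drawn in the introduction, the sketch below trains a softmax policy on a toy multi-armed bandit with a REINFORCE-style update, and sets it beside a greedy policy read off a table of action values. It is a minimal sketch and does not correspond to any specific method in the catalog. The function names (`reinforce_bandit`, `greedy_policy`), the bandit setup, and the hyperparameters are all assumptions chosen for illustration.

```python
import math
import random


def softmax(prefs):
    """Turn a list of action preferences into action probabilities."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(arm_means, steps=5000, lr=0.1, noise=0.1, seed=0):
    """Policy gradient view: adjust the policy parameters theta directly.

    Hypothetical toy setup: each arm pays a Gaussian reward around its mean.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(arm_means)  # policy parameters, one per arm
    baseline = 0.0                  # running mean reward, reduces variance
    for t in range(1, steps + 1):
        probs = softmax(theta)
        action = rng.choices(range(len(theta)), weights=probs)[0]
        reward = rng.gauss(arm_means[action], noise)
        baseline += (reward - baseline) / t
        advantage = reward - baseline
        for a in range(len(theta)):
            # Gradient of log pi(action) w.r.t. theta[a] for a softmax policy.
            grad_log = (1.0 if a == action else 0.0) - probs[a]
            theta[a] += lr * advantage * grad_log
    return softmax(theta)


def greedy_policy(q_values):
    """Value-based view: the policy is whatever action maximizes Q."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


probs = reinforce_bandit([0.0, 1.0, 0.2])
best_arm = max(range(len(probs)), key=lambda a: probs[a])
print("learned action probabilities:", [round(p, 3) for p in probs])
print("most probable arm:", best_arm)
print("greedy action from Q-values:", greedy_policy([0.1, 0.9, 0.3]))
```

In the first function the learned object is the policy itself (the probabilities produced from `theta`). In the second, the policy exists only as an argmax over value estimates, which would have to be learned separately.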