Policy Gradient Methods

Reinforcement Learning • 24 methods

Policy gradient methods optimize the policy directly in reinforcement learning: the policy is parameterized, and its parameters are adjusted along the gradient of the expected return. This contrasts with value-based methods such as Q-learning, where the policy is implicit and is obtained by acting greedily with respect to a learned value function. Below is a continuously updated catalog of policy gradient methods.
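To make the contrast concrete, here is a minimal sketch on a toy multi-armed bandit. It is not taken from any method in the catalog. The function names (`softmax`, `greedy_policy`, `reinforce_bandit`), the reward means, the Gaussian noise, and the running-average baseline are all assumptions chosen for illustration. The value-based side reads the policy off a table of action values, while the policy-gradient side updates softmax policy parameters directly with a REINFORCE-style step.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_policy(q_values):
    """Value-based control: the policy is implicit, an argmax over Q."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def reinforce_bandit(mean_rewards, steps=2000, lr=0.1, seed=0):
    """Policy-gradient control: adjust the policy parameters theta directly.

    Hypothetical setup: each arm pays its mean reward plus Gaussian noise.
    """
    rng = random.Random(seed)
    n_actions = len(mean_rewards)
    theta = [0.0] * n_actions  # one preference per action
    baseline = 0.0             # running average reward, reduces variance
    for t in range(1, steps + 1):
        probs = softmax(theta)
        action = rng.choices(range(n_actions), weights=probs)[0]
        reward = mean_rewards[action] + rng.gauss(0.0, 0.1)
        advantage = reward - baseline
        # For a softmax policy, d log pi(action) / d theta[a]
        # equals 1[a == action] - probs[a].
        for a in range(n_actions):
            grad_log = (1.0 if a == action else 0.0) - probs[a]
            theta[a] += lr * advantage * grad_log
        baseline += (reward - baseline) / t
    return softmax(theta)


if __name__ == "__main__":
    means = [0.2, 0.5, 1.0]  # assumed reward means, for illustration only
    print("greedy action from known Q-values:", greedy_policy(means))
    print("learned policy probabilities:", reinforce_bandit(means))
```

In this sketch the greedy policy needs value estimates before it can act, whereas the policy-gradient learner never stores action values; it only shifts probability mass toward actions that returned more than the baseline.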

Method Year Papers
2017 629
2015 190
1999 160
2018 90
2015 71
2016 70
2016 48
2018 45
2017 33
2014 16
2018 15
2016 11
2018 10
2018 6
2017 2
2017 2
2020 2
2017 1
2018 1
2020 1
2021 1
2021 1
2000 1