Policy Gradient Methods

Reinforcement Learning • 24 methods

Policy gradient methods optimize the policy function directly in reinforcement learning: they adjust the policy's parameters along an estimate of the gradient of expected return. This contrasts with value-based methods such as Q-Learning, where the policy is implicit and is obtained by acting to maximize a learned value function. Below you can find a continuously updated catalog of policy gradient methods.
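To make "optimizing the policy directly" concrete, here is a minimal sketch of a REINFORCE-style update on a toy two-armed bandit. It is not taken from any method in the catalog. The function names (`softmax`, `train_bandit`), the deterministic payoffs, the learning rate, and the step count are all assumptions chosen for illustration. The policy is a softmax over two preference parameters, and each step moves those parameters along reward times the gradient of the log-probability of the sampled action.

```python
import math
import random


def softmax(prefs):
    """Turn preference parameters into action probabilities."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_bandit(steps=2000, lr=0.1, seed=0):
    """REINFORCE-style training on a hypothetical two-armed bandit."""
    rng = random.Random(seed)
    rewards = [0.0, 1.0]  # assumed deterministic payoffs: arm 1 is better
    theta = [0.0, 0.0]    # policy parameters, one preference per arm
    for _ in range(steps):
        probs = softmax(theta)
        action = rng.choices([0, 1], weights=probs)[0]  # sample from the policy
        reward = rewards[action]
        for k in range(2):
            # Gradient of log pi(action) w.r.t. theta[k] for a softmax policy.
            grad_log_pi = (1.0 if k == action else 0.0) - probs[k]
            theta[k] += lr * reward * grad_log_pi  # ascend expected reward
    return softmax(theta)


if __name__ == "__main__":
    print(train_bandit())
```

No value function is learned or maximized here: the parameters of the policy itself are the only thing updated, which is the distinction from Q-Learning drawn above. Practical methods typically add a baseline or critic to reduce the variance of this gradient estimate.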

Method Year Papers
2017 214
2015 145
1999 133
2018 54
2016 53
2015 51
2016 42
2018 34
2017 25
2014 13
2018 10
2016 8
2018 6
2018 6
2017 2
2017 2
2017 1
2018 1
2020 1
2020 1
2021 1
2021 1
2000 1