Policy Gradient Methods

Deterministic Policy Gradient

Deterministic Policy Gradient, or DPG, is a policy gradient method for reinforcement learning. Instead of the policy function $\pi\left(.\mid{s}\right)$ being modeled as a probability distribution, DPG considers and calculates gradients for a deterministic policy $a = \mu_{theta}\left(s\right)$.


Paper Code Results Date Stars


Component Type
🤖 No Components Found You can add them if they exist; e.g. Mask R-CNN uses RoIAlign