
# Stochastic Dueling Network

Introduced by Wang et al. in Sample Efficient Actor-Critic with Experience Replay

A Stochastic Dueling Network (SDN) is an architecture for estimating the state-value function $V$ in continuous action spaces. It learns both $V$ and the action-value function $Q$ off-policy while keeping the two estimates consistent with each other. At each time step it outputs a stochastic estimate of $Q$ and a deterministic estimate of $V$.
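To make the relationship between the two outputs concrete, here is a minimal sketch of how the stochastic $Q$ estimate can be formed from $V$ and an advantage function $A$, following the form $\tilde{Q}(x, a) = V(x) + A(x, a) - \frac{1}{n}\sum_{i=1}^{n} A(x, u_i)$ with $u_i$ sampled from the policy. The functions `toy_value`, `toy_advantage`, and `toy_policy_sample` are hypothetical stand-ins for learned networks, not part of any published implementation, and the choice `n_samples=5` is arbitrary.

```python
import numpy as np

def sdn_q_estimate(value, advantage, sample_action, state, action, n_samples, rng):
    """Return (q_tilde, v) where q_tilde = V(x) + A(x, a) - mean_i A(x, u_i)."""
    v = value(state)  # deterministic estimate of V
    # Draw n actions from the current policy to estimate the mean advantage.
    sampled = [sample_action(state, rng) for _ in range(n_samples)]
    baseline = np.mean([advantage(state, u) for u in sampled])
    # The sampled baseline is what makes the Q estimate stochastic.
    q_tilde = v + advantage(state, action) - baseline
    return q_tilde, v

# Hypothetical stand-ins for the learned V head, A head, and policy.
def toy_value(state):
    return float(np.sum(state))

def toy_advantage(state, action):
    return float(-np.sum((action - state) ** 2))

def toy_policy_sample(state, rng):
    # Assumed Gaussian policy centred on the state, fixed standard deviation.
    return state + 0.1 * rng.standard_normal(state.shape)

rng = np.random.default_rng(0)
x = np.array([0.5, -0.2])
a = np.array([0.4, -0.1])
q_tilde, v = sdn_q_estimate(
    toy_value, toy_advantage, toy_policy_sample, x, a, n_samples=5, rng=rng
)
```

Subtracting the sampled mean advantage ties the two heads together: averaged over actions drawn from the policy, the $Q$ estimate equals $V$ in expectation, which is the sense in which the two estimates stay consistent.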
