$n$-step returns are used for value-function estimation in reinforcement learning. Specifically, the $n$-step return sums $n$ discounted rewards and then bootstraps from the current value estimate:
$$ R_{t}^{(n)} = r_{t+1} + \gamma r_{t+2} + \cdots + \gamma^{n-1} r_{t+n} + \gamma^{n} V_{t}\left(s_{t+n}\right) $$
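A minimal sketch of computing this return from a recorded trajectory; the function name and the indexing convention (`rewards[k]` storing $r_{k+1}$, `values[k]` storing $V_t(s_k)$) are illustrative assumptions, not fixed by the text above:

```python
def n_step_return(rewards, values, t, n, gamma):
    """Compute R_t^(n): n discounted rewards plus a bootstrapped value.

    Assumed conventions: rewards[k] holds r_{k+1}, values[k] holds V_t(s_k),
    and the trajectory is long enough that index t + n is valid.
    """
    G = sum(gamma ** k * rewards[t + k] for k in range(n))
    return G + gamma ** n * values[t + n]  # bootstrap from V_t(s_{t+n})
```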
We can then write an $n$-step backup, in the style of TD learning, as:
$$ \Delta V_{t}\left(s_{t}\right) = \alpha\left[R_{t}^{(n)} - V_{t}\left(s_{t}\right)\right] $$
Multi-step returns often lead to faster learning with suitably tuned $n$.
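As an illustration, a tabular $n$-step TD update applying this backup might look like the sketch below; the dictionary-based value table and the indexing convention are assumptions of the sketch, and episode-end truncation is omitted:

```python
def n_step_td_update(V, states, rewards, t, n, gamma, alpha):
    """Apply the backup Delta V(s_t) = alpha * [R_t^(n) - V(s_t)].

    Assumed conventions: V maps states to value estimates, states[k] is s_k,
    and rewards[k] is r_{k+1}. Episode-end truncation is omitted for brevity.
    """
    # n-step return: n discounted rewards plus the bootstrapped tail value
    G = sum(gamma ** k * rewards[t + k] for k in range(n))
    G += gamma ** n * V[states[t + n]]
    # TD-style backup toward the n-step return
    V[states[t]] += alpha * (G - V[states[t]])
```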
Image Credit: Sutton and Barto, Reinforcement Learning: An Introduction
Task | Papers | Share
---|---|---
Reinforcement Learning (RL) | 20 | 25.32%
Deep Reinforcement Learning | 13 | 16.46%
Reinforcement Learning | 12 | 15.19%
Atari Games | 6 | 7.59%
Continuous Control | 4 | 5.06%
OpenAI Gym | 3 | 3.80%
Decision Making | 3 | 3.80%
Distributional Reinforcement Learning | 2 | 2.53%
Benchmarking | 2 | 2.53%