Eligibility Trace

An Eligibility Trace is a memory vector $\textbf{z}_{t} \in \mathbb{R}^{d}$ that parallels the long-term weight vector $\textbf{w}_{t} \in \mathbb{R}^{d}$. The idea is that when a component of $\textbf{w}_{t}$ participates in producing an estimated value, the corresponding component of $\textbf{z}_{t}$ is bumped up and then begins to fade away. Learning will then occur in that component of $\textbf{w}_{t}$ if a nonzero TD error occurs before the trace falls back to zero. The trace-decay parameter $\lambda \in \left[0, 1\right]$ determines the rate at which the trace falls.
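The mechanism can be sketched in a few lines. This is a minimal illustration, assuming a linear value estimate $\hat{v}(s) = \textbf{w}^{\top}\textbf{x}(s)$ and the accumulating-trace form of semi-gradient TD($\lambda$) described by Sutton and Barto. The function name `td_lambda_step`, the step size `alpha`, and the one-hot feature vectors are assumptions made for the example, not part of the original text.

```python
import numpy as np

def td_lambda_step(w, z, x, x_next, reward, gamma, lam, alpha):
    """One semi-gradient TD(lambda) update with linear features (illustrative sketch)."""
    # Fade the old trace, then bump the components that produced the current estimate.
    z = gamma * lam * z + x
    # TD error for the observed transition.
    delta = reward + gamma * (w @ x_next) - (w @ x)
    # Components of w change in proportion to their current trace.
    w = w + alpha * delta * z
    return w, z

# Hypothetical three-feature problem with one-hot features.
w = np.zeros(3)
z = np.zeros(3)
x0 = np.array([1.0, 0.0, 0.0])
x1 = np.array([0.0, 1.0, 0.0])
x2 = np.array([0.0, 0.0, 1.0])

# First transition: zero TD error, so w is unchanged but component 0 becomes eligible.
w, z = td_lambda_step(w, z, x0, x1, reward=0.0, gamma=0.9, lam=0.5, alpha=0.1)
# Second transition: a nonzero TD error arrives while component 0 is still eligible.
w, z = td_lambda_step(w, z, x1, x2, reward=1.0, gamma=0.9, lam=0.5, alpha=0.1)
print(w, z)
```

In this run the weight for the first feature is updated on the second step even though that feature is no longer active, because its trace has only partly decayed.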

Intuitively, eligibility traces tackle the credit assignment problem by capturing both a frequency heuristic (states that are visited more often deserve more credit) and a recency heuristic (states that are visited more recently deserve more credit).

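In the tabular case, with discount factor $\gamma$, the accumulating trace $E_{t}\left(s\right)$ for a state $s$ is defined as:

$$E_{0}\left(s\right) = 0$$

$$E_{t}\left(s\right) = \gamma\lambda E_{t-1}\left(s\right) + \textbf{1}\left(S_{t} = s\right)$$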
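The tabular recursion can be traced by hand or with a short script. The following is a sketch under assumed values: the helper name `accumulating_traces`, the state sequence, and the choices $\gamma = 1$ and $\lambda = 0.5$ are hypothetical and chosen only to keep the arithmetic easy to check.

```python
import numpy as np

def accumulating_traces(states, n_states, gamma, lam):
    """Return E_t(s) for every step t of a visited-state sequence (illustrative sketch)."""
    E = np.zeros(n_states)        # E_0(s) = 0 for all states
    history = []
    for s in states:
        E *= gamma * lam          # every trace decays by gamma * lambda
        E[s] += 1.0               # the visited state's trace is incremented by 1
        history.append(E.copy())
    return np.array(history)

# Hypothetical episode: state 0 is visited twice, states 1 and 2 once each.
traces = accumulating_traces([0, 1, 0, 2], n_states=3, gamma=1.0, lam=0.5)
print(traces)
```

After the final step the traces are 0.625 for state 0, 0.25 for state 1, and 1.0 for state 2. State 2 has the largest trace because it was visited most recently, and state 0 outranks state 1 because it was visited both more often and more recently.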

Source: Sutton and Barto, Reinforcement Learning, 2nd Edition

Tasks

Task	Papers	Share
Reinforcement Learning (RL)	5	55.56%
Meta-Learning	3	33.33%
Atari Games	1	11.11%

Categories

Eligibility Traces