Stochastic Optimization

Forward gradient

Introduced by Baydin et al. in Gradients without Backpropagation

Forward gradients are unbiased estimators of the gradient $\nabla f(\theta)$ for a function $f: \mathbb{R}^n \rightarrow \mathbb{R}$, given by $g(\theta) = \langle \nabla f(\theta) , v \rangle v$.

Here $v = (v_1, \ldots, v_n)$ is a random vector whose components must satisfy the following conditions in order for $g(\theta)$ to be an unbiased estimator of $\nabla f(\theta)$:

  • $v_i \perp v_j$ (independence) for all $i \neq j$
  • $\mathbb{E}[v_i] = 0$ for all $i$
  • $\mathbb{V}[v_i] = 1$ for all $i$
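
A quick Monte Carlo sanity check (an illustration, not from the paper): drawing $v$ from a standard normal satisfies all three conditions, so averaging many samples of $g(\theta)$ should recover $\nabla f(\theta)$. The test function `f` and the point `theta` below are arbitrary placeholders.

```python
import jax
import jax.numpy as jnp

def f(theta):
    # arbitrary scalar-valued test function
    return jnp.sum(jnp.sin(theta) * theta ** 2)

theta = jnp.array([0.5, -1.2, 2.0])
grad = jax.grad(f)(theta)                    # exact gradient, for comparison

def g(key):
    v = jax.random.normal(key, theta.shape)  # independent, zero-mean, unit-variance components
    return jnp.vdot(grad, v) * v             # single forward-gradient sample

keys = jax.random.split(jax.random.PRNGKey(0), 100_000)
estimate = jnp.mean(jax.vmap(g)(keys), axis=0)
print(grad, estimate)  # should agree up to Monte Carlo error
```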

A forward gradient can be computed with a single JVP (Jacobian-vector product), so only the forward mode of automatic differentiation is needed. Unlike the usual reverse mode (backpropagation), forward mode does not require storing intermediate values from the forward pass for a later backward pass, which reduces memory cost.

Source: Gradients without Backpropagation
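
A minimal sketch (not the authors' code) of the single-JVP computation described above, using JAX's forward mode via `jax.jvp`; `f` and `theta` are placeholders.

```python
import jax
import jax.numpy as jnp

def f(theta):
    # any scalar-valued objective
    return jnp.sum(jnp.sin(theta) * theta ** 2)

def forward_gradient(f, theta, key):
    v = jax.random.normal(key, theta.shape)   # random perturbation direction
    # jax.jvp returns f(theta) and the directional derivative <grad f(theta), v>
    # in one forward pass, with no reverse-mode tape.
    f_val, dir_deriv = jax.jvp(f, (theta,), (v,))
    return f_val, dir_deriv * v               # g(theta) = <grad f(theta), v> v

theta = jnp.array([0.5, -1.2, 2.0])
value, g = forward_gradient(f, theta, jax.random.PRNGKey(0))
# g is a single-sample unbiased estimate of grad f(theta); as in the paper's
# forward gradient descent, it can replace the gradient in an SGD-style update,
# e.g. theta = theta - learning_rate * g.
```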

Tasks


Task Papers Share
Benchmarking 1 20.00%
Imitation Learning 1 20.00%
Offline RL 1 20.00%
Reinforcement Learning (RL) 1 20.00%
Memorization 1 20.00%

