Stochastic Optimization

AdaSqrt is a stochastic optimization technique motivated by the observation that methods such as AdaGrad and Adam can be viewed as relaxations of Natural Gradient Descent.

The updates are performed as follows:

$$ t \leftarrow t + 1 $$

$$ \alpha_{t} \leftarrow \sqrt{t} $$

$$ g_{t} \leftarrow \nabla_{\theta}f\left(\theta_{t-1}\right) $$

$$ S_{t} \leftarrow S_{t-1} + g_{t}^{2} $$

$$ \theta_{t} \leftarrow \theta_{t-1} - \eta\frac{\alpha_{t}g_{t}}{S_{t} + \epsilon} $$

Source: Second-order Information in First-order Optimization Methods
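The update equations translate directly into a few lines of code. Below is a minimal NumPy sketch, assuming a user-supplied gradient function; the function name, the toy quadratic objective, and the values of $\eta$ and $\epsilon$ are illustrative choices, not taken from the paper. Note that, unlike AdaGrad, the denominator uses the raw accumulator $S_{t}$ rather than its square root, with the $\sqrt{t}$ factor in the numerator compensating for the faster-growing denominator.

```python
import numpy as np

def adasqrt(grad_fn, theta0, eta=0.1, eps=1e-8, steps=5000):
    """Minimize f with the AdaSqrt update, given grad_fn(theta) = grad f(theta).

    eta, eps, and steps are illustrative defaults, not values from the paper.
    """
    theta = np.asarray(theta0, dtype=float)
    S = np.zeros_like(theta)  # S_0 = 0: running sum of squared gradients
    for t in range(1, steps + 1):
        alpha = np.sqrt(t)                    # alpha_t = sqrt(t)
        g = grad_fn(theta)                    # g_t = grad f(theta_{t-1})
        S += g ** 2                           # S_t = S_{t-1} + g_t^2
        theta -= eta * alpha * g / (S + eps)  # theta_t = theta_{t-1} - eta * alpha_t * g_t / (S_t + eps)
    return theta

# Toy usage (illustrative): minimize f(x) = ||x||^2, whose gradient is 2x.
x_min = adasqrt(lambda x: 2.0 * x, theta0=[3.0, -2.0])
print(x_min)  # approaches the minimizer at the origin
```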
