QHAdam Explained | Papers With Code

The **Quasi-Hyperbolic Momentum Algorithm (QHM)** is a simple alteration of [momentum SGD](https://paperswithcode.com/method/sgd-with-momentum), averaging a plain [SGD](https://paperswithcode.com/method/sgd) step with a momentum step. **QHAdam** is a QH-augmented version of [Adam](https://paperswithcode.com/method/adam), in which both of Adam's moment estimators are replaced with quasi-hyperbolic terms. When updating the weights, QHAdam decouples the momentum term from the current gradient, and the mean squared gradient term from the current squared gradient.

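In essence, the numerator of the update is a weighted average of the momentum estimate $\hat{m}\_{t}$ and the plain SGD gradient $g\_{t}$: the momentum is weighted by an immediate discount factor $\nu\_{1}$ and the current gradient by $1-\nu\_{1}$. The denominator is the square root of a weighted average of the mean squared gradient estimate $\hat{v}\_{t}$ and the current squared gradient $g^{2}\_{t}$: the estimate is weighted by an immediate discount factor $\nu\_{2}$ and the current squared gradient by $1-\nu\_{2}$.

$$ \theta\_{t+1} = \theta\_{t} - \eta\left[\frac{\left(1-\nu\_{1}\right)\cdot{g\_{t}} + \nu\_{1}\cdot\hat{m}\_{t}}{\sqrt{\left(1-\nu\_{2}\right)g^{2}\_{t} + \nu\_{2}\cdot{\hat{v}\_{t}}} + \epsilon}\right], \forall{t} $$

Here $\eta$ is the learning rate, $\epsilon$ is a small constant for numerical stability, and all operations are applied elementwise to the parameters $\theta$. As in Adam, $\hat{m}\_{t}$ and $\hat{v}\_{t}$ are moving averages of the gradient and the squared gradient, with decay rates $\beta\_{1}$ and $\beta\_{2}$.

It is recommended to set $\nu\_{2} = 1$ and to keep $\beta\_{2}$ the same as in Adam.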
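The update rule above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function name `qhadam_step`, the argument names `betas` and `nus`, and the default values are hypothetical, and the sketch assumes that the two moment estimates are Adam-style bias-corrected exponential moving averages.

```python
import math


def qhadam_step(theta, grad, m, v, t, lr=1e-3,
                betas=(0.9, 0.999), nus=(0.7, 1.0), eps=1e-8):
    """One QHAdam update on lists of floats (hypothetical helper).

    theta, grad: current parameters and their gradients.
    m, v: running first and second moment averages (start at zeros).
    t: 1-based step count, used for bias correction.
    nus: the two immediate discount factors; the defaults are illustrative.
    Returns the updated (theta, m, v) as new lists.
    """
    beta1, beta2 = betas
    nu1, nu2 = nus
    new_theta, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(theta, grad, m, v):
        # Assumed Adam-style moving averages of g and g^2.
        m_i = beta1 * m_i + (1.0 - beta1) * g
        v_i = beta2 * v_i + (1.0 - beta2) * g * g
        # Bias-corrected estimates (m-hat and v-hat in the formula).
        m_hat = m_i / (1.0 - beta1 ** t)
        v_hat = v_i / (1.0 - beta2 ** t)
        # Quasi-hyperbolic averages of the current gradient and the estimates.
        numerator = (1.0 - nu1) * g + nu1 * m_hat
        denominator = math.sqrt((1.0 - nu2) * g * g + nu2 * v_hat) + eps
        new_theta.append(p - lr * numerator / denominator)
        new_m.append(m_i)
        new_v.append(v_i)
    return new_theta, new_m, new_v


# Usage: minimise f(x) = x^2, whose gradient is 2x, starting from x = 1.
theta, m, v = [1.0], [0.0], [0.0]
for step in range(1, 201):
    grad = [2.0 * x for x in theta]
    theta, m, v = qhadam_step(theta, grad, m, v, step, lr=0.01)
```

Under these assumptions, passing `nus=(1.0, 1.0)` makes both quasi-hyperbolic averages collapse onto the moment estimates, so the step reduces to a plain Adam step.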

STOCHASTIC OPTIMIZATION
