# Demon CM

Introduced by Chen et al. in Demon: Improved Neural Network Training with Momentum Decay

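**Demon CM**, or **SGD with Momentum and Demon**, applies the Demon (decaying momentum) rule to SGD with momentum. The momentum coefficient $\beta_{t}$ starts at $\beta_{init}$ at $t = 0$ and decays to zero at the final step $t = T$: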

$$ \beta_{t} = \beta_{init}\cdot\frac{\left(1-\frac{t}{T}\right)}{\left(1-\beta_{init}\right) + \beta_{init}\left(1-\frac{t}{T}\right)} $$

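$$ \theta_{t+1} = \theta_{t} - \eta g_{t} + \beta_{t}v_{t} $$

$$ v_{t+1} = \beta_{t}v_{t} - \eta g_{t} $$

Here $\theta_{t}$ denotes the parameters, $g_{t}$ the stochastic gradient at $\theta_{t}$, $v_{t}$ the momentum buffer, $\eta$ the learning rate, $\beta_{init}$ the initial momentum coefficient, $t$ the current step, and $T$ the total number of steps.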
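The update rule can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' implementation: the function names `demon_beta` and `demon_cm_step`, the zero initial momentum buffer, the default hyperparameter values, and the toy objective $f(\theta) = \frac{1}{2}\theta^{2}$ are all assumptions made for the example.

```python
def demon_beta(t, T, beta_init):
    """Demon schedule: equals beta_init at t = 0 and decays to 0 at t = T."""
    frac = 1.0 - t / T
    return beta_init * frac / ((1.0 - beta_init) + beta_init * frac)


def demon_cm_step(theta, v, grad, t, T, lr=0.1, beta_init=0.9):
    """One Demon CM step (hypothetical helper, defaults are illustrative).

    Both updates use the current buffer v_t, matching the equations:
        theta_{t+1} = theta_t - lr * g_t + beta_t * v_t
        v_{t+1}     = beta_t * v_t - lr * g_t
    """
    beta_t = demon_beta(t, T, beta_init)
    theta_next = theta - lr * grad + beta_t * v
    v_next = beta_t * v - lr * grad
    return theta_next, v_next


# Toy usage (assumed objective): minimize f(theta) = 0.5 * theta**2,
# whose gradient is simply theta.
T = 100
theta, v = 5.0, 0.0  # v_0 = 0 is an assumption of this sketch
for t in range(T):
    grad = theta
    theta, v = demon_cm_step(theta, v, grad, t, T)

print(theta)  # ends close to the minimizer at 0
```

The functions use only arithmetic operators, so they should also work unchanged when `theta`, `v`, and `grad` are NumPy arrays of the same shape.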

Source: Demon: Improved Neural Network Training with Momentum Decay