Cosine Annealing

Introduced by Loshchilov et al. in SGDR: Stochastic Gradient Descent with Warm Restarts

Cosine Annealing is a type of learning rate schedule that starts with a large learning rate, decreases it relatively rapidly to a minimum value, and then rapidly increases it again. Resetting the learning rate acts like a simulated restart of the learning process. Re-using good weights as the starting point of the restart is referred to as a "warm restart", in contrast to a "cold restart", where a new set of small random numbers may be used as the starting point.

$$\eta_{t} = \eta_{min}^{i} + \frac{1}{2}\left(\eta_{max}^{i}-\eta_{min}^{i}\right)\left(1+\cos\left(\frac{T_{cur}}{T_{i}}\pi\right)\right) $$

Here, $\eta_{min}^{i}$ and $\eta_{max}^{i}$ are the lower and upper bounds of the learning rate during the $i$-th run, $T_{cur}$ counts how many epochs have been performed since the last restart, and $T_{i}$ is the number of epochs between two restarts.
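The formula can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `cosine_annealing_lr` and `sgdr_schedule`, the parameter `t_mult` (a factor by which each cycle is lengthened after a restart), and the default learning rate bounds are all assumptions made for the example. It also advances $T_{cur}$ once per epoch for simplicity, whereas $T_{cur}$ may equally be advanced in fractional steps within an epoch.

```python
import math


def cosine_annealing_lr(t_cur, t_i, eta_min, eta_max):
    """Learning rate after t_cur epochs of a cycle that lasts t_i epochs."""
    cosine = math.cos(math.pi * t_cur / t_i)
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + cosine)


def sgdr_schedule(num_epochs, t_0, t_mult=1, eta_min=0.0, eta_max=0.1):
    """Per-epoch learning rates with warm restarts.

    t_0 is the length of the first cycle (hypothetical name); t_mult
    multiplies the cycle length after every restart (assumed parameter).
    """
    rates = []
    t_i, t_cur = t_0, 0
    for _ in range(num_epochs):
        rates.append(cosine_annealing_lr(t_cur, t_i, eta_min, eta_max))
        t_cur += 1
        if t_cur >= t_i:
            # End of the cycle: reset the epoch counter (warm restart)
            # and optionally lengthen the next cycle.
            t_cur = 0
            t_i *= t_mult
    return rates


schedule = sgdr_schedule(num_epochs=6, t_0=3)
```

With these assumed settings, `schedule` holds two identical three-epoch cycles: each starts at `eta_max` (0.1), then falls to roughly 0.075 and 0.025 before the restart. Because the counter here only takes integer values below `t_i`, the rate reaches `eta_min` exactly only when `t_cur` equals `t_i`, which the loop replaces with a restart.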

Text Source: Jason Brownlee

Image Source: Gao Huang

Source: SGDR: Stochastic Gradient Descent with Warm Restarts

Latest Papers

PAPER | AUTHORS | DATE
PP-YOLO: An Effective and Efficient Implementation of Object Detector | Xiang Long, Kaipeng Deng, Guanzhong Wang, Yang Zhang, Qingqing Dang, Yuan Gao, Hui Shen, Jianguo Ren, Shumin Han, Errui Ding, Shilei Wen | 2020-07-23
Generative Pretraining from Pixels | Mark Chen, Alec Radford, Rewon Child, Jeff Wu, Heewoo Jun, Prafulla Dhariwal, David Luan, Ilya Sutskever | 2020-07-17
NVAE: A Deep Hierarchical Variational Autoencoder | Arash Vahdat, Jan Kautz | 2020-07-08
Inductive Unsupervised Domain Adaptation for Few-Shot Classification via Clustering | Xin Cong, Bowen Yu, Tingwen Liu, Shiyao Cui, Hengzhu Tang, Bin Wang | 2020-06-23
Ultra Fast Structure-aware Deep Lane Detection | Zequn Qin, Huanyu Wang, Xi Li | 2020-04-24
Supervised Contrastive Learning | Prannay Khosla, Piotr Teterwak, Chen Wang, Aaron Sarna, Yonglong Tian, Phillip Isola, Aaron Maschinot, Ce Liu, Dilip Krishnan | 2020-04-23
YOLOv4: Optimal Speed and Accuracy of Object Detection | Alexey Bochkovskiy, Chien-Yao Wang, Hong-Yuan Mark Liao | 2020-04-23
Designing Network Design Spaces | Ilija Radosavovic, Raj Prateek Kosaraju, Ross Girshick, Kaiming He, Piotr Dollár | 2020-03-30
Hit-Detector: Hierarchical Trinity Architecture Search for Object Detection | Jianyuan Guo, Kai Han, Yunhe Wang, Chao Zhang, Zhaohui Yang, Han Wu, Xinghao Chen, Chang Xu | 2020-03-26
GreedyNAS: Towards Fast One-Shot NAS with Greedy Supernet | Shan You, Tao Huang, Mingmin Yang, Fei Wang, Chen Qian, Changshui Zhang | 2020-03-25
Improved Baselines with Momentum Contrastive Learning | Xinlei Chen, Haoqi Fan, Ross Girshick, Kaiming He | 2020-03-09
What's Hidden in a Randomly Weighted Neural Network? | Vivek Ramanujan, Mitchell Wortsman, Aniruddha Kembhavi, Ali Farhadi, Mohammad Rastegari | 2019-11-29
Learning Spatial Fusion for Single-Shot Object Detection | Songtao Liu, Di Huang, Yunhong Wang | 2019-11-21
Towards calibrated and scalable uncertainty representations for neural networks | Nabeel Seedat, Christopher Kanan | 2019-10-28
RandAugment: Practical automated data augmentation with a reduced search space | Ekin D. Cubuk, Barret Zoph, Jonathon Shlens, Quoc V. Le | 2019-09-30
Mish: A Self Regularized Non-Monotonic Neural Activation Function | Diganta Misra | 2019-08-23
SCARLET-NAS: Bridging the gap between Stability and Scalability in Weight-sharing Neural Architecture Search | Xiangxiang Chu, Bo Zhang, Jixiang Li, Qingyuan Li, Ruijun Xu | 2019-08-16
MoGA: Searching Beyond MobileNetV3 | Xiangxiang Chu, Bo Zhang, Ruijun Xu | 2019-08-04
Densely Connected Search Space for More Flexible Neural Architecture Search | Jiemin Fang, Yuzhu Sun, Qian Zhang, Yuan Li, Wenyu Liu, Xinggang Wang | 2019-06-23
Attention Augmented Convolutional Networks | Irwan Bello, Barret Zoph, Ashish Vaswani, Jonathon Shlens, Quoc V. Le | 2019-04-22
Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution | Yunpeng Chen, Haoqi Fan, Bing Xu, Zhicheng Yan, Yannis Kalantidis, Marcus Rohrbach, Shuicheng Yan, Jiashi Feng | 2019-04-10
Exploring Randomly Wired Neural Networks for Image Recognition | Saining Xie, Alexander Kirillov, Ross Girshick, Kaiming He | 2019-04-02
FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search | Bichen Wu, Xiaoliang Dai, Peizhao Zhang, Yanghan Wang, Fei Sun, Yiming Wu, Yuandong Tian, Peter Vajda, Yangqing Jia, Kurt Keutzer | 2018-12-09
Bag of Tricks for Image Classification with Convolutional Neural Networks | Tong He, Zhi Zhang, Hang Zhang, Zhongyue Zhang, Junyuan Xie, Mu Li | 2018-12-04
Pelee: A Real-Time Object Detection System on Mobile Devices | Jun Wang, Tanner Bohn, Charles Ling | 2018-12-01
A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation | Akhilesh Gotmare, Nitish Shirish Keskar, Caiming Xiong, Richard Socher | 2018-10-29
Big-Little Net: An Efficient Multi-Scale Feature Representation for Visual and Speech Recognition | Chun-Fu Chen, Quanfu Fan, Neil Mallinar, Tom Sercu, Rogerio Feris | 2018-07-10
Using Mode Connectivity for Loss Landscape Analysis | Akhilesh Gotmare, Nitish Shirish Keskar, Caiming Xiong, Richard Socher | 2018-06-18
Averaging Weights Leads to Wider Optima and Better Generalization | Pavel Izmailov, Dmitrii Podoprikhin, Timur Garipov, Dmitry Vetrov, Andrew Gordon Wilson | 2018-03-14
Regularized Evolution for Image Classifier Architecture Search | Esteban Real, Alok Aggarwal, Yanping Huang, Quoc V Le | 2018-02-05
SGDR: Stochastic Gradient Descent with Warm Restarts | Ilya Loshchilov, Frank Hutter | 2016-08-13
