Mixing ADAM and SGD: a Combined Optimization Method

Optimization methods (optimizers) receive special attention in deep learning because they determine how efficiently neural networks are trained. Many papers in the literature compare neural models trained with different optimizers…
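Conceptually, the paper's MAS method mixes the ADAM and SGD optimizers into a single update. The sketch below illustrates one plausible reading of that idea: a weighted sum of the plain SGD direction and the Adam direction. The function name `mas_step` and the mixing weights `lam_sgd`/`lam_adam` are illustrative assumptions, not the paper's exact formulation.

```python
import math

def mas_step(w, grad, state, lr=0.05, lam_sgd=0.5, lam_adam=0.5,
             beta1=0.9, beta2=0.999, eps=1e-8):
    # Illustrative sketch only: combine the SGD direction (the raw
    # gradient) and the Adam direction with hypothetical weights.
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad         # 1st moment
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad  # 2nd moment
    m_hat = state["m"] / (1 - beta1 ** state["t"])  # bias correction
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    adam_dir = m_hat / (math.sqrt(v_hat) + eps)
    sgd_dir = grad
    return w - lr * (lam_sgd * sgd_dir + lam_adam * adam_dir)

# Toy usage: minimize f(w) = w**2, whose gradient is 2*w.
state = {"t": 0, "m": 0.0, "v": 0.0}
w = 1.0
for _ in range(200):
    w = mas_step(w, 2.0 * w, state)
```

With both contributions active, the iterate is driven toward the minimum by the SGD term while the Adam term supplies an adaptively rescaled step; the relative weights control how much each optimizer's behavior dominates.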


Results from the Paper


| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Stochastic Optimization | AG News | Bert | Accuracy (mean) | 93.86 | #1 |
| Stochastic Optimization | AG News | Bert | Accuracy (max) | 93.99 | #1 |
| Stochastic Optimization | CIFAR-10 | Resnet18 | Accuracy (mean) | 85.89 | #1 |
| Stochastic Optimization | CIFAR-10 | Resnet18 | Accuracy (max) | 86.85 | #1 |
| Stochastic Optimization | CIFAR-10 | Resnet34 | Accuracy (mean) | 85.75 | #2 |
| Stochastic Optimization | CIFAR-10 | Resnet34 | Accuracy (max) | 86.14 | #2 |
| Stochastic Optimization | CIFAR-100 | Resnet18 | Accuracy (mean) | 58.01 | #1 |
| Stochastic Optimization | CIFAR-100 | Resnet18 | Accuracy (max) | 58.48 | #1 |
| Stochastic Optimization | CIFAR-100 | Resnet34 | Accuracy (mean) | 53.06 | #2 |
| Stochastic Optimization | CIFAR-100 | Resnet34 | Accuracy (max) | 54.5 | #2 |
| Stochastic Optimization | CoLA | Bert | Accuracy (mean) | 87.66 | #1 |
| Stochastic Optimization | CoLA | Bert | Accuracy (max) | 86.34 | #1 |

Methods used in the Paper


| Method | Type |
|---|---|
| MAS | Stochastic Optimization |
| SGD | Stochastic Optimization |
| Adam | Stochastic Optimization |