Trainable Activations for Image Classification

Preprints 2023  ·  Evgenii Pishchik

Non-linear activation functions are one of the main components of deep neural network architectures. The choice of activation function can affect a model's speed, performance, and convergence. Most popular activation functions have no trainable parameters and do not change during training. We propose several activation functions, some with trainable parameters and some without, each with its own advantages and disadvantages. We test the performance of these activation functions and compare the results with the widely used ReLU activation function. We hypothesize that activation functions with trainable parameters can outperform those without, because the trainable parameters allow the model to "select" the type of each activation function itself. However, this depends strongly on the architecture of the deep neural network and on the activation function itself.
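
To illustrate what it can mean for a model to "select" an activation type through a trainable parameter, here is a minimal sketch in plain Python. It is not one of the functions proposed in the paper, which the abstract does not specify. The `BlendedActivation` class, its `alpha` parameter, and the `fit_alpha` helper are hypothetical names introduced only for this example. The activation mixes ReLU and tanh through a sigmoid gate, and gradient descent on `alpha` moves the gate toward whichever shape fits the targets better.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def relu(x: float) -> float:
    return max(0.0, x)


class BlendedActivation:
    """Hypothetical trainable activation (illustration only, not from the paper):
    f(x) = g * relu(x) + (1 - g) * tanh(x), with g = sigmoid(alpha).
    alpha is the single trainable parameter."""

    def __init__(self, alpha: float = 0.0) -> None:
        self.alpha = alpha

    def gate(self) -> float:
        return sigmoid(self.alpha)

    def forward(self, x: float) -> float:
        g = self.gate()
        return g * relu(x) + (1.0 - g) * math.tanh(x)

    def grad_alpha(self, x: float) -> float:
        # df/dalpha = g * (1 - g) * (relu(x) - tanh(x)), by the chain rule
        g = self.gate()
        return g * (1.0 - g) * (relu(x) - math.tanh(x))


def fit_alpha(act, xs, targets, lr=0.5, steps=200):
    """Plain gradient descent on mean squared error, updating alpha only."""
    n = len(xs)
    for _ in range(steps):
        grad = 0.0
        for x, t in zip(xs, targets):
            err = act.forward(x) - t
            grad += 2.0 * err * act.grad_alpha(x) / n
        act.alpha -= lr * grad
    return act


# Targets generated by ReLU: training should push the gate toward the ReLU side.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
targets = [relu(x) for x in xs]
act = fit_alpha(BlendedActivation(alpha=0.0), xs, targets)
print(f"gate after training: {act.gate():.3f}")
```

The gate starts at 0.5 (an even mix) and rises toward 1 as training proceeds, so the activation drifts toward ReLU because the targets were produced by ReLU. In a real network, each layer would hold its own such parameter and be updated by the framework's optimizer together with the weights.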

Datasets

CIFAR-10, MNIST

Results from the Paper


| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Image Classification | CIFAR-10 | ResNet-8 (Trainable Activations) | Percentage correct | 86.5 | #201 |
| | | | PARAMS | 0.075M | #1 |
| | | | Top-1 Accuracy | 86.5 | #34 |
| Image Classification | CIFAR-10 | ResNet-56 (Trainable Activations) | Percentage correct | 88.8 | #195 |
| | | | PARAMS | 0.853M | #179 |
| | | | Top-1 Accuracy | 88.8 | #31 |
| Image Classification | CIFAR-10 | ResNet-26 (Trainable Activations) | Percentage correct | 91.1 | #178 |
| | | | PARAMS | 0.366M | #168 |
| | | | Top-1 Accuracy | 91.1 | #26 |
| Image Classification | CIFAR-10 | ResNet-20 (Trainable Activations) | Percentage correct | 90.4 | #187 |
| | | | PARAMS | 0.269M | #166 |
| | | | Top-1 Accuracy | 90.4 | #29 |
| Image Classification | CIFAR-10 | ResNet-14 (Trainable Activations) | Percentage correct | 89.0 | #193 |
| | | | PARAMS | 0.172M | #164 |
| | | | Top-1 Accuracy | 89.0 | #30 |
| Image Classification | CIFAR-10 | ResNet-44 (Trainable Activations) | Percentage correct | 90.5 | #185 |
| | | | PARAMS | 0.658M | #176 |
| | | | Top-1 Accuracy | 90.5 | #28 |
| Image Classification | CIFAR-10 | ResNet-32 (Trainable Activations) | Percentage correct | 90.9 | #179 |
| | | | PARAMS | 0.464M | #170 |
| | | | Top-1 Accuracy | 90.9 | #27 |
| Image Classification | MNIST | DNN-3 (Trainable Activations) | Percentage error | 3.0 | #78 |
| | | | Accuracy | 97.0 | #29 |
| | | | Trainable Parameters | 80568 | #2 |
| Image Classification | MNIST | DNN-2 (Trainable Activations) | Percentage error | 3.6 | #79 |
| | | | Accuracy | 96.4 | #30 |
| | | | Trainable Parameters | 5500 | #1 |
| Image Classification | MNIST | DNN-5 (Trainable Activations) | Percentage error | 2.8 | #77 |
| | | | Accuracy | 97.2 | #28 |
| | | | Trainable Parameters | 175180 | #92 |

Methods