Learning Activation Functions to Improve Deep Neural Networks

21 Dec 2014 · Forest Agostinelli, Matthew Hoffman, Peter Sadowski, Pierre Baldi

Artificial neural networks typically have a fixed, non-linear activation function at each neuron. We have designed a novel form of piecewise linear activation function that is learned independently for each neuron using gradient descent. With this adaptive activation function, we are able to improve upon deep neural network architectures composed of static rectified linear units, achieving state-of-the-art performance on CIFAR-10 (7.51%), CIFAR-100 (30.83%), and a benchmark from high-energy physics involving Higgs boson decay modes.
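In the paper, the adaptive piecewise linear (APL) unit takes the form h(x) = max(0, x) + Σ_{s=1}^{S} a_s · max(0, −x + b_s), where the slopes a_s and hinge locations b_s are learned per neuron by gradient descent alongside the network weights. The sketch below illustrates this formulation in PyTorch; the class name, the default S, and the zero initialization are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn as nn

class APL(nn.Module):
    """Adaptive piecewise linear activation (sketch).

    h(x) = max(0, x) + sum_s a[s] * max(0, -x + b[s]),
    with a and b learned independently for each neuron.
    """

    def __init__(self, num_units: int, S: int = 2):
        super().__init__()
        # One slope/hinge pair per segment per neuron (S segments total).
        # Zero init makes the unit start out as a plain ReLU; this is an
        # assumption, not necessarily the paper's initialization scheme.
        self.a = nn.Parameter(torch.zeros(S, num_units))
        self.b = nn.Parameter(torch.zeros(S, num_units))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_units); each hinge term broadcasts per neuron.
        out = torch.relu(x)
        for s in range(self.a.shape[0]):
            out = out + self.a[s] * torch.relu(-x + self.b[s])
        return out

# Example: swap a static ReLU for a learned APL unit.
layer = nn.Sequential(nn.Linear(128, 64), APL(64, S=2))
y = layer(torch.randn(8, 128))  # -> shape (8, 64)
```

Because the extra hinge terms are zero at initialization here, the unit behaves exactly like a ReLU until training moves a_s and b_s, which matches the paper's framing of improving on static rectified linear units.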

Results from the Paper


| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|------|---------|-------|-------------|--------------|-------------|
| Image Classification | CIFAR-10 | NiN+APL | Percentage correct | 92.5 | #162 |
| Image Classification | CIFAR-100 | NiN+APL | Percentage correct | 69.2 | #166 |
