Activation Functions

# Parameterized ReLU

Introduced by He et al. in Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification

A Parametric Rectified Linear Unit, or PReLU, is an activation function that generalizes the traditional rectified unit with a slope for negative values. Formally:

$$f\left(y_{i}\right) = y_{i} \text{ if } y_{i} \ge 0$$ $$f\left(y_{i}\right) = a_{i}y_{i} \text{ if } y_{i} \leq 0$$

The intuition is that different layers may require different types of nonlinearity. Indeed the authors find in experiments with convolutional neural networks that PReLus for the initial layer have more positive slopes, i.e. closer to linear. Since the filters of the first layers are Gabor-like filters such as edge or texture detectors, this shows a circumstance where positive and negative responses of filters are respected. In contrast the authors find deeper layers have smaller coefficients, suggesting the model becomes more discriminative at later layers (while it wants to retain more information at earlier layers).

#### Papers

Paper Code Results Date Stars