Activation Functions

SwiGLU

Introduced by Noam Shazeer in GLU Variants Improve Transformer (2020)

SwiGLU is a variant of the Gated Linear Unit (GLU) in which the gating branch uses the Swish activation instead of the sigmoid. It is defined as:

$$\text{SwiGLU}\left(x, W, V, b, c, \beta\right) = \text{Swish}_{\beta}\left(xW + b\right) \otimes \left(xV + c\right)$$

where $\otimes$ denotes element-wise multiplication and $\text{Swish}_{\beta}(x) = x\,\sigma(\beta x)$, with $\sigma$ the logistic sigmoid.
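The definition above can be sketched directly in NumPy. This is a minimal illustration, not an optimized implementation; the shapes and variable names are illustrative assumptions:

```python
import numpy as np

def swish(x, beta=1.0):
    # Swish_beta(x) = x * sigmoid(beta * x)
    return x / (1.0 + np.exp(-beta * x))

def swiglu(x, W, V, b, c, beta=1.0):
    # SwiGLU(x, W, V, b, c, beta) = Swish_beta(xW + b) ⊗ (xV + c)
    # The Swish-gated branch (xW + b) multiplies the linear branch (xV + c)
    # element-wise.
    return swish(x @ W + b, beta) * (x @ V + c)

# Example shapes: a batch of 3 inputs of dimension 4, projected to dimension 8.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
W = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
b = np.zeros(8)
c = np.zeros(8)
out = swiglu(x, W, V, b, c)  # shape (3, 8)
```

In practice (e.g. in Transformer feed-forward layers, as in the paper), the bias terms are often omitted and the two projections are fused into a single matrix multiply for efficiency.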
