Breaking the Activation Function Bottleneck through Adaptive Parameterization

Standard neural network architectures are non-linear only by virtue of a simple element-wise activation function, making them both brittle and excessively large. In this paper, we consider methods for making the feed-forward layer more flexible while preserving its basic structure...

NeurIPS 2018
