Output Functions

Mixture of Softmaxes

Introduced by Yang et al. in Breaking the Softmax Bottleneck: A High-Rank RNN Language Model

Mixture of Softmaxes (MoS) performs $K$ different softmaxes and mixes them with learned, context-dependent weights. The motivation is that the traditional softmax suffers from a softmax bottleneck: because its logits are a dot product between a context vector and word embeddings, the log-probability matrix it can represent has rank at most the embedding dimension $d$, which limits the conditional distributions the model can express. A mixture of softmaxes is no longer constrained to a low-rank log-probability matrix, so it can model the conditional probability more expressively.

Source: Breaking the Softmax Bottleneck: A High-Rank RNN Language Model
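
Concretely, for a context vector $h$ the model forms $K$ mixture weights $\pi_k$ and component states $h_k$, and mixes the component softmaxes: $P(y \mid x) = \sum_{k=1}^{K} \pi_k \frac{\exp(h_k^\top w_y)}{\sum_{y'} \exp(h_k^\top w_{y'})}$ with $\sum_k \pi_k = 1$. The following is a minimal PyTorch sketch of such a layer; the class and parameter names (MixtureOfSoftmaxes, d_model, n_components) are illustrative and not taken from the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixtureOfSoftmaxes(nn.Module):
    """Minimal Mixture of Softmaxes output layer (a sketch, not the
    authors' released implementation)."""

    def __init__(self, d_model: int, vocab_size: int, n_components: int = 15):
        super().__init__()
        self.n_components = n_components
        self.d_model = d_model
        # Prior head: one mixture-weight logit per component.
        self.prior = nn.Linear(d_model, n_components)
        # Projects the context into K component hidden states.
        self.latent = nn.Linear(d_model, n_components * d_model)
        # Output embedding shared across all K component softmaxes.
        self.decoder = nn.Linear(d_model, vocab_size)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, d_model) context vector, e.g. an RNN hidden state.
        pi = F.softmax(self.prior(h), dim=-1)                    # (batch, K)
        hk = torch.tanh(self.latent(h))                          # (batch, K*d)
        hk = hk.view(-1, self.n_components, self.d_model)        # (batch, K, d)
        # One full softmax over the vocabulary per component ...
        component_probs = F.softmax(self.decoder(hk), dim=-1)    # (batch, K, V)
        # ... then mix the K distributions with the prior weights.
        return torch.einsum('bk,bkv->bv', pi, component_probs)   # (batch, V)


# Usage: the layer outputs probabilities, so take the log for NLL loss.
mos = MixtureOfSoftmaxes(d_model=256, vocab_size=10000)
h = torch.randn(32, 256)
log_probs = torch.log(mos(h) + 1e-8)  # (32, 10000); each row sums to ~1 before the log
```

Because the mixture is not the softmax of any single logit vector, training takes the log of the mixed probabilities and uses nn.NLLLoss, rather than applying nn.CrossEntropyLoss to logits.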


Tasks


Task                  Papers   Share
Language Modeling     2        18.18%
Language Modelling    2        18.18%
Machine Translation   2        18.18%
Translation           2        18.18%
Tree Decomposition    1        9.09%
Image Captioning      1        9.09%
Text Generation       1        9.09%


Categories

Output Functions