Embeddings from Language Models, or ELMo, is a type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy). Word vectors are learned functions of the internal states of a deep bidirectional language model (biLM), which is pre-trained on a large text corpus.
A biLM combines both a forward and backward LM. ELMo jointly maximizes the log likelihood of the forward and backward directions. To add ELMo to a supervised model, we freeze the weights of the biLM and then concatenate the ELMo vector $\textbf{ELMO}^{task}_k$ with $\textbf{x}_k$ and pass the ELMO enhanced representation $[\textbf{x}_k; \textbf{ELMO}^{task}_k]$ into the task RNN. Here $\textbf{x}_k$ is a context-independent token representation for each token position.
Image Source: here
Source: Deep contextualized word representationsPaper | Code | Results | Date | Stars |
---|
Task | Papers | Share |
---|---|---|
Sentence | 46 | 9.68% |
Language Modelling | 38 | 8.00% |
Sentiment Analysis | 25 | 5.26% |
Named Entity Recognition (NER) | 21 | 4.42% |
NER | 19 | 4.00% |
Text Classification | 15 | 3.16% |
General Classification | 15 | 3.16% |
Question Answering | 14 | 2.95% |
Natural Language Inference | 12 | 2.53% |