
XLM

Introduced by Lample et al. in Cross-lingual Language Model Pretraining

XLM is a Transformer-based architecture that is pre-trained using one of three language modeling objectives (each sketched in code after the list):

  1. Causal Language Modeling (CLM) - models the probability of a word given the previous words in a sentence.
  2. Masked Language Modeling (MLM) - the masked language modeling objective of BERT.
  3. Translation Language Modeling (TLM) - a new translation language modeling objective for improving cross-lingual pre-training.
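A toy sketch of how a training example for each objective might be built is below. It uses plain Python over whitespace-split tokens; the function names, special tokens, and 15% masking rate are illustrative only (the actual model operates on BPE tokens and adds language and position embeddings).

```python
import random

MASK, SEP, IGNORE = "[MASK]", "</s>", None

def clm_example(tokens):
    # Causal LM: predict each token from the tokens to its left.
    return tokens[:-1], tokens[1:]

def mlm_example(tokens, mask_prob=0.15, rng=random):
    # Masked LM (BERT-style): mask a fraction of tokens and predict only
    # the masked positions; unmasked positions get no target.
    inputs, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            targets.append(tok)
        else:
            inputs.append(tok)
            targets.append(IGNORE)
    return inputs, targets

def tlm_example(src_tokens, tgt_tokens, mask_prob=0.15, rng=random):
    # Translation LM: concatenate a parallel sentence pair and mask tokens
    # on both sides, so the model can attend to the other language to
    # recover a masked word.
    return mlm_example(src_tokens + [SEP] + tgt_tokens, mask_prob, rng)

if __name__ == "__main__":
    en = "the cat sat on the mat".split()
    fr = "le chat est assis sur le tapis".split()
    print(clm_example(en))
    print(mlm_example(en))
    print(tlm_example(en, fr))
```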

The authors find that both the CLM and MLM objectives provide strong cross-lingual features that can be used for pre-training models.
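Pre-trained XLM checkpoints are distributed through the Hugging Face transformers library. The snippet below is a minimal usage sketch, assuming the English MLM checkpoint name "xlm-mlm-en-2048" and that the sacremoses dependency used by XLMTokenizer is installed.

```python
import torch
from transformers import XLMTokenizer, XLMWithLMHeadModel

# Assumed checkpoint: an English XLM model pre-trained with the MLM
# objective; CLM and MLM+TLM checkpoints are also published.
name = "xlm-mlm-en-2048"
tokenizer = XLMTokenizer.from_pretrained(name)
model = XLMWithLMHeadModel.from_pretrained(name)
model.eval()

inputs = tokenizer("XLM learns cross-lingual representations.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # (batch, sequence_length, vocab_size)
print(logits.shape)
```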

Source: Cross-lingual Language Model Pretraining

Tasks


Task                      Papers   Share
Language Modelling        17       7.80%
Translation               15       6.88%
Language Modeling         14       6.42%
Sentence                  11       5.05%
Machine Translation       10       4.59%
XLM-R                      8       3.67%
Cross-Lingual Transfer     8       3.67%
Question Answering         8       3.67%
Retrieval                  6       2.75%

Categories

Transformers