XLM-R
93 papers with code • 0 benchmarks • 1 dataset
Most implemented papers
MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers
We generalize deep self-attention distillation in MiniLM (Wang et al., 2020) by only using self-attention relation distillation for task-agnostic compression of pretrained Transformers.
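A minimal sketch of the self-attention relation distillation idea, assuming the query/key/value outputs of one teacher layer and one student layer are already available; the tensor shapes, number of relation heads, and layer choice are illustrative assumptions, not the paper's exact configuration.

```python
# Self-attention relation distillation in the spirit of MiniLMv2 (sketch).
import torch
import torch.nn.functional as F

def relation(x, num_rel_heads):
    # x: (batch, seq_len, hidden) -> per-head scaled dot-product relation distributions
    b, s, h = x.shape
    d = h // num_rel_heads
    x = x.view(b, s, num_rel_heads, d).transpose(1, 2)       # (b, heads, s, d)
    scores = x @ x.transpose(-1, -2) / d ** 0.5               # (b, heads, s, s)
    return F.log_softmax(scores, dim=-1)

def relation_distill_loss(teacher_qkv, student_qkv, num_rel_heads=12):
    # teacher_qkv / student_qkv: tuples of (Q, K, V) tensors from one layer each.
    # KL between teacher and student relation distributions, summed over Q, K, V.
    loss = 0.0
    for t, s in zip(teacher_qkv, student_qkv):
        t_rel = relation(t, num_rel_heads)
        s_rel = relation(s, num_rel_heads)
        loss = loss + F.kl_div(s_rel, t_rel, log_target=True, reduction="batchmean")
    return loss
```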
XeroAlign: Zero-Shot Cross-lingual Transformer Alignment
The introduction of pretrained cross-lingual language models brought decisive improvements to multilingual NLP tasks.
DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing
We thus propose a new gradient-disentangled embedding sharing method that avoids the tug-of-war dynamics, improving both training efficiency and the quality of the pre-trained model.
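A minimal sketch of gradient-disentangled embedding sharing as described for DeBERTaV3: the discriminator reuses the generator's token embeddings through a stop-gradient plus its own residual table, so discriminator gradients cannot pull the shared embeddings against the MLM objective. The vocabulary and hidden sizes below are illustrative assumptions.

```python
# Gradient-disentangled embedding sharing (GDES) sketch.
import torch
import torch.nn as nn

class GDESEmbedding(nn.Module):
    def __init__(self, vocab_size=128100, hidden=768):
        super().__init__()
        self.generator_emb = nn.Embedding(vocab_size, hidden)  # updated only by the generator (MLM) loss
        self.residual_emb = nn.Embedding(vocab_size, hidden)   # updated only by the discriminator loss
        nn.init.zeros_(self.residual_emb.weight)                # start as an exact copy of the shared table

    def generator_forward(self, input_ids):
        return self.generator_emb(input_ids)

    def discriminator_forward(self, input_ids):
        # stop-gradient on the shared table + discriminator-specific delta
        shared = self.generator_emb(input_ids).detach()
        return shared + self.residual_emb(input_ids)
```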
X²-VLM: All-In-One Pre-trained Model For Vision-Language Tasks
Vision language pre-training aims to learn alignments between vision and language from a large amount of data.
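X²-VLM combines several multi-grained objectives; the sketch below only illustrates the generic idea of aligning paired image and text embeddings with a contrastive loss (CLIP/ALBEF style), under assumed embedding shapes, and is not the paper's exact objective.

```python
# Illustrative image-text contrastive alignment loss (not X^2-VLM's full method).
import torch
import torch.nn.functional as F

def image_text_contrastive_loss(image_emb, text_emb, temperature=0.07):
    # image_emb, text_emb: (batch, dim), one matching caption per image
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature            # (batch, batch) similarity matrix
    targets = torch.arange(logits.size(0), device=logits.device)
    # symmetric cross-entropy: match image->text and text->image
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))
```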
XLM-V: Overcoming the Vocabulary Bottleneck in Multilingual Masked Language Models
Large multilingual language models typically rely on a single vocabulary shared across 100+ languages.
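One hedged way to see this bottleneck in practice is to measure tokenizer fertility (subwords per whitespace word) across languages under the shared XLM-R vocabulary; the sample sentences below are illustrative, and higher fertility generally indicates over-segmentation of that language.

```python
# Compare subword fertility of a shared multilingual vocabulary across languages.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")

samples = {
    "en": "The committee approved the proposal yesterday.",
    "fi": "Valiokunta hyväksyi ehdotuksen eilen.",
    "sw": "Kamati iliidhinisha pendekezo hilo jana.",
}

for lang, sentence in samples.items():
    tokens = tokenizer.tokenize(sentence)
    words = sentence.split()
    print(f"{lang}: {len(tokens) / len(words):.2f} subwords per word")
```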
GreekBART: The First Pretrained Greek Sequence-to-Sequence Model
In addition, we examine its performance on two NLG tasks from GreekSUM, a newly introduced summarization dataset for the Greek language.
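A hedged usage sketch for abstractive summarization with a BART-style seq2seq checkpoint such as GreekBART; the checkpoint identifier below is a placeholder assumption, not a verified hub name.

```python
# Abstractive summarization with a BART-style checkpoint (placeholder model id).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "path/or/hub-id-of-greekbart"  # placeholder, substitute the released checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

article = "..."  # a Greek news article, e.g. from GreekSUM
inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=1024)
summary_ids = model.generate(**inputs, max_length=128, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```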
DUMB: A Benchmark for Smart Evaluation of Dutch Models
The benchmark includes a diverse set of datasets for low-, medium- and high-resource tasks.
FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models
However, if we want to use a new tokenizer specialized for the target language, we cannot transfer the source model's embedding matrix.
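A deliberately simplified sketch of the mismatch and an overlap-copy initialization for the new vocabulary; FOCUS itself initializes non-overlapping tokens as similarity-weighted combinations of overlapping ones, which this stand-in does not implement.

```python
# Simplified target-vocabulary embedding initialization (not the full FOCUS method).
import torch

def init_target_embeddings(source_vocab, source_emb, target_vocab):
    # source_vocab / target_vocab: dict token -> id; source_emb: (|V_src|, dim)
    dim = source_emb.size(1)
    target_emb = torch.empty(len(target_vocab), dim)
    mean_emb = source_emb.mean(dim=0)
    for token, tgt_id in target_vocab.items():
        if token in source_vocab:
            target_emb[tgt_id] = source_emb[source_vocab[token]]  # copy overlapping tokens
        else:
            target_emb[tgt_id] = mean_emb                          # crude fallback; FOCUS does better here
    return target_emb
```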
PhoBERT: Pre-trained language models for Vietnamese
We present PhoBERT with two versions, PhoBERT-base and PhoBERT-large, the first public large-scale monolingual language models pre-trained for Vietnamese.
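A minimal usage sketch via Hugging Face Transformers, assuming the publicly released vinai/phobert-base checkpoint; note that PhoBERT expects word-segmented Vietnamese input, so the example sentence is already segmented with underscores.

```python
# Extract contextual features with PhoBERT (assumed checkpoint: vinai/phobert-base).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("vinai/phobert-base")
model = AutoModel.from_pretrained("vinai/phobert-base")

sentence = "Chúng_tôi là những nghiên_cứu_viên ."  # word-segmented Vietnamese
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    features = model(**inputs).last_hidden_state   # contextual token embeddings
print(features.shape)
```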
Inducing Language-Agnostic Multilingual Representations
Cross-lingual representations have the potential to make NLP techniques available to the vast majority of languages in the world.