XLM-R

91 papers with code • 0 benchmarks • 1 dataset

XLM-R (XLM-RoBERTa) is a transformer-based multilingual masked language model from Conneau et al. (2020), pre-trained on filtered CommonCrawl data covering roughly 100 languages and widely used as a backbone for cross-lingual transfer.

Most implemented papers

MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers

PaddlePaddle/PaddleNLP Findings (ACL) 2021

We generalize deep self-attention distillation in MiniLM (Wang et al., 2020) by only using self-attention relation distillation for task-agnostic compression of pretrained Transformers.
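
As a rough illustration of what "self-attention relation distillation" means, the sketch below computes scaled dot-product relations within queries (or keys, or values) for a teacher and a student and matches their distributions with a KL term. Tensor shapes and the relation-head count here are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

def relation_kl(teacher_x, student_x, num_relation_heads):
    """KL divergence between self-relation distributions (Q-Q, K-K or V-V)."""
    def relations(x):
        # x: [batch, seq_len, hidden] -> relation matrix [batch, heads, seq_len, seq_len]
        b, s, h = x.shape
        d = h // num_relation_heads
        x = x.view(b, s, num_relation_heads, d).transpose(1, 2)
        return (x @ x.transpose(-1, -2)) / d ** 0.5

    teacher_rel = F.softmax(relations(teacher_x), dim=-1)
    student_rel = F.log_softmax(relations(student_x), dim=-1)
    return F.kl_div(student_rel, teacher_rel, reduction="batchmean")
```

Because the relation matrices are seq_len × seq_len, teacher and student can have different hidden sizes; the full objective would sum the Q-Q, K-K and V-V terms for a chosen teacher/student layer pair.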

XeroAlign: Zero-Shot Cross-lingual Transformer Alignment

huawei-noah/noah-research Findings (ACL) 2021

The introduction of pretrained cross-lingual language models brought decisive improvements to multilingual NLP tasks.

DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing

microsoft/DeBERTa 18 Nov 2021

We thus propose a new gradient-disentangled embedding sharing method that avoids the tug-of-war dynamics, improving both training efficiency and the quality of the pre-trained model.
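
A simplified reading of "gradient-disentangled embedding sharing" is sketched below: the discriminator reuses the generator's token embedding through a stop-gradient plus its own learned delta, so discriminator gradients never reach the shared table. The module and method names are my own, not identifiers from the DeBERTa codebase.

```python
import torch
import torch.nn as nn

class DisentangledSharedEmbedding(nn.Module):
    def __init__(self, vocab_size, hidden):
        super().__init__()
        self.shared = nn.Embedding(vocab_size, hidden)              # updated by the generator's MLM loss
        self.delta = nn.Parameter(torch.zeros(vocab_size, hidden))  # updated by the discriminator's RTD loss

    def generator_embed(self, input_ids):
        return self.shared(input_ids)

    def discriminator_embed(self, input_ids):
        # detach() keeps discriminator gradients out of the shared table,
        # avoiding the tug-of-war between the MLM and RTD objectives
        weight = self.shared.weight.detach() + self.delta
        return weight[input_ids]
```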

X²-VLM: All-In-One Pre-trained Model For Vision-Language Tasks

zengyan-97/x2-vlm 22 Nov 2022

Vision language pre-training aims to learn alignments between vision and language from a large amount of data.
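
For intuition about what "learning alignments" means, here is a generic image-text contrastive loss of the kind used in vision-language pre-training. X²-VLM itself learns multi-grained alignments, so treat this as an illustration of the broad idea, not the paper's training objective.

```python
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(image_emb, text_emb, temperature=0.07):
    # image_emb, text_emb: [batch, dim], paired row by row
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature           # [batch, batch]
    targets = torch.arange(logits.size(0), device=logits.device)
    # symmetric InfoNCE: match each image to its caption and vice versa
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2
```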

XLM-V: Overcoming the Vocabulary Bottleneck in Multilingual Masked Language Models

facebook/xlm-v-base 25 Jan 2023

Large multilingual language models typically rely on a single vocabulary shared across 100+ languages.
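
A quick back-of-the-envelope calculation shows why a shared vocabulary becomes a bottleneck as it is stretched across many languages. The hidden size, vocabulary sizes, and the even split across 100 languages below are approximations for illustration only.

```python
hidden = 768                               # base-model hidden size (illustrative)
num_languages = 100
for vocab_size in (250_000, 1_000_000):    # roughly XLM-R-scale vs XLM-V-scale vocabularies
    embedding_params = vocab_size * hidden
    print(f"vocab={vocab_size:>9,}  "
          f"embedding params ≈ {embedding_params / 1e6:.0f}M  "
          f"tokens per language if split evenly ≈ {vocab_size // num_languages:,}")
```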

GreekBART: The First Pretrained Greek Sequence-to-Sequence Model

iakovosevdaimon/greekbart 3 Apr 2023

We also examine its performance on two NLG tasks from GreekSUM, a newly introduced summarization dataset for the Greek language.

DUMB: A Benchmark for Smart Evaluation of Dutch Models

wietsedv/dumb 22 May 2023

The benchmark includes a diverse set of datasets for low-, medium- and high-resource tasks.

FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models

konstantinjdobler/focus 23 May 2023

If we want to use a new tokenizer specialized for the target language, however, we cannot directly transfer the source model's embedding matrix.
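
The sketch below illustrates the general idea under assumed names: tokens shared by the old and new tokenizer copy their embeddings directly, and each new token is initialized as a similarity-weighted combination of the shared tokens. `aux_sim` is a hypothetical helper standing in for the auxiliary similarity computation, and the softmax normalization is illustrative rather than the paper's exact choice.

```python
import torch

def init_new_embeddings(old_emb, old_vocab, new_vocab, aux_sim):
    """old_emb: [len(old_vocab), hidden]; *_vocab: token -> row index;
    aux_sim(token, shared_tokens) -> similarity scores over shared tokens (assumed helper)."""
    hidden = old_emb.size(1)
    new_emb = torch.empty(len(new_vocab), hidden)
    shared = [t for t in new_vocab if t in old_vocab]
    shared_rows = torch.stack([old_emb[old_vocab[t]] for t in shared])
    for token, idx in new_vocab.items():
        if token in old_vocab:
            new_emb[idx] = old_emb[old_vocab[token]]                # overlapping token: copy directly
        else:
            weights = torch.softmax(aux_sim(token, shared), dim=0)  # normalization here is illustrative
            new_emb[idx] = weights @ shared_rows                    # similarity-weighted combination
    return new_emb
```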

PhoBERT: Pre-trained language models for Vietnamese

VinAIResearch/PhoBERT Findings (EMNLP) 2020

We present PhoBERT with two versions, PhoBERT-base and PhoBERT-large, the first public large-scale monolingual language models pre-trained for Vietnamese.
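
A minimal usage sketch with Hugging Face transformers, assuming the published vinai/phobert-base checkpoint; PhoBERT expects word-segmented Vietnamese input, so this is a sketch rather than a full pipeline.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("vinai/phobert-base")
model = AutoModel.from_pretrained("vinai/phobert-base")

# Input should already be word-segmented (e.g. with VnCoreNLP),
# so multi-syllable words are joined with underscores as in "sinh_viên".
inputs = tokenizer("Tôi là sinh_viên", return_tensors="pt")
with torch.no_grad():
    features = model(**inputs).last_hidden_state   # [1, seq_len, 768]
```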

Inducing Language-Agnostic Multilingual Representations

AIPHES/Language-Agnostic-Contextualized-Encoders Joint Conference on Lexical and Computational Semantics 2021

Cross-lingual representations have the potential to make NLP techniques available to the vast majority of languages in the world.
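
One simple operation discussed in this line of work is removing the language-specific mean from contextual embeddings so that representations from different languages become more directly comparable. The snippet below is a generic sketch of that re-centering step, not the paper's full method.

```python
import torch

def center_per_language(embeddings):
    """embeddings: dict mapping a language code to a [num_sentences, dim] tensor."""
    return {lang: x - x.mean(dim=0, keepdim=True) for lang, x in embeddings.items()}
```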