Pretrained Multilingual Language Models
12 papers with code • 0 benchmarks • 1 dataset
Most implemented papers
How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models
In this work, we provide a systematic and comprehensive empirical comparison of pretrained multilingual language models versus their monolingual counterparts with regard to their monolingual task performance.
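One quantity often used in such tokenizer comparisons is subword fertility (subword tokens per word). The snippet below is a minimal sketch assuming the Hugging Face transformers library and two standard checkpoints (bert-base-multilingual-cased vs. the English bert-base-cased) purely as examples; it is not the paper's evaluation code.

```python
# Sketch: compare subword "fertility" (tokens per whitespace word) of a
# multilingual vs. a monolingual tokenizer. Model names are examples only.
from transformers import AutoTokenizer

sentences = [
    "Pretrained multilingual language models transfer across languages.",
    "Tokenization quality differs between multilingual and monolingual models.",
]

def fertility(tokenizer, texts):
    """Average number of subword tokens per whitespace-separated word."""
    n_tokens = sum(len(tokenizer.tokenize(t)) for t in texts)
    n_words = sum(len(t.split()) for t in texts)
    return n_tokens / n_words

multi = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
mono = AutoTokenizer.from_pretrained("bert-base-cased")  # English-only counterpart

print("multilingual fertility:", round(fertility(multi, sentences), 2))
print("monolingual fertility:", round(fertility(mono, sentences), 2))
```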
Investigating Math Word Problems using Pretrained Multilingual Language Models
In this paper, we revisit math word problems (MWPs) from the cross-lingual and multilingual perspective.
Specializing Multilingual Language Models: An Empirical Study
Pretrained multilingual language models have become a common tool in transferring NLP capabilities to low-resource languages, often with adaptations.
Improving Word Translation via Two-Stage Contrastive Learning
At Stage C1, we propose to refine standard cross-lingual linear maps between static word embeddings (WEs) via a contrastive learning objective; we also show how to integrate it into the self-learning procedure for even more refined cross-lingual maps.
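As a rough illustration of contrastively refining a cross-lingual linear map between static word embeddings, here is a minimal sketch with random stand-in embeddings and an in-batch InfoNCE-style loss; it is not the authors' exact Stage C1 objective, and all sizes and hyperparameters are illustrative.

```python
# Minimal sketch (not the paper's exact objective): refine a linear map W
# between static source/target word embeddings with an in-batch contrastive
# loss, where aligned translation pairs are positives and the other targets
# in the batch serve as negatives.
import torch
import torch.nn.functional as F

d = 300                      # embedding dimensionality (illustrative)
n_pairs = 5000               # size of a seed translation dictionary (illustrative)
src = F.normalize(torch.randn(n_pairs, d), dim=-1)   # stand-in source WEs
tgt = F.normalize(torch.randn(n_pairs, d), dim=-1)   # stand-in target WEs

W = torch.nn.Parameter(torch.eye(d))                 # linear map, initialized as identity
opt = torch.optim.Adam([W], lr=1e-3)
tau = 0.1                                            # temperature

for step in range(100):
    idx = torch.randperm(n_pairs)[:256]              # sample a batch of translation pairs
    mapped = F.normalize(src[idx] @ W, dim=-1)
    logits = mapped @ tgt[idx].T / tau               # similarity of every mapped source word
    labels = torch.arange(len(idx))                  # to every target word in the batch;
    loss = F.cross_entropy(logits, labels)           # positives sit on the diagonal
    opt.zero_grad(); loss.backward(); opt.step()
```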
To Adapt or to Fine-tune: A Case Study on Abstractive Summarization
Recent advances in the field of abstractive summarization leverage pre-trained language models rather than train a model from scratch.
Robustification of Multilingual Language Models to Real-world Noise in Crosslingual Zero-shot Settings with Robust Contrastive Pretraining
To benchmark the performance of pretrained multilingual language models, we construct noisy datasets covering five languages and four NLP tasks and observe a clear gap in the performance between clean and noisy data in the zero-shot cross-lingual setting.
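As a toy illustration of the kind of character-level noise such robustness benchmarks target, the function below injects simple typo-like perturbations into clean text; it is an assumption for illustration only, not the paper's noise-generation procedure.

```python
# Toy illustration (not the paper's procedure): inject character-level noise
# into clean text to simulate real-world typos for robustness evaluation.
import random

def add_char_noise(text, p=0.05, seed=0):
    """Randomly drop, duplicate, or swap characters with probability p per character."""
    rng = random.Random(seed)
    chars = list(text)
    out = []
    i = 0
    while i < len(chars):
        c = chars[i]
        if rng.random() < p:
            op = rng.choice(["drop", "dup", "swap"])
            if op == "drop":
                pass                           # delete the character
            elif op == "dup":
                out.extend([c, c])             # duplicate it
            elif op == "swap" and i + 1 < len(chars):
                out.extend([chars[i + 1], c])  # swap with the next character
                i += 1
            else:
                out.append(c)
        else:
            out.append(c)
        i += 1
    return "".join(out)

print(add_char_noise("multilingual language models are sensitive to noise"))
```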
Are Pretrained Multilingual Models Equally Fair Across Languages?
Pretrained multilingual language models can help bridge the digital language divide, enabling high-quality NLP models for lower-resourced languages.
Language Agnostic Multilingual Information Retrieval with Contrastive Learning
Multilingual information retrieval (IR) is challenging since annotated training data is costly to obtain in many languages.
Improving Bilingual Lexicon Induction with Cross-Encoder Reranking
This crucial reranking step is done via 1) creating a word similarity dataset comprising positive word pairs (i.e., true translations) and hard negative pairs induced from the original CLWE space, and then 2) fine-tuning an mPLM (e.g., mBERT or XLM-R) in a cross-encoder manner to predict the similarity scores.
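A minimal sketch of step 2, scoring a word pair with an mPLM used as a cross-encoder, is given below; the regression head, loss, and toy word pairs are assumptions for illustration, not the authors' exact fine-tuning setup.

```python
# Rough sketch: score source/target word pairs with an mPLM cross-encoder.
# The single-logit regression head, MSE loss, and toy pairs are assumptions,
# not the paper's exact configuration.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("xlm-roberta-base", num_labels=1)

# Toy training pairs: (source word, target word, similarity label)
pairs = [("dog", "Hund", 1.0),        # true translation (positive)
         ("dog", "Katze", 0.0)]       # hard negative

srcs, tgts, labels = zip(*pairs)
batch = tok(list(srcs), list(tgts), padding=True, return_tensors="pt")
labels = torch.tensor(labels).unsqueeze(-1)

model.train()
out = model(**batch)                                   # both words encoded in one input
loss = torch.nn.functional.mse_loss(out.logits, labels)
loss.backward()                                        # one illustrative gradient step
```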