Cross-Lingual Transfer

129 papers with code • 1 benchmark • 14 datasets

Cross-lingual transfer refers to transfer learning that uses the data and models of a language with ample resources (e.g., English) to solve tasks in another, typically lower-resource, language.

Greatest papers with code

Unsupervised Cross-lingual Representation Learning at Scale

huggingface/transformers ACL 2020

We also present a detailed empirical analysis of the key factors that are required to achieve these gains, including the trade-offs between (1) positive transfer and capacity dilution and (2) the performance of high and low resource languages at scale.

Cross-Lingual Transfer • Language Modelling +2
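
As a quick illustration of what the scaled XLM-R model provides, the sketch below loads the public `xlm-roberta-base` checkpoint through huggingface/transformers and embeds the same sentence in two languages; the mean-pooling and the sentence pair are illustrative choices, not the paper's evaluation setup.

```python
# Minimal sketch: load the public XLM-R checkpoint from huggingface/transformers
# and embed the same sentence in two languages to inspect its shared
# multilingual representation space.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModel.from_pretrained("xlm-roberta-base")

sentences = ["The cat sleeps.", "Die Katze schläft."]  # English / German
batch = tokenizer(sentences, padding=True, return_tensors="pt")

with torch.no_grad():
    # Mean-pool token embeddings into one vector per sentence.
    hidden = model(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1)
    embeddings = (hidden * mask).sum(1) / mask.sum(1)

similarity = torch.cosine_similarity(embeddings[0], embeddings[1], dim=0)
print(f"cross-lingual cosine similarity: {similarity:.3f}")
```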

InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training

microsoft/unilm NAACL 2021

In this work, we present an information-theoretic framework that formulates cross-lingual language model pre-training as maximizing mutual information between multilingual-multi-granularity texts.

Contrastive Learning • Cross-Lingual Transfer +1
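
The mutual-information objective is typically operationalised with an InfoNCE-style contrastive loss over translation pairs; the sketch below is a generic version of that loss, with batch size, embedding dimension, and temperature chosen arbitrarily rather than taken from the paper.

```python
# Sketch of an InfoNCE-style contrastive loss over translation pairs, the
# kind of mutual-information lower bound that cross-lingual contrastive
# pre-training builds on. Shapes and temperature are illustrative assumptions.
import torch
import torch.nn.functional as F

def info_nce(src_emb: torch.Tensor, tgt_emb: torch.Tensor,
             temperature: float = 0.05) -> torch.Tensor:
    """src_emb[i] and tgt_emb[i] encode the two sides of translation pair i."""
    src = F.normalize(src_emb, dim=-1)
    tgt = F.normalize(tgt_emb, dim=-1)
    logits = src @ tgt.T / temperature      # (batch, batch) similarity matrix
    labels = torch.arange(src.size(0))      # positives sit on the diagonal
    return F.cross_entropy(logits, labels)

loss = info_nce(torch.randn(8, 768), torch.randn(8, 768))
```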

Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond

facebookresearch/LASER TACL 2019

We introduce an architecture to learn joint multilingual sentence representations for 93 languages, belonging to more than 30 different families and written in 28 different scripts.

Cross-Lingual Bitext Mining • Cross-Lingual Document Classification +5
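
A minimal way to try these embeddings is the third-party `laserembeddings` package (the official code lives in facebookresearch/LASER); the snippet below embeds one sentence in three languages and compares them, assuming you have installed the package and run `python -m laserembeddings download-models` first.

```python
# Sketch of similarity scoring with LASER sentence embeddings via the
# third-party `laserembeddings` package; LASER maps all languages into
# a single joint embedding space.
import numpy as np
from laserembeddings import Laser

laser = Laser()
vectors = laser.embed_sentences(
    ["Dogs are loyal.", "Les chiens sont fidèles.", "Собаки верны."],
    lang=["en", "fr", "ru"],
)  # -> array of shape (3, 1024)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(vectors[0], vectors[1]), cosine(vectors[0], vectors[2]))
```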

XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalisation

google-research/xtreme ICML 2020

However, these broad-coverage benchmarks have been mostly limited to English, and despite an increasing interest in multilingual models, a benchmark that enables the comprehensive evaluation of such methods on a diverse range of languages and tasks is still missing.

Zero-Shot Cross-Lingual Transfer
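
XTREME's zero-shot protocol fine-tunes a model on English task data only and then evaluates it unchanged in every target language. The sketch below shows the evaluation half on XNLI; the checkpoint path is a placeholder for your own English-fine-tuned model, while the `xnli` dataset configs in Hugging Face `datasets` are real.

```python
# Sketch of zero-shot cross-lingual evaluation on XNLI: a model fine-tuned
# only on English NLI data is scored on other languages with no further
# training. The checkpoint path below is a hypothetical placeholder.
import torch
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer

checkpoint = "path/to/xlmr-finetuned-on-english-nli"  # hypothetical path
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

for lang in ["fr", "es", "sw", "ur"]:  # a few of XNLI's 15 languages
    data = load_dataset("xnli", lang, split="test[:100]")
    batch = tokenizer(data["premise"], data["hypothesis"],
                      padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        preds = model(**batch).logits.argmax(-1)
    acc = (preds == torch.tensor(data["label"])).float().mean().item()
    print(f"{lang}: accuracy {acc:.2%}")
```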

XTREME-R: Towards More Challenging and Nuanced Multilingual Evaluation

google-research/xtreme EMNLP 2021

While a sizeable gap to human-level performance remains, improvements have been easier to achieve in some tasks than in others.

Cross-Lingual Transfer • Language understanding +2

Learning Compact Metrics for MT

google-research/bleurt EMNLP 2021

Recent developments in machine translation and multilingual text generation have led researchers to adopt trained metrics such as COMET or BLEURT, which treat evaluation as a regression problem and use representations from multilingual pre-trained models such as XLM-RoBERTa or mBERT.

Cross-Lingual Transfer • Fine-tuning +5
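
The google-research/bleurt repository exposes a small Python API for scoring candidates against references; the snippet below follows its documented usage, assuming you have installed the package from the repo and downloaded a checkpoint (the `BLEURT-20` path below is wherever you unpacked it).

```python
# Scoring candidate translations with a learned metric from
# google-research/bleurt. Requires a downloaded checkpoint directory.
from bleurt import score

checkpoint = "BLEURT-20"  # path to a downloaded checkpoint directory
references = ["The cat sat on the mat."]
candidates = ["A cat was sitting on the mat."]

scorer = score.BleurtScorer(checkpoint)
scores = scorer.score(references=references, candidates=candidates)
print(scores)  # one regression score per candidate, higher is better
```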

Don't Just Scratch the Surface: Enhancing Word Representations for Korean with Hanja

shin285/KOMORAN IJCNLP 2019

We propose a simple yet effective approach for improving Korean word representations using additional linguistic annotation (i.e., Hanja).

Cross-Lingual Transfer • Transfer Learning
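
As a loose illustration of the idea (not the paper's actual model), one can fuse an embedding of a word's Hangul surface form with an embedding of its Hanja annotation; everything in the sketch below, including the two-word toy vocabulary, is invented for the example.

```python
# Illustrative sketch: enrich a Korean word embedding with an embedding of
# its Hanja annotation by simple summation. Vocabulary and dimensions are
# made up for the example.
import torch
import torch.nn as nn

hangul_vocab = {"학교": 0, "학생": 1}   # surface (Hangul) forms
hanja_vocab = {"學校": 0, "學生": 1}    # corresponding Hanja annotations

hangul_emb = nn.Embedding(len(hangul_vocab), 64)
hanja_emb = nn.Embedding(len(hanja_vocab), 64)

def represent(word: str, hanja: str) -> torch.Tensor:
    """Fuse the two views of the word into one vector."""
    h = hangul_emb(torch.tensor(hangul_vocab[word]))
    j = hanja_emb(torch.tensor(hanja_vocab[hanja]))
    return h + j

vec = represent("학교", "學校")  # 64-dim enriched representation
```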

Word Alignment by Fine-tuning Embeddings on Parallel Corpora

neulab/awesome-align EACL 2021

In addition, we demonstrate that we are able to train multilingual word aligners that can obtain robust performance on different language pairs.

Cross-Lingual Transfer • Fine-tuning +3
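
The underlying recipe, which awesome-align improves by fine-tuning on parallel text, is to align tokens whose contextual embeddings are mutual nearest neighbours across the two sentences; the sketch below applies that idea to off-the-shelf mBERT and skips the sub-word-to-word mapping a real aligner needs.

```python
# Sketch of similarity-based word alignment from multilingual contextual
# embeddings: align token i to j only when each is the other's nearest
# neighbour across the sentence pair.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")

def embed(sentence: str) -> torch.Tensor:
    batch = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        # Drop [CLS]/[SEP] so indices line up with content tokens.
        return model(**batch).last_hidden_state[0, 1:-1]

src, tgt = embed("I love cats"), embed("Ich liebe Katzen")
sim = src @ tgt.T
# Keep (i, j) only if j is the best target for i AND i the best source for j.
forward = sim.argmax(dim=1)
backward = sim.argmax(dim=0)
alignments = [(i, j.item()) for i, j in enumerate(forward) if backward[j] == i]
print(alignments)
```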

Cross-Lingual Natural Language Generation via Pre-Training

CZWin32768/xnlg 23 Sep 2019

In this work we focus on transferring supervision signals of natural language generation (NLG) tasks between multiple languages.

Abstractive Text Summarization • Machine Translation +4
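
The paper's setting, generating in a language other than the supervision language, can be illustrated with any multilingual encoder-decoder; below, the public mBART-50 MT checkpoint stands in for such a model, with the target language imposed through a forced BOS token (this is a generic transformers recipe, not the paper's XNLG architecture).

```python
# Illustration of cross-lingual generation with a multilingual seq2seq model:
# the decoder is forced to start with a target-language token, so the model
# reads English input and generates Chinese output.
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

model_name = "facebook/mbart-large-50-many-to-many-mmt"
tokenizer = MBart50TokenizerFast.from_pretrained(model_name, src_lang="en_XX")
model = MBartForConditionalGeneration.from_pretrained(model_name)

inputs = tokenizer("Cross-lingual transfer reuses English supervision.",
                   return_tensors="pt")
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["zh_CN"],  # generate Chinese
    max_length=48,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```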