Cross-Lingual Transfer

274 papers with code • 1 benchmark • 14 datasets

Cross-lingual transfer refers to transfer learning that uses the data and models available for a resource-rich language (e.g., English) to solve tasks in another, typically lower-resource, language.
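The idea can be illustrated with a deliberately tiny sketch (hypothetical data and a toy nearest-centroid classifier, not any of the models listed below): a classifier is fit only on labeled source-language sentences, then applied unchanged, zero-shot, to a related target language. Shared character trigrams play the role that a shared multilingual representation plays in real systems.

```python
# Toy zero-shot cross-lingual transfer (hypothetical data, not a real model):
# train a nearest-centroid classifier on labeled "source-language" sentences,
# then apply it unchanged to a related target language whose cognates share
# character-trigram features with the source.
from collections import Counter

def featurize(text):
    """Bag of character trigrams: the shared cross-lingual feature space."""
    t = f"  {text.lower()}  "
    return Counter(t[i:i + 3] for i in range(len(t) - 2))

def centroid(examples):
    total = Counter()
    for ex in examples:
        total += featurize(ex)
    return total

def similarity(a, b):
    return sum(a[k] * b[k] for k in a)

# Labeled data exists only in the high-resource source language (English).
train = {
    "positive": ["excellent fantastic film", "a fantastic excellent story"],
    "negative": ["terrible horrible film", "a horrible terrible story"],
}
centroids = {label: centroid(exs) for label, exs in train.items()}

def predict(text):
    feats = featurize(text)
    return max(centroids, key=lambda lbl: similarity(feats, centroids[lbl]))

# Zero-shot: Spanish cognates ("excelente", "terrible") share trigrams
# with the English training data, so no target-language labels are needed.
print(predict("una historia excelente y fantastica"))  # positive
print(predict("una pelicula horrible y terrible"))     # negative
```

Real systems replace the trigram bag with a shared multilingual encoder, but the transfer recipe is the same: fit on the source language, evaluate directly on the target.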



Most implemented papers

Unsupervised Cross-lingual Representation Learning at Scale

facebookresearch/XLM ACL 2020

We also present a detailed empirical analysis of the key factors that are required to achieve these gains, including the trade-offs between (1) positive transfer and capacity dilution and (2) the performance of high and low resource languages at scale.

Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond

facebookresearch/LASER TACL 2019

We introduce an architecture to learn joint multilingual sentence representations for 93 languages, belonging to more than 30 different families and written in 28 different scripts.
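The payoff of such a joint embedding space can be sketched with hand-made vectors (hypothetical 3-d stand-ins, not actual LASER embeddings): sentences with the same meaning sit close together regardless of language, so nearest-neighbor search retrieves translations zero-shot.

```python
# Toy cross-lingual retrieval in a joint sentence-embedding space.
# The 3-d vectors are hand-made stand-ins for real multilingual embeddings:
# position is driven by meaning, not by language.
import math

def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    return num / (math.hypot(*a) * math.hypot(*b))

embeddings = {
    "the cat sleeps": [0.90, 0.10, 0.00],   # English
    "le chat dort":   [0.88, 0.12, 0.05],   # French, same meaning -> nearby
    "the stock fell": [0.00, 0.20, 0.95],   # English, different meaning
}

# Retrieve the nearest neighbor of the English query among other sentences.
query = embeddings["the cat sleeps"]
best = max((s for s in embeddings if s != "the cat sleeps"),
           key=lambda s: cosine(query, embeddings[s]))
print(best)  # le chat dort
```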

Unsupervised Dense Information Retrieval with Contrastive Learning

facebookresearch/contriever 16 Dec 2021

In this work, we explore the limits of contrastive learning as a way to train unsupervised dense retrievers and show that it leads to strong performance in various retrieval settings.
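The contrastive objective behind such dense retrievers is typically an InfoNCE-style loss. A minimal sketch with toy vectors (hypothetical inputs, not the paper's trained encoder): the query is pulled toward its positive passage and pushed away from in-batch negatives.

```python
# Minimal InfoNCE-style contrastive loss on toy 2-d vectors (hypothetical
# inputs, not a trained retriever): the loss is the negative log-softmax
# score of the positive passage among all candidates.
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def info_nce_loss(query, positive, negatives, temperature=0.05):
    """-log p(positive | query, candidates) with temperature-scaled scores."""
    scores = [dot(query, positive) / temperature]
    scores += [dot(query, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidate scores.
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[0]

query    = [1.0, 0.0]
positive = [0.9, 0.1]   # similar to the query
negative = [0.0, 1.0]   # dissimilar in-batch negative
loss = info_nce_loss(query, positive, [negative])
print(f"loss = {loss:.6f}")  # near zero for a well-matched positive
```

In training, gradients of this loss move the encoder so that queries score their positives above the negatives; the low temperature sharpens the softmax over candidates.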

Does Manipulating Tokenization Aid Cross-Lingual Transfer? A Study on POS Tagging for Non-Standardized Languages

mainlp/noisydialect 20 Apr 2023

This can be observed, for instance, when fine-tuning PLMs on one language and evaluating them on data in a closely related language variety with no standardized orthography.

Pushing the Limits of Low-Resource Morphological Inflection

antonisa/inflection IJCNLP 2019

Recent years have seen exceptional strides in the task of automatic morphological inflection generation.

XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization

google-research/xtreme 24 Mar 2020

However, these broad-coverage benchmarks have been mostly limited to English, and despite an increasing interest in multilingual models, a benchmark that enables the comprehensive evaluation of such methods on a diverse range of languages and tasks is still missing.

InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training

microsoft/unilm NAACL 2021

In this work, we present an information-theoretic framework that formulates cross-lingual language model pre-training as maximizing mutual information between multilingual-multi-granularity texts.

Model and Data Transfer for Cross-Lingual Sequence Labelling in Zero-Resource Settings

ikergarcia1996/Easy-Translate 23 Oct 2022

Zero-resource cross-lingual transfer approaches aim to apply supervised models from a source language to unlabelled target languages.

Don't Just Scratch the Surface: Enhancing Word Representations for Korean with Hanja

shin285/KOMORAN IJCNLP 2019

We propose a simple yet effective approach for improving Korean word representations using additional linguistic annotation (i.e., Hanja).

End-to-End Slot Alignment and Recognition for Cross-Lingual NLU

amazon-research/multiatis EMNLP 2020

We introduce MultiATIS++, a new multilingual NLU corpus that extends the Multilingual ATIS corpus to nine languages across four language families, and evaluate our method using the corpus.