Cross-Lingual Transfer

193 papers with code • 1 benchmark • 15 datasets

Cross-lingual transfer refers to transfer learning that uses the data and models available for a resource-rich language (e.g., English) to solve tasks in another, typically lower-resource, language.
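A minimal sketch of the common zero-shot recipe, assuming the Hugging Face transformers library and a generic multilingual encoder: fine-tune on English labels, then run inference directly on a target language with no target-language training data. The checkpoint name and label count below are placeholders.

```python
# Zero-shot cross-lingual transfer sketch (assumes the `transformers` library;
# checkpoint and label count are illustrative placeholders).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "xlm-roberta-base"  # any multilingual encoder works here
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

# 1) Fine-tune on English labeled data (training loop omitted for brevity).
# 2) Evaluate directly on another language without any target-language labels:
batch = tokenizer(["Das Essen war ausgezeichnet."], return_tensors="pt",
                  padding=True, truncation=True)
with torch.no_grad():
    logits = model(**batch).logits
pred = logits.argmax(dim=-1)  # zero-shot prediction for the German sentence
```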

Libraries

Use these libraries to find Cross-Lingual Transfer models and implementations

Most implemented papers

Unsupervised Cross-lingual Representation Learning at Scale

facebookresearch/XLM ACL 2020

We also present a detailed empirical analysis of the key factors that are required to achieve these gains, including the trade-offs between (1) positive transfer and capacity dilution and (2) the performance of high and low resource languages at scale.

Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond

facebookresearch/LASER TACL 2019

We introduce an architecture to learn joint multilingual sentence representations for 93 languages, belonging to more than 30 different families and written in 28 different scripts.
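A rough usage sketch of language-agnostic sentence embeddings in the LASER style, assuming the third-party laserembeddings package (its models must be downloaded first with `python -m laserembeddings download-models`); sentences from different languages land in one shared vector space, so cosine similarity can match translations.

```python
# Sketch: compare sentences across languages in a shared embedding space.
# Assumes the third-party `laserembeddings` package and its downloaded models.
import numpy as np
from laserembeddings import Laser

laser = Laser()
en = laser.embed_sentences(["The cat sits on the mat."], lang="en")
fr = laser.embed_sentences(["Le chat est assis sur le tapis."], lang="fr")

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(en[0], fr[0]))  # high similarity despite different languages
```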

Pushing the Limits of Low-Resource Morphological Inflection

antonisa/inflection IJCNLP 2019

Recent years have seen exceptional strides in the task of automatic morphological inflection generation.

InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training

microsoft/unilm NAACL 2021

In this work, we present an information-theoretic framework that formulates cross-lingual language model pre-training as maximizing mutual information between multilingual-multi-granularity texts.
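The mutual-information view is commonly operationalized with a contrastive (InfoNCE) objective over parallel text. The snippet below is a generic sketch of such a loss with in-batch negatives, not InfoXLM's exact pre-training objective.

```python
# Generic InfoNCE sketch for cross-lingual contrastive training: each sentence's
# translation is its positive, other sentences in the batch are negatives.
# Illustrative only; not InfoXLM's exact implementation.
import torch
import torch.nn.functional as F

def info_nce(src_emb: torch.Tensor, tgt_emb: torch.Tensor, temperature: float = 0.05):
    """src_emb, tgt_emb: (batch, dim) embeddings of parallel sentences."""
    src = F.normalize(src_emb, dim=-1)
    tgt = F.normalize(tgt_emb, dim=-1)
    logits = src @ tgt.t() / temperature           # (batch, batch) similarity matrix
    labels = torch.arange(src.size(0), device=src.device)
    return F.cross_entropy(logits, labels)         # diagonal pairs are the positives
```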

Don't Just Scratch the Surface: Enhancing Word Representations for Korean with Hanja

shin285/KOMORAN IJCNLP 2019

We propose a simple yet effective approach for improving Korean word representations using additional linguistic annotation (i.e., Hanja).

XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization

google-research/xtreme 24 Mar 2020

However, these broad-coverage benchmarks have been mostly limited to English, and despite an increasing interest in multilingual models, a benchmark that enables the comprehensive evaluation of such methods on a diverse range of languages and tasks is still missing.

MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer

cambridgeltl/xcopa EMNLP 2020

The main goal behind state-of-the-art pre-trained multilingual models such as multilingual BERT and XLM-R is enabling and bootstrapping NLP applications in low-resource languages through zero-shot or few-shot cross-lingual transfer.
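MAD-X composes small language and task adapter modules inside a frozen multilingual transformer. A minimal bottleneck adapter in PyTorch looks roughly like the following; this is an illustrative sketch with made-up dimensions, not the MAD-X/AdapterHub implementation.

```python
# Minimal bottleneck adapter sketch (illustrative; MAD-X itself composes
# separately trained language and task adapters inside a frozen backbone).
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    def __init__(self, hidden_size: int = 768, bottleneck: int = 48):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)   # down-projection
        self.up = nn.Linear(bottleneck, hidden_size)     # up-projection
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Residual connection keeps the frozen backbone's representation intact.
        return hidden_states + self.up(self.act(self.down(hidden_states)))
```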

Word Alignment by Fine-tuning Embeddings on Parallel Corpora

neulab/awesome-align EACL 2021

In addition, we demonstrate that we are able to train multilingual word aligners that can obtain robust performance on different language pairs.
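The core idea of embedding-based alignment can be sketched as follows: compute a token-to-token similarity matrix from contextual embeddings of a sentence pair and keep mutually agreeing best matches. This is a simplified illustration; the released awesome-align tool adds fine-tuning objectives and softmax-based extraction on top of this.

```python
# Sketch: word alignment from contextual embeddings via mutual argmax over a
# cosine similarity matrix. Simplified illustration of embedding-based aligners.
import torch
import torch.nn.functional as F

def mutual_argmax_alignments(src_vecs: torch.Tensor, tgt_vecs: torch.Tensor):
    """src_vecs: (m, dim), tgt_vecs: (n, dim) token embeddings of a sentence pair."""
    sim = F.normalize(src_vecs, dim=-1) @ F.normalize(tgt_vecs, dim=-1).t()  # (m, n)
    fwd = sim.argmax(dim=1)   # best target token for each source token
    bwd = sim.argmax(dim=0)   # best source token for each target token
    # Keep only pairs on which both directions agree.
    return [(i, j.item()) for i, j in enumerate(fwd) if bwd[j].item() == i]
```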

Unsupervised Dense Information Retrieval with Contrastive Learning

facebookresearch/contriever 16 Dec 2021

In this work, we explore the limits of contrastive learning as a way to train unsupervised dense retrievers and show that it leads to strong performance in various retrieval settings.
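A short scoring sketch with a Contriever-style encoder, assuming the facebook/contriever checkpoint on the Hugging Face Hub: embed queries and documents by mean pooling over token states and rank documents by dot product.

```python
# Sketch: unsupervised dense retrieval scoring with a Contriever-style encoder
# (assumes the `facebook/contriever` checkpoint and the `transformers` library).
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("facebook/contriever")
model = AutoModel.from_pretrained("facebook/contriever")

def embed(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch).last_hidden_state           # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)         # (B, T, 1)
    return (out * mask).sum(1) / mask.sum(1)             # mean pooling over tokens

query = embed(["Where was Marie Curie born?"])
docs = embed(["Marie Curie was born in Warsaw.", "Paris is the capital of France."])
print(query @ docs.t())                                  # dot-product relevance scores
```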

Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification

ccsasuke/adan TACL 2018

To tackle the sentiment classification problem in low-resource languages without adequate annotated data, we propose an Adversarial Deep Averaging Network (ADAN) to transfer the knowledge learned from labeled data on a resource-rich source language to low-resource languages where only unlabeled data exists.
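The adversarial setup can be sketched as a shared feature extractor over averaged word embeddings, a sentiment classifier trained on source-language labels, and a language discriminator trained through a gradient reversal layer so the shared features become language-invariant. The module below is an illustrative reconstruction with made-up layer sizes, not the authors' implementation.

```python
# Sketch of an ADAN-style adversarial setup: shared encoder over averaged word
# embeddings, sentiment head, and a language discriminator behind gradient
# reversal. Illustrative only; layer sizes are placeholders.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None   # flip gradients into the encoder

class ADANSketch(nn.Module):
    def __init__(self, emb_dim=300, hidden=256, n_classes=2, lambd=1.0):
        super().__init__()
        self.lambd = lambd
        self.encoder = nn.Sequential(nn.Linear(emb_dim, hidden), nn.ReLU())
        self.sentiment = nn.Linear(hidden, n_classes)   # trained on source-language labels
        self.lang_disc = nn.Linear(hidden, 2)           # tries to tell languages apart

    def forward(self, avg_word_embs):
        h = self.encoder(avg_word_embs)                 # features from averaged embeddings
        return self.sentiment(h), self.lang_disc(GradReverse.apply(h, self.lambd))
```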