Using data and models from a language with ample resources (e.g., English) to solve a natural language inference task in another, typically lower-resource, language.
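The setup above can be illustrated with a toy sketch: a model sees English premise/hypothesis pairs at training time and is evaluated zero-shot on pairs in another language, with the same three-way label set. All sentences and the `make_example` helper below are made up for illustration.

```python
# Toy illustration of the zero-shot cross-lingual NLI data format:
# train on English pairs, evaluate on pairs in another language,
# keeping the same three-way label set throughout.

LABELS = ("entailment", "neutral", "contradiction")

def make_example(premise, hypothesis, label, language):
    """Package one NLI example; `label` must be one of LABELS."""
    assert label in LABELS
    return {"premise": premise, "hypothesis": hypothesis,
            "label": label, "language": language}

# English training example ...
train = make_example("A man is playing a guitar.",
                     "A person is making music.",
                     "entailment", "en")

# ... and a zero-shot test example in another language (here French),
# sharing the exact same schema and label inventory.
test = make_example("Un homme joue de la guitare.",
                    "Une personne fait de la musique.",
                    "entailment", "fr")
```

The point is that only the language of the text changes between training and evaluation; the task definition and labels stay fixed.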
This dataset, named FarsTail, includes 10,367 samples, provided both in Persian and in an indexed format useful for non-Persian researchers.
Ranked #1 on Natural Language Inference on FarsTail
Although widely adopted, existing approaches for fine-tuning pre-trained language models have been shown to be unstable across hyper-parameter settings, motivating recent work on trust region methods.
Ranked #1 on Abstractive Text Summarization on CNN / Daily Mail
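One common way to stabilize fine-tuning in the trust-region spirit is to penalize parameters for drifting far from their pre-trained values. The scalar example below is a minimal sketch of that idea, not the cited paper's exact method; the loss, learning rate, and penalty weight are all illustrative assumptions.

```python
# Minimal sketch (NOT the paper's exact method) of a trust-region-style
# fine-tuning penalty: add lam * (theta - theta0)^2 to the task loss,
# anchoring the updated parameter near the pre-trained value theta0.

def penalized_grad_step(theta, theta0, task_grad, lam=0.1, lr=0.1):
    """One gradient step on task loss plus an L2 anchor to theta0."""
    grad = task_grad + 2.0 * lam * (theta - theta0)
    return theta - lr * grad

theta0 = 1.0      # pre-trained parameter (toy scalar)
theta = theta0
for _ in range(100):
    task_grad = 2.0 * (theta - 3.0)   # gradient of task loss (theta - 3)^2
    theta = penalized_grad_step(theta, theta0, task_grad)

# The solution settles between the task optimum (3.0) and theta0 (1.0):
# minimizing (theta - 3)^2 + 0.1 * (theta - 1)^2 gives theta ~= 2.82.
```

The anchor term keeps the fine-tuned solution inside a neighbourhood of the pre-trained weights, which is the intuition behind trust-region approaches to fine-tuning stability.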
While monolingual word embeddings encode information about words in the context of a particular language, cross-lingual embeddings define a multilingual space where word embeddings from two or more languages are represented together.
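Once embeddings from two languages live in one shared space, word translation reduces to nearest-neighbour search. The sketch below illustrates this with tiny made-up vectors; the words, vector values, and `translate` helper are all hypothetical.

```python
# Toy sketch of word translation in a shared cross-lingual space:
# given already-aligned embeddings, translate a word by finding the
# target-language word with the highest cosine similarity.

import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Hypothetical vectors in a shared English/German space.
en = {"cat": (0.9, 0.1, 0.0), "dog": (0.1, 0.9, 0.0)}
de = {"Katze": (0.88, 0.12, 0.05), "Hund": (0.15, 0.85, 0.02)}

def translate(word, src, tgt):
    """Return the target word whose vector is closest to src[word]."""
    return max(tgt, key=lambda w: cosine(src[word], tgt[w]))

# translate("cat", en, de) -> "Katze"
```

Real systems learn the alignment from seed dictionaries or parallel data; here the vectors are simply constructed to be close for translation pairs.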
We introduce an architecture to learn joint multilingual sentence representations for 93 languages, belonging to more than 30 different families and written in 28 different scripts.
Tasks: Cross-Lingual Bitext Mining, Cross-Lingual Document Classification, Cross-Lingual Natural Language Inference, Document Classification, Joint Multilingual Sentence Representations, Parallel Corpus Mining, Zero-Shot Cross-Lingual Transfer
We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers.
Ranked #1 on Question Answering on CoQA
Tasks: Common Sense Reasoning, Conversational Response Selection, Cross-Lingual Natural Language Inference, Named Entity Recognition, Natural Language Understanding, Question Answering, Sentence Classification, Sentiment Analysis
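The "bidirectional" pre-training behind BERT comes from masked language modelling: some tokens are hidden and the model must predict them from context on both sides. The sketch below shows a simplified version of the input-masking step only; real BERT also sometimes keeps the original token or substitutes a random one, and the masking rate here is an assumption.

```python
# Simplified sketch of BERT-style masked language modelling input creation:
# hide a random ~15% of tokens behind a [MASK] symbol and record the
# originals as prediction targets.

import random

def mask_tokens(tokens, mask_rate=0.15, seed=0):
    """Return (masked token list, {position: original token})."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            masked.append("[MASK]")
            targets[i] = tok   # must be predicted from left AND right context
        else:
            masked.append(tok)
    return masked, targets

tokens = "the cat sat on the mat".split()
masked, targets = mask_tokens(tokens)
```

Because the target token can be inferred from context on either side, the encoder is trained to use both directions jointly, unlike a left-to-right language model.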
State-of-the-art natural language processing systems rely on supervision in the form of annotated data to learn competent models.
Ranked #3 on Natural Language Inference on XNLI French
Many modern NLP systems rely on word embeddings, previously trained in an unsupervised manner on large corpora, as base features.
Ranked #1 on Semantic Textual Similarity on SentEval