Cross-Lingual Natural Language Inference

15 papers with code • 4 benchmarks • 2 datasets

Cross-lingual natural language inference uses data and models from a language with ample resources (e.g., English) to solve a natural language inference task in another, typically lower-resource, language.
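The standard zero-shot recipe behind several of the papers below (e.g., LASER and XNLI) is: encode sentences from all languages into one shared embedding space, train an NLI classifier on English pairs only, then apply it unchanged to another language. A minimal sketch of that idea, using made-up toy vectors in place of a real multilingual encoder and the InferSent-style pair features [u; v; |u − v|; u ∗ v] (all sentences, vectors, and names here are illustrative assumptions, not from any of the listed papers):

```python
import math

# Toy "multilingual sentence embeddings": translations of the same sentence
# land near each other in a shared space. These hand-made vectors stand in
# for a real encoder such as LASER.
EMBED = {
    ("en", "A dog runs"):        [1.0, 0.0, 1.0],
    ("en", "An animal moves"):   [0.9, 0.1, 0.9],
    ("en", "A cat sleeps"):      [0.0, 1.0, 1.0],
    ("en", "A pet is resting"):  [0.1, 0.9, 1.0],
    ("en", "Nothing is moving"): [-1.0, 0.0, -1.0],
    ("en", "Everyone is awake"): [0.0, -1.0, -1.0],
    ("fr", "Un chien court"):    [0.95, 0.05, 0.95],
    ("fr", "Un animal bouge"):   [0.9, 0.1, 0.85],
    ("fr", "Rien ne bouge"):     [-0.95, 0.0, -0.9],
}

def features(u, v):
    # InferSent-style pair representation: [u; v; |u - v|; u * v]
    return (u + v
            + [abs(a - b) for a, b in zip(u, v)]
            + [a * b for a, b in zip(u, v)])

# English-only training pairs: (premise, hypothesis, label),
# 1 = entailment, 0 = contradiction (neutral omitted to keep the toy binary).
TRAIN = [
    ("A dog runs", "An animal moves", 1),
    ("A cat sleeps", "A pet is resting", 1),
    ("A dog runs", "Nothing is moving", 0),
    ("A cat sleeps", "Everyone is awake", 0),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=500, lr=0.5):
    # Plain logistic regression over the pair features, trained on English only.
    dim = len(features(EMBED[("en", examples[0][0])], EMBED[("en", examples[0][1])]))
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for prem, hyp, y in examples:
            x = features(EMBED[("en", prem)], EMBED[("en", hyp)])
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = p - y  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(w, b, lang, prem, hyp):
    x = features(EMBED[(lang, prem)], EMBED[(lang, hyp)])
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return "entailment" if p > 0.5 else "contradiction"

w, b = train(TRAIN)
# Zero-shot transfer: the English-trained classifier is applied to French
# pairs it never saw, relying only on the shared embedding space.
print(predict(w, b, "fr", "Un chien court", "Un animal bouge"))
print(predict(w, b, "fr", "Un chien court", "Rien ne bouge"))
```

Because translations share coordinates in the toy space, the English-trained weights transfer: the first French pair comes out as entailment and the second as contradiction, with no French training data involved.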



Most implemented papers

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

google-research/bert NAACL 2019

We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers.

Supervised Learning of Universal Sentence Representations from Natural Language Inference Data

facebookresearch/InferSent EMNLP 2017

Many modern NLP systems rely on word embeddings, previously trained in an unsupervised manner on large corpora, as base features.

Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond

facebookresearch/LASER TACL 2019

We introduce an architecture to learn joint multilingual sentence representations for 93 languages, belonging to more than 30 different families and written in 28 different scripts.

XNLI: Evaluating Cross-lingual Sentence Representations

facebookresearch/XLM EMNLP 2018

State-of-the-art natural language processing systems rely on supervision in the form of annotated data to learn competent models.

ByT5: Towards a token-free future with pre-trained byte-to-byte models

google-research/byt5 28 May 2021

Most widely-used pre-trained language models operate on sequences of tokens corresponding to word or subword units.

Better Fine-Tuning by Reducing Representational Collapse

pytorch/fairseq ICLR 2021

Although widely adopted, existing approaches for fine-tuning pre-trained language models have been shown to be unstable across hyper-parameter settings, motivating recent work on trust region methods.

Rethinking embedding coupling in pre-trained language models

PaddlePaddle/PaddleNLP ICLR 2021

We re-evaluate the standard practice of sharing weights between input and output embeddings in state-of-the-art pre-trained language models.

Meemi: A Simple Method for Post-processing and Integrating Cross-lingual Word Embeddings

yeraidm/meemi 16 Oct 2019

While monolingual word embeddings encode information about words in the context of a particular language, cross-lingual embeddings define a multilingual space where word embeddings from two or more languages are integrated together.

FarsTail: A Persian Natural Language Inference Dataset

dml-qom/FarsTail 18 Sep 2020

This dataset, named FarsTail, includes 10,367 samples provided both in Persian and in an indexed format useful for non-Persian researchers.

Language Embeddings for Typology and Cross-lingual Transfer Learning

DianDYu/language_embeddings ACL 2021

Cross-lingual language tasks typically require a substantial amount of annotated data or parallel translation data.