8 papers with code • 4 benchmarks • 2 datasets
Using data and models from a language for which ample resources are available (e.g., English) to solve a natural language inference task in another, typically lower-resource, language.
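A minimal sketch of this zero-shot cross-lingual setup: a multilingual encoder fine-tuned on English NLI data is applied directly to a sentence pair in another language. The checkpoint name below is an assumption (any multilingual NLI model with a sequence-classification head would work the same way), and the label names are read from the model config rather than hard-coded.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "joeddav/xlm-roberta-large-xnli"  # assumed XNLI-fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)

# French premise/hypothesis pair, even though the NLI supervision was English.
premise = "Le chat dort sur le canapé."
hypothesis = "Un animal est en train de dormir."

inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1).squeeze()

# Label order comes from the model config instead of being assumed.
for idx, p in enumerate(probs.tolist()):
    print(f"{model.config.id2label[idx]}: {p:.3f}")
```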
We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers.
Ranked #1 on Question Answering on CoQA
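A minimal sketch of the standard way BERT is applied to a sentence-pair task such as NLI: the premise and hypothesis are packed into one sequence and a classification head is fine-tuned on top. The three-way label set and the single training step below are illustrative assumptions, not the paper's exact recipe.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3  # entailment / neutral / contradiction
)

# BERT consumes the pair as a single "[CLS] premise [SEP] hypothesis [SEP]" sequence.
batch = tokenizer(
    ["A man is playing a guitar."], ["Someone is making music."],
    padding=True, truncation=True, return_tensors="pt",
)
labels = torch.tensor([0])  # assumed convention: 0 = entailment

outputs = model(**batch, labels=labels)
outputs.loss.backward()  # one illustrative fine-tuning step
print(outputs.logits.softmax(dim=-1))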
Although widely adopted, existing approaches for fine-tuning pre-trained language models have been shown to be unstable across hyper-parameter settings, motivating recent work on trust region methods.
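A minimal sketch of a noise-based smoothness regularizer in the spirit of such trust-region-style fine-tuning: the model is run on clean and perturbed input embeddings, and a symmetric KL term penalizes output drift. The model interface (a Hugging Face sequence-classification model), `eps`, and `lam` below are illustrative assumptions, not the paper's exact method.

```python
import torch
import torch.nn.functional as F

def smoothness_loss(model, input_ids, attention_mask, labels, eps=1e-5, lam=1.0):
    """Cross-entropy plus a symmetric-KL penalty between clean and noised runs."""
    embeddings = model.get_input_embeddings()(input_ids)
    clean_logits = model(inputs_embeds=embeddings, attention_mask=attention_mask).logits

    noise = torch.empty_like(embeddings).uniform_(-eps, eps)
    noisy_logits = model(inputs_embeds=embeddings + noise, attention_mask=attention_mask).logits

    ce = F.cross_entropy(clean_logits, labels)
    p = F.log_softmax(clean_logits, dim=-1)
    q = F.log_softmax(noisy_logits, dim=-1)
    sym_kl = 0.5 * (F.kl_div(q, p, log_target=True, reduction="batchmean")
                    + F.kl_div(p, q, log_target=True, reduction="batchmean"))
    return ce + lam * sym_kl
```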
We introduce an architecture to learn joint multilingual sentence representations for 93 languages, belonging to more than 30 different families and written in 28 different scripts.
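A minimal sketch of what such joint multilingual sentence representations enable: paraphrases in different languages should land close together in the shared space. The `laserembeddings` package used below is a third-party wrapper (an assumption; the original toolkit is facebookresearch/LASER) and requires its models to be downloaded first.

```python
import numpy as np
from laserembeddings import Laser  # pip install laserembeddings, then download its models

laser = Laser()
sentences = ["The weather is nice today.", "Il fait beau aujourd'hui.", "I like trains."]
langs = ["en", "fr", "en"]
emb = laser.embed_sentences(sentences, lang=langs)  # shape: (3, 1024)

def cos(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print("en vs fr (paraphrase):", cos(emb[0], emb[1]))
print("en vs en (unrelated): ", cos(emb[0], emb[2]))
```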
State-of-the-art natural language processing systems rely on supervision in the form of annotated data to learn competent models.
Ranked #3 on Natural Language Inference on XNLI French
Many modern NLP systems rely on word embeddings, previously trained in an unsupervised manner on large corpora, as base features.
Ranked #1 on Semantic Textual Similarity on SentEval
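A minimal sketch of using pre-trained word embeddings as base features: each sentence is represented by the average of its word vectors, which can then feed a downstream classifier. The GloVe model name is one of gensim's downloadable pre-trained vector sets; the averaging scheme is an illustrative choice.

```python
import numpy as np
import gensim.downloader as api

wv = api.load("glove-wiki-gigaword-100")  # word vectors pre-trained without supervision

def sentence_features(sentence):
    """Average the embeddings of in-vocabulary tokens; zeros if none are known."""
    vecs = [tok_vec for tok in sentence.lower().split() if tok in wv for tok_vec in [wv[tok]]]
    return np.mean(vecs, axis=0) if vecs else np.zeros(wv.vector_size)

x = sentence_features("Word embeddings are used as base features")
print(x.shape)  # (100,)
```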
In this paper, we show that a standard Transformer architecture can be used with minimal modifications to process byte sequences.
Ranked #1 on Question Answering on TweetQA (ROUGE-L metric)
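A minimal sketch of what "processing byte sequences" means in practice, in the style of ByT5: the raw UTF-8 bytes of the text serve as token ids, shifted by a few reserved positions for special tokens, so no learned vocabulary is needed. The exact reserved ids below follow a common convention but are assumptions, not the model's spec.

```python
PAD_ID, EOS_ID, OFFSET = 0, 1, 3  # ids 0..2 reserved; real bytes start at 3

def encode(text: str) -> list[int]:
    return [b + OFFSET for b in text.encode("utf-8")] + [EOS_ID]

def decode(ids: list[int]) -> str:
    body = bytes(i - OFFSET for i in ids if i >= OFFSET)
    return body.decode("utf-8", errors="ignore")

ids = encode("héllo")  # non-ASCII characters simply become more bytes
print(ids)             # [107, 198, 172, 111, 111, 114, 1]
print(decode(ids))     # "héllo"
```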
This dataset, named FarsTail, includes 10,367 samples, provided both in the Persian language and in an indexed format so that it is also useful for non-Persian researchers.
Ranked #1 on Natural Language Inference on FarsTail
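A minimal sketch of loading an NLI split such as FarsTail with pandas. The file name, separator, and column names below are assumptions about the released format; adjust them to match the actual distribution.

```python
import pandas as pd

df = pd.read_csv("Train-word.csv", sep="\t")  # assumed tab-separated split file
print(df.columns.tolist())                    # e.g. premise / hypothesis / label
print(df.iloc[0])                             # inspect one premise-hypothesis pair
print(df["label"].value_counts())             # class balance across the split
```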
While monolingual word embeddings encode information about words in the context of a particular language, cross-lingual embeddings define a multilingual space where word embeddings from two or more languages are integrated.
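A minimal sketch of one classic way to build such a shared space: orthogonal (Procrustes) alignment of two monolingual embedding matrices over a bilingual seed dictionary. The random matrices below only stand in for real source- and target-language vectors of the dictionary pairs.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 300))   # source-language vectors for dictionary pairs
Y = rng.normal(size=(2000, 300))   # target-language vectors for the same pairs

# Orthogonal Procrustes: W = U V^T, where U S V^T is the SVD of Y^T X,
# minimizes ||X W^T - Y|| over orthogonal W.
U, _, Vt = np.linalg.svd(Y.T @ X)
W = U @ Vt

mapped = X @ W.T                   # source vectors expressed in the target space
print(np.linalg.norm(mapped - Y))  # alignment residual on the seed dictionary
```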