Natural language inference is the task of determining whether a "hypothesis" is true (entailment), false (contradiction), or undetermined (neutral) given a "premise".
Example:
| Premise | Label | Hypothesis |
|---|---|---|
| A man inspects the uniform of a figure in some East Asian country. | contradiction | The man is sleeping. |
| An older and younger man smiling. | neutral | Two men are smiling and laughing at the cats playing on the floor. |
| A soccer game with multiple males playing. | entailment | Some men are playing a sport. |
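In practice, an NLI system scores the (premise, hypothesis) pair jointly with a pre-trained encoder. The sketch below is a minimal example assuming the `transformers` library and the publicly available `roberta-large-mnli` checkpoint; the label ordering should be read from the loaded model's config rather than assumed.

```python
# Minimal NLI inference sketch with an off-the-shelf checkpoint (assumed:
# "roberta-large-mnli"); check model.config.id2label for the label order.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "roberta-large-mnli"  # assumed checkpoint; any NLI model works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

premise = "A soccer game with multiple males playing."
hypothesis = "Some men are playing a sport."

# Encode the (premise, hypothesis) pair as a single sequence-pair input.
inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

probs = logits.softmax(dim=-1).squeeze()
label = model.config.id2label[int(probs.argmax())]
print(label, probs.tolist())  # expected: entailment for this pair
```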
Finally, we are releasing our multilingual sparse word representations for the set of 27 typologically diverse languages on which we conducted our experiments.
Current state-of-the-art results in multilingual natural language inference (NLI) are based on tuning XLM (a pre-trained polyglot language model) separately for each language involved, resulting in multiple models.
However, inference in these large-capacity models is prohibitively slow and expensive.
Recently, the pre-trained language model BERT (and its robustly optimized version, RoBERTa) has attracted considerable attention in natural language understanding (NLU), achieving state-of-the-art accuracy on various NLU tasks such as sentiment classification, natural language inference, semantic textual similarity, and question answering.
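To make the fine-tuning recipe concrete, here is a minimal sketch (not any specific paper's setup) of adapting a pre-trained BERT checkpoint to sentence-pair classification for NLI with the `transformers` library; the toy batch and label id are placeholders for a real dataset such as SNLI or MultiNLI.

```python
# Sketch: fine-tune BERT for sentence-pair classification. The premise and
# hypothesis are packed into one input ("[CLS] premise [SEP] hypothesis [SEP]")
# and a classification head over the pooled representation predicts the label.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3  # entailment / neutral / contradiction
)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# Toy batch; in practice this would iterate over SNLI/MultiNLI.
premises = ["A soccer game with multiple males playing."]
hypotheses = ["Some men are playing a sport."]
labels = torch.tensor([0])  # assumed label id for "entailment"

batch = tokenizer(premises, hypotheses, padding=True, truncation=True,
                  return_tensors="pt")
outputs = model(**batch, labels=labels)
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
```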
In natural language inference, the semantics of some words do not affect the inference.
However, while machine performance on these benchmarks has advanced rapidly, often surpassing human performance, models still struggle on the target tasks in the wild.
Recent work has exhibited the surprising cross-lingual abilities of multilingual BERT (M-BERT): surprising because it is trained without any cross-lingual objective and with no aligned data.
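One way to see this behavior is to compare sentence representations across languages without any fine-tuning. The snippet below is only an illustrative probe (not the cited papers' evaluation), assuming the `bert-base-multilingual-cased` checkpoint and simple mean pooling.

```python
# Illustrative probe: mean-pooled M-BERT representations of a sentence and its
# translation tend to be closer than those of unrelated sentences, even though
# the model was trained with no aligned data.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")

def embed(text):
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state  # (1, seq_len, dim)
    mask = enc["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(1) / mask.sum(1)  # mean over real tokens

en = embed("Some men are playing a sport.")
es = embed("Algunos hombres están practicando un deporte.")  # translation
unrelated = embed("The stock market closed lower today.")

cos = torch.nn.functional.cosine_similarity
print(cos(en, es).item(), cos(en, unrelated).item())
```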
Large datasets on natural language inference are a potentially valuable resource for inducing semantic representations of natural language sentences.
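A common recipe for doing so, in the spirit of InferSent and Sentence-BERT, encodes the premise and hypothesis with a shared encoder and trains a classifier on the concatenation [u; v; |u − v|]; after training, the encoder is reused as a general-purpose sentence embedder. The sketch below uses a toy LSTM encoder and random token ids purely for illustration.

```python
# Sketch: learn sentence embeddings from NLI supervision with a shared
# (Siamese) encoder. Encoder architecture, dimensions, and data are toy
# assumptions standing in for a real NLI corpus.
import torch
import torch.nn as nn

class SentenceEncoder(nn.Module):
    def __init__(self, vocab_size=10000, dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim, padding_idx=0)
        self.lstm = nn.LSTM(dim, dim, batch_first=True)

    def forward(self, token_ids):                  # (batch, seq_len)
        hidden, _ = self.lstm(self.emb(token_ids))
        return hidden.max(dim=1).values            # max-pool over time

encoder = SentenceEncoder()
classifier = nn.Linear(3 * 128, 3)                 # entailment/neutral/contradiction

premise = torch.randint(1, 10000, (4, 12))         # toy batch of token ids
hypothesis = torch.randint(1, 10000, (4, 10))
labels = torch.randint(0, 3, (4,))

u, v = encoder(premise), encoder(hypothesis)
logits = classifier(torch.cat([u, v, (u - v).abs()], dim=-1))
loss = nn.functional.cross_entropy(logits, labels)
loss.backward()   # after training, `encoder` serves as a sentence embedder
```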
Several recent studies have shown that strong natural language understanding (NLU) models are prone to relying on unwanted dataset biases without learning the underlying task; as a result, they fail to generalize to out-of-domain datasets and are likely to perform poorly in real-world scenarios.
Distributionally robust optimization (DRO) provides an approach for learning models that minimize the worst-case training loss over a set of pre-defined groups, rather than the average loss over all examples.
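Concretely, where standard training backpropagates the batch-average loss, a group DRO objective backpropagates the loss of the worst-performing group. The sketch below shows a hard-max variant (practical implementations often use a softer, exponentiated-gradient weighting over groups); the model, features, and group assignments are placeholders.

```python
# Sketch of a group DRO objective: compute the mean loss within each
# pre-defined group and backpropagate the worst (largest) group loss.
import torch
import torch.nn.functional as F

def group_dro_loss(logits, labels, group_ids, num_groups):
    per_example = F.cross_entropy(logits, labels, reduction="none")
    group_losses = []
    for g in range(num_groups):
        mask = group_ids == g
        if mask.any():
            group_losses.append(per_example[mask].mean())
    return torch.stack(group_losses).max()  # worst-case group loss

# Toy usage with a linear model on random features.
model = torch.nn.Linear(16, 3)
x = torch.randn(32, 16)
y = torch.randint(0, 3, (32,))
groups = torch.randint(0, 4, (32,))       # e.g., domain- or bias-defined groups

loss = group_dro_loss(model(x), y, groups, num_groups=4)
loss.backward()
```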