Natural language inference is the task of determining whether a "hypothesis" is true (entailment), false (contradiction), or undetermined (neutral) given a "premise".
|Premise||Label||Hypothesis|
|A man inspects the uniform of a figure in some East Asian country.||contradiction||The man is sleeping.|
|An older and younger man smiling.||neutral||Two men are smiling and laughing at the cats playing on the floor.|
|A soccer game with multiple males playing.||entailment||Some men are playing a sport.|
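The three-way labelling above can be sketched as a small data structure. This is a minimal illustration only: the names `Label`, `NLIExample`, `EXAMPLES`, and `accuracy` are hypothetical and not taken from any particular dataset loader or library. It encodes the three rows of the table and scores a list of predicted labels against them.

```python
from dataclasses import dataclass
from enum import Enum


class Label(Enum):
    """The three NLI classes described above (hypothetical enum name)."""
    ENTAILMENT = "entailment"
    CONTRADICTION = "contradiction"
    NEUTRAL = "neutral"


@dataclass(frozen=True)
class NLIExample:
    """One premise/hypothesis pair with its gold label (hypothetical type)."""
    premise: str
    hypothesis: str
    label: Label


# The three rows of the example table.
EXAMPLES = [
    NLIExample(
        "A man inspects the uniform of a figure in some East Asian country.",
        "The man is sleeping.",
        Label.CONTRADICTION,
    ),
    NLIExample(
        "An older and younger man smiling.",
        "Two men are smiling and laughing at the cats playing on the floor.",
        Label.NEUTRAL,
    ),
    NLIExample(
        "A soccer game with multiple males playing.",
        "Some men are playing a sport.",
        Label.ENTAILMENT,
    ),
]


def accuracy(predictions, examples):
    """Return the fraction of examples whose prediction equals the gold label."""
    if len(predictions) != len(examples):
        raise ValueError("predictions and examples must have the same length")
    if not examples:
        raise ValueError("cannot compute accuracy on an empty list")
    correct = sum(pred == ex.label for pred, ex in zip(predictions, examples))
    return correct / len(examples)
```

A model for this task would map each `(premise, hypothesis)` pair to one of the three `Label` values; `accuracy` then compares those predictions with the gold labels.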
Finally, we are releasing our multilingual sparse word representations for the typologically diverse set of 27 languages on which we conducted our experiments.
Current state-of-the-art results in multilingual natural language inference (NLI) are based on tuning XLM (a pre-trained polyglot language model) separately for each language involved, resulting in multiple models.
However, inference in these large-capacity models is prohibitively slow and expensive.
Recently, the pre-trained language model BERT (and its robustly optimized version, RoBERTa) has attracted a lot of attention in natural language understanding (NLU) and has achieved state-of-the-art accuracy in various NLU tasks, such as sentiment classification, natural language inference, semantic textual similarity, and question answering.
In natural language inference, the semantics of some words do not affect the inference.
Recent work has exhibited the surprising cross-lingual abilities of multilingual BERT (M-BERT) -- surprising since it is trained without any cross-lingual objective and with no aligned data.
Large datasets on natural language inference are a potentially valuable resource for inducing semantic representations of natural language sentences.
There have been several studies recently showing that strong natural language understanding (NLU) models are prone to relying on unwanted dataset biases without learning the underlying task, resulting in models which fail to generalize to out-of-domain datasets, and are likely to perform poorly in real-world scenarios.