Supervised Learning of Universal Sentence Representations from Natural Language Inference Data

EMNLP 2017. Alexis Conneau, Douwe Kiela, Holger Schwenk, Loic Barrault, Antoine Bordes

Many modern NLP systems rely on word embeddings, previously trained in an unsupervised manner on large corpora, as base features. Efforts to obtain embeddings for larger chunks of text, such as sentences, have however not been so successful...

Evaluation results from the paper


| Task | Dataset | Model | Metric name | Metric value | Global rank |
|---|---|---|---|---|---|
| Semantic Textual Similarity | SentEval | InferSent | MRPC | 76.2/83.1 | #1 |
| Semantic Textual Similarity | SentEval | InferSent | SICK-R | 0.884 | #2 |
| Semantic Textual Similarity | SentEval | InferSent | SICK-E | 86.3 | #2 |
| Semantic Textual Similarity | SentEval | InferSent | STS | 75.8/75.5 | #1 |
| Natural Language Inference | SNLI | 4096D BiLSTM with max-pooling | % Test Accuracy | 84.5 | #37 |
| Natural Language Inference | SNLI | 4096D BiLSTM with max-pooling | % Train Accuracy | 85.6 | #46 |
| Natural Language Inference | SNLI | 4096D BiLSTM with max-pooling | Parameters | 40m | #1 |
| Cross-Lingual Natural Language Inference | XNLI Zero-Shot English-to-French | X-CBOW | Accuracy | 60.3% | #3 |
| Cross-Lingual Natural Language Inference | XNLI Zero-Shot English-to-French | X-BiLSTM | Accuracy | 67.7% | #2 |
| Cross-Lingual Natural Language Inference | XNLI Zero-Shot English-to-German | X-CBOW | Accuracy | 61.0% | #4 |
| Cross-Lingual Natural Language Inference | XNLI Zero-Shot English-to-German | X-BiLSTM | Accuracy | 67.7% | #3 |
| Cross-Lingual Natural Language Inference | XNLI Zero-Shot English-to-Spanish | X-BiLSTM | Accuracy | 68.7% | #3 |
| Cross-Lingual Natural Language Inference | XNLI Zero-Shot English-to-Spanish | X-CBOW | Accuracy | 60.7% | #4 |