MTEB: Massive Text Embedding Benchmark

13 Oct 2022  ·  Niklas Muennighoff, Nouamane Tazi, Loïc Magne, Nils Reimers

Text embeddings are commonly evaluated on a small set of datasets from a single task, which does not cover their possible applications to other tasks. It is unclear whether state-of-the-art embeddings on semantic textual similarity (STS) can be applied equally well to other tasks such as clustering or reranking. This makes progress in the field difficult to track, as new models are constantly being proposed without proper evaluation. To solve this problem, we introduce the Massive Text Embedding Benchmark (MTEB). MTEB spans 8 embedding tasks covering a total of 58 datasets and 112 languages. By benchmarking 33 models on MTEB, we establish the most comprehensive benchmark of text embeddings to date. We find that no particular text embedding method dominates across all tasks. This suggests that the field has yet to converge on a universal text embedding method and scale it up sufficiently to provide state-of-the-art results on all embedding tasks. MTEB comes with open-source code and a public leaderboard at https://github.com/embeddings-benchmark/mteb.
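
The claim that no single method dominates can be checked against the results table on this page. The following is a minimal sketch, not part of the MTEB codebase: it copies a handful of (task, model, score) rows from that table and picks the highest score per task. The helper name `best_per_task` and the choice of rows are illustrative assumptions, and the subset covers only some of the listed models and tasks.

```python
# Illustrative subset of rows copied from the results table on this page.
# Each entry is (task, model, score); the metric differs per task, so scores
# are only compared within a task, never across tasks.
rows = [
    ("Text Clustering", "ST5-XXL", 43.71),
    ("Text Clustering", "GTR-XXL", 42.42),
    ("Text Pair Classification", "GTR-XL", 86.13),
    ("Text Pair Classification", "ST5-XXL", 85.06),
    ("Text Retrieval", "SGPT-5.8B-msmarco", 50.25),
    ("Text Retrieval", "GTR-XL", 47.96),
    ("Text Retrieval", "ST5-XXL", 42.24),
    ("Text Reranking", "MPNet", 59.36),
    ("Text Reranking", "SGPT-5.8B-msmarco", 56.65),
    ("Text Reranking", "ST5-XXL", 56.43),
    ("Text Reranking", "GTR-XL", 55.96),
    ("Semantic Textual Similarity", "ST5-XXL", 82.63),
    ("Semantic Textual Similarity", "MPNet", 80.28),
    ("Text Summarization", "MPNet-multilingual", 31.57),
    ("Text Summarization", "ST5-XXL", 30.08),
]


def best_per_task(rows):
    """Return {task: (model, score)} for the highest score within each task."""
    best = {}
    for task, model, score in rows:
        if task not in best or score > best[task][1]:
            best[task] = (model, score)
    return best


winners = best_per_task(rows)
distinct_winners = {model for model, _ in winners.values()}

for task, (model, score) in winners.items():
    print(f"{task}: {model} ({score})")
print(f"{len(distinct_winners)} different models lead {len(winners)} tasks")
```

On this subset, five different models lead the six tasks, which matches the "# 1" entries in the Global Rank column of the table.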

Results from the Paper


Task Dataset Model Metric Name Metric Value Global Rank
Text Clustering MTEB ST5-XXL V-Measure 43.71 # 1
Text Pair Classification MTEB GTR-XL AP 86.13 # 1
Text Retrieval MTEB SGPT-5.8B-msmarco nDCG@10 50.25 # 1
Text Classification MTEB ST5-XXL Accuracy 73.42 # 1
Text Summarization MTEB ST5-XXL Spearman Correlation 30.08 # 11
Text Summarization MTEB ST5-XL Spearman Correlation 29.91 # 12
Text Summarization MTEB ST5-Large Spearman Correlation 29.64 # 15
Text Summarization MTEB ST5-Base Spearman Correlation 31.39 # 2
Text Summarization MTEB GTR-XXL Spearman Correlation 30.64 # 6
Text Summarization MTEB GTR-XL Spearman Correlation 30.21 # 10
Text Summarization MTEB GTR-Base Spearman Correlation 29.67 # 14
Text Summarization MTEB SGPT-BLOOM-7.1B-msmarco Spearman Correlation 24.99 # 24
Text Summarization MTEB SGPT-5.8B-msmarco Spearman Correlation 24.75 # 25
Text Summarization MTEB SGPT-1.3B-msmarco Spearman Correlation 25.44 # 23
Text Summarization MTEB SGPT-125M-nli Spearman Correlation 30.26 # 9
Text Summarization MTEB Ada Similarity Spearman Correlation 26.94 # 21
Text Summarization MTEB MPNet-multilingual Spearman Correlation 31.57 # 1
Text Summarization MTEB MPNet Spearman Correlation 27.49 # 20
Text Summarization MTEB MiniLM-L12-multilingual Spearman Correlation 30.67 # 5
Text Summarization MTEB MiniLM-L12 Spearman Correlation 27.9 # 18
Text Summarization MTEB MiniLM-L6 Spearman Correlation 30.81 # 4
Text Summarization MTEB LASER2 Spearman Correlation 26.8 # 22
Text Summarization MTEB SPECTER Spearman Correlation 27.66 # 19
Text Summarization MTEB Contriever Spearman Correlation 30.36 # 8
Text Summarization MTEB coCondenser-msmarco Spearman Correlation 29.5 # 16
Text Summarization MTEB SimCSE-BERT-sup Spearman Correlation 23.31 # 26
Semantic Textual Similarity MTEB ST5-XXL Spearman Correlation 82.63 # 1
Semantic Textual Similarity MTEB ST5-XL Spearman Correlation 81.66 # 3
Semantic Textual Similarity MTEB ST5-Large Spearman Correlation 81.83 # 2
Semantic Textual Similarity MTEB ST5-Base Spearman Correlation 81.14 # 4
Semantic Textual Similarity MTEB GTR-XXL Spearman Correlation 78.38 # 12
Semantic Textual Similarity MTEB GTR-XL Spearman Correlation 77.8 # 15
Semantic Textual Similarity MTEB GTR-Large Spearman Correlation 78.19 # 13
Semantic Textual Similarity MTEB GTR-Base Spearman Correlation 77.07 # 17
Semantic Textual Similarity MTEB SGPT-BLOOM-7.1B-msmarco Spearman Correlation 77.74 # 16
Semantic Textual Similarity MTEB SGPT-5.8B-msmarco Spearman Correlation 78.1 # 14
Semantic Textual Similarity MTEB SGPT-2.7B-msmarco Spearman Correlation 76.83 # 18
Semantic Textual Similarity MTEB SGPT-1.3B-msmarco Spearman Correlation 75.74 # 20
Semantic Textual Similarity MTEB SGPT-125M-msmarco Spearman Correlation 73.41 # 23
Semantic Textual Similarity MTEB SGPT-5.8B-nli Spearman Correlation 80.53 # 6
Semantic Textual Similarity MTEB SGPT-125M-nli Spearman Correlation 74.71 # 21
Semantic Textual Similarity MTEB Ada Similarity Spearman Correlation 78.6 # 11
Semantic Textual Similarity MTEB MPNet-multilingual Spearman Correlation 80.73 # 5
Semantic Textual Similarity MTEB MPNet Spearman Correlation 80.28 # 7
Semantic Textual Similarity MTEB MiniLM-L12 Spearman Correlation 79.8 # 8
Semantic Textual Similarity MTEB MiniLM-L6 Spearman Correlation 78.92 # 10
Semantic Textual Similarity MTEB LASER2 Spearman Correlation 55.32 # 28
Semantic Textual Similarity MTEB LaBSE Spearman Correlation 70.8 # 24
Semantic Textual Similarity MTEB SPECTER Spearman Correlation 61.02 # 27
Semantic Textual Similarity MTEB coCondenser-msmarco Spearman Correlation 76.47 # 19
Semantic Textual Similarity MTEB SimCSE-BERT-sup Spearman Correlation 79.12 # 9
Text Retrieval MTEB ST5-XXL nDCG@10 42.24 # 11
Text Retrieval MTEB ST5-XL nDCG@10 38.47 # 14
Text Reranking MTEB ST5-XL mAP 54.71 # 9
Text Pair Classification MTEB ST5-XL AP 86.06 # 2
Text Retrieval MTEB ST5-Large nDCG@10 36.71 # 16
Text Retrieval MTEB ST5-Base nDCG@10 33.63 # 18
Text Retrieval MTEB GTR-XXL nDCG@10 48.48 # 2
Text Retrieval MTEB GTR-XL nDCG@10 47.96 # 4
Text Retrieval MTEB GTR-Large nDCG@10 47.42 # 5
Text Retrieval MTEB GTR-Base nDCG@10 44.67 # 7
Text Retrieval MTEB SGPT-BLOOM-7.1B-msmarco nDCG@10 48.21 # 3
Information Retrieval MTEB SGPT-5.8B-msmarco nDCG@10 50.25 # 1
Text Retrieval MTEB SGPT-2.7B-msmarco nDCG@10 46.54 # 6
Text Retrieval MTEB SGPT-1.3B-msmarco nDCG@10 44.49 # 8
Text Retrieval MTEB SGPT-125M-msmarco nDCG@10 37.04 # 15
Text Retrieval MTEB SGPT-5.8B-nli nDCG@10 32.34 # 21
Text Retrieval MTEB SGPT-125M-nli nDCG@10 20.9 # 25
Text Retrieval MTEB MPNet-multilingual nDCG@10 35.34 # 17
Text Retrieval MTEB MPNet nDCG@10 43.81 # 9
Text Retrieval MTEB MiniLM-L12-multilingual nDCG@10 32.45 # 20
Text Retrieval MTEB MiniLM-L12 nDCG@10 42.69 # 10
Text Retrieval MTEB MiniLM-L6 nDCG@10 41.95 # 12
Text Retrieval MTEB LASER2 nDCG@10 7.93 # 30
Text Retrieval MTEB LaBSE nDCG@10 18.99 # 27
Text Retrieval MTEB SPECTER nDCG@10 15.88 # 28
Text Retrieval MTEB Contriever nDCG@10 41.88 # 13
Text Retrieval MTEB coCondenser-msmarco nDCG@10 32.96 # 19
Text Retrieval MTEB SimCSE-BERT-sup nDCG@10 21.82 # 22
Text Summarization MTEB Komninos Spearman Correlation 30.49 # 7
Text Summarization MTEB SimCSE-BERT-unsup Spearman Correlation 31.15 # 3
Text Summarization MTEB BERT Spearman Correlation 29.82 # 13
Text Summarization MTEB Glove Spearman Correlation 28.87 # 17
Semantic Textual Similarity MTEB SimCSE-BERT-unsup Spearman Correlation 74.33 # 22
Semantic Textual Similarity MTEB BERT Spearman Correlation 54.36 # 29
Semantic Textual Similarity MTEB Komninos Spearman Correlation 62.47 # 25
Semantic Textual Similarity MTEB Glove Spearman Correlation 61.85 # 26
Text Retrieval MTEB SimCSE-BERT-unsup nDCG@10 20.29 # 26
Text Retrieval MTEB BERT nDCG@10 10.59 # 29
Text Retrieval MTEB Komninos nDCG@10 21.22 # 24
Text Retrieval MTEB Glove nDCG@10 21.62 # 23
Text Reranking MTEB GTR-XL mAP 55.96 # 6
Text Reranking MTEB ST5-XXL mAP 56.43 # 5
Text Reranking MTEB ST5-Large mAP 54 # 11
Text Reranking MTEB GTR-Large mAP 55.36 # 8
Text Reranking MTEB GTR-Base mAP 54.23 # 10
Text Reranking MTEB SGPT-BLOOM-7.1B-msmarco mAP 55.65 # 7
Text Reranking MTEB SGPT-5.8B-msmarco mAP 56.65 # 4
Text Reranking MTEB SGPT-125M-msmarco mAP 50.58 # 17
Text Reranking MTEB SGPT-5.8B-nli mAP 52.33 # 15
Text Reranking MTEB SGPT-125M-nli mAP 47.56 # 21
Text Reranking MTEB Ada Similarity mAP 49.02 # 18
Text Reranking MTEB MPNet-multilingual mAP 53.8 # 12
Text Reranking MTEB MPNet mAP 59.36 # 1
Text Reranking MTEB MiniLM-L12-multilingual mAP 53.62 # 13
Text Reranking MTEB MiniLM-L12 mAP 58.44 # 2
Text Reranking MTEB MiniLM-L6 mAP 58.04 # 3
Text Reranking MTEB LASER2 mAP 41.44 # 27
Text Reranking MTEB LaBSE mAP 48.42 # 19
Text Reranking MTEB SPECTER mAP 48.1 # 20
Text Reranking MTEB Contriever mAP 53.09 # 14
Text Reranking MTEB coCondenser-msmarco mAP 51.84 # 16
Text Reranking MTEB SimCSE-BERT-sup mAP 47.54 # 22
Text Reranking MTEB SimCSE-BERT-unsup mAP 46.47 # 23
Text Reranking MTEB BERT mAP 43.44 # 25
Text Reranking MTEB Komninos mAP 44.75 # 24
Text Reranking MTEB Glove mAP 43.29 # 26
Text Pair Classification MTEB ST5-XXL AP 85.06 # 5
Text Pair Classification MTEB ST5-Large AP 84.97 # 6
Text Pair Classification MTEB ST5-Base AP 85.17 # 4
Text Pair Classification MTEB GTR-Large AP 85.33 # 3
Text Pair Classification MTEB GTR-Base AP 83.85 # 7
Text Pair Classification MTEB SGPT-BLOOM-7.1B-msmarco AP 81.9 # 12
Text Pair Classification MTEB SGPT-5.8B-msmarco AP 82 # 11
Text Pair Classification MTEB SGPT-2.7B-msmarco AP 80.65 # 15
Text Pair Classification MTEB SGPT-1.3B-msmarco AP 79.58 # 16
Text Pair Classification MTEB SGPT-125M-msmarco AP 75.23 # 21
Text Pair Classification MTEB SGPT-5.8B-nli AP 77.03 # 19
Text Pair Classification MTEB SGPT-125M-nli AP 71.78 # 24
Text Pair Classification MTEB Ada Similarity AP 76.86 # 20
Text Pair Classification MTEB MPNet-multilingual AP 80.81 # 14
Text Pair Classification MTEB MPNet AP 83.04 # 8
Text Pair Classification MTEB MiniLM-L12-multilingual AP 78.45 # 18
Text Pair Classification MTEB MiniLM-L6 AP 82.41 # 10
Text Pair Classification MTEB LASER2 AP 68.86 # 27
Text Pair Classification MTEB LaBSE AP 78.87 # 17
Text Pair Classification MTEB SPECTER AP 61.37 # 28
Text Pair Classification MTEB Contriever AP 82.53 # 9
Text Pair Classification MTEB coCondenser-msmarco AP 81.74 # 13
Text Pair Classification MTEB SimCSE-BERT-sup AP 73.68 # 22
Text Clustering MTEB GTR-XXL V-Measure 42.42 # 3
Text Clustering MTEB ST5-Large V-Measure 41.65 # 7
Text Classification MTEB GTR-Large Accuracy 67.14 # 11
Text Pair Classification MTEB SimCSE-BERT-unsup AP 70.33 # 26
Text Pair Classification MTEB BERT AP 56.33 # 29
Text Pair Classification MTEB Komninos AP 72.94 # 23
Text Pair Classification MTEB Glove AP 70.92 # 25
Text Clustering MTEB ST5-XL V-Measure 42.34 # 5
Text Clustering MTEB ST5-Base V-Measure 40.21 # 12
Text Clustering MTEB GTR-XL V-Measure 41.51 # 9
Text Classification MTEB GTR-XL Accuracy 67.11 # 13
Text Clustering MTEB GTR-Large V-Measure 41.6 # 8
Text Clustering MTEB GTR-Base V-Measure 38.63 # 16
Text Clustering MTEB SGPT-BLOOM-7.1B-msmarco V-Measure 38.93 # 15
Text Clustering MTEB SGPT-5.8B-msmarco V-Measure 40.35 # 11
Text Clustering MTEB SGPT-2.7B-msmarco V-Measure 39.83 # 14
Text Clustering MTEB SGPT-1.3B-msmarco V-Measure 39.92 # 13