MTEB: Massive Text Embedding Benchmark

13 Oct 2022  ·  Niklas Muennighoff, Nouamane Tazi, Loïc Magne, Nils Reimers

Text embeddings are commonly evaluated on a small set of datasets from a single task not covering their possible applications to other tasks. It is unclear whether state-of-the-art embeddings on semantic textual similarity (STS) can be equally well applied to other tasks like clustering or reranking. This makes progress in the field difficult to track, as various models are constantly being proposed without proper evaluation. To solve this problem, we introduce the Massive Text Embedding Benchmark (MTEB). MTEB spans 8 embedding tasks covering a total of 58 datasets and 112 languages. Through the benchmarking of 33 models on MTEB, we establish the most comprehensive benchmark of text embeddings to date. We find that no particular text embedding method dominates across all tasks. This suggests that the field has yet to converge on a universal text embedding method and scale it up sufficiently to provide state-of-the-art results on all embedding tasks. MTEB comes with open-source code and a public leaderboard at https://github.com/embeddings-benchmark/mteb.


Results from the Paper


Task Dataset Model Metric Name Metric Value Global Rank
Text Classification MTEB Glove Accuracy 57.29 # 29
Text Classification MTEB Komninos Accuracy 57.65 # 28
Text Classification MTEB BERT Accuracy 61.66 # 25
Text Classification MTEB SimCSE-BERT-unsup Accuracy 62.5 # 24
Text Classification MTEB SimCSE-BERT-sup Accuracy 67.32 # 10
Text Classification MTEB coCondenser-msmarco Accuracy 64.71 # 19
Text Classification MTEB Contriever Accuracy 66.68 # 14
Text Classification MTEB SPECTER Accuracy 52.37 # 31
Text Classification MTEB LaBSE Accuracy 62.71 # 23
Text Classification MTEB LASER2 Accuracy 53.65 # 30
Text Classification MTEB MiniLM-L6 Accuracy 63.06 # 22
Text Classification MTEB MiniLM-L12 Accuracy 63.21 # 21
Text Classification MTEB MiniLM-L12-multilingual Accuracy 64.3 # 20
Text Classification MTEB MPNet Accuracy 65.07 # 18
Text Classification MTEB MPNet-multilingual Accuracy 67.91 # 8
Text Classification MTEB Ada Similarity Accuracy 70.44 # 4
Text Classification MTEB SGPT-125M-nli Accuracy 61.46 # 26
Text Classification MTEB SGPT-5.8B-nli Accuracy 70.14 # 5
Text Classification MTEB SGPT-125M-msmarco Accuracy 60.72 # 27
Text Classification MTEB SGPT-1.3B-msmarco Accuracy 66.52 # 15
Text Classification MTEB SGPT-2.7B-msmarco Accuracy 67.13 # 12
Text Classification MTEB SGPT-5.8B-msmarco Accuracy 68.13 # 7
Text Classification MTEB SGPT-BLOOM-7.1B-msmarco Accuracy 66.19 # 16
Text Classification MTEB GTR-Base Accuracy 65.25 # 17
Text Classification MTEB GTR-Large Accuracy 67.14 # 11
Text Classification MTEB GTR-XL Accuracy 67.11 # 13
Text Classification MTEB GTR-XXL Accuracy 67.41 # 9
Text Classification MTEB ST5-Base Accuracy 69.81 # 6
Text Classification MTEB ST5-Large Accuracy 72.31 # 3
Text Classification MTEB ST5-XL Accuracy 72.84 # 2

Text Clustering MTEB Glove V-Measure 27.73 # 29
Text Clustering MTEB Komninos V-Measure 26.57 # 30
Text Clustering MTEB BERT V-Measure 30.12 # 26
Text Clustering MTEB SimCSE-BERT-unsup V-Measure 29.04 # 28
Text Clustering MTEB SimCSE-BERT-sup V-Measure 33.43 # 24
Text Clustering MTEB coCondenser-msmarco V-Measure 37.64 # 18
Text Clustering MTEB Contriever V-Measure 41.1 # 10
Text Clustering MTEB SPECTER V-Measure 34.06 # 23
Text Clustering MTEB LaBSE V-Measure 29.55 # 27
Text Clustering MTEB LASER2 V-Measure 15.28 # 31
Text Clustering MTEB MiniLM-L6 V-Measure 42.35 # 4
Text Clustering MTEB MiniLM-L12 V-Measure 41.81 # 6
Text Clustering MTEB MiniLM-L12-multilingual V-Measure 37.14 # 20
Text Clustering MTEB MPNet V-Measure 43.69 # 2
Text Clustering MTEB MPNet-multilingual V-Measure 38.4 # 17
Text Clustering MTEB Ada Similarity V-Measure 37.52 # 19
Text Clustering MTEB SGPT-125M-nli V-Measure 30.95 # 25
Text Clustering MTEB SGPT-5.8B-nli V-Measure 36.98 # 21
Text Clustering MTEB SGPT-125M-msmarco V-Measure 35.79 # 22
Text Clustering MTEB SGPT-1.3B-msmarco V-Measure 39.92 # 13
Text Clustering MTEB SGPT-2.7B-msmarco V-Measure 39.83 # 14
Text Clustering MTEB SGPT-5.8B-msmarco V-Measure 40.35 # 11
Text Clustering MTEB SGPT-BLOOM-7.1B-msmarco V-Measure 38.93 # 15
Text Clustering MTEB GTR-Base V-Measure 38.63 # 16
Text Clustering MTEB GTR-Large V-Measure 41.6 # 8
Text Clustering MTEB GTR-XL V-Measure 41.51 # 9
Text Clustering MTEB GTR-XXL V-Measure 42.42 # 3
Text Clustering MTEB ST5-Base V-Measure 40.21 # 12
Text Clustering MTEB ST5-Large V-Measure 41.65 # 7
Text Clustering MTEB ST5-XL V-Measure 42.34 # 5

Text Pair Classification MTEB Glove AP 70.92 # 25
Text Pair Classification MTEB Komninos AP 72.94 # 23
Text Pair Classification MTEB BERT AP 56.33 # 29
Text Pair Classification MTEB SimCSE-BERT-unsup AP 70.33 # 26
Text Pair Classification MTEB SimCSE-BERT-sup AP 73.68 # 22
Text Pair Classification MTEB coCondenser-msmarco AP 81.74 # 13
Text Pair Classification MTEB Contriever AP 82.53 # 9
Text Pair Classification MTEB SPECTER AP 61.37 # 28
Text Pair Classification MTEB LaBSE AP 78.87 # 17
Text Pair Classification MTEB LASER2 AP 68.86 # 27
Text Pair Classification MTEB MiniLM-L6 AP 82.41 # 10
Text Pair Classification MTEB MiniLM-L12-multilingual AP 78.45 # 18
Text Pair Classification MTEB MPNet AP 83.04 # 8
Text Pair Classification MTEB MPNet-multilingual AP 80.81 # 14
Text Pair Classification MTEB Ada Similarity AP 76.86 # 20
Text Pair Classification MTEB SGPT-125M-nli AP 71.78 # 24
Text Pair Classification MTEB SGPT-5.8B-nli AP 77.03 # 19
Text Pair Classification MTEB SGPT-125M-msmarco AP 75.23 # 21
Text Pair Classification MTEB SGPT-1.3B-msmarco AP 79.58 # 16
Text Pair Classification MTEB SGPT-2.7B-msmarco AP 80.65 # 15
Text Pair Classification MTEB SGPT-5.8B-msmarco AP 82 # 11
Text Pair Classification MTEB SGPT-BLOOM-7.1B-msmarco AP 81.9 # 12
Text Pair Classification MTEB GTR-Base AP 83.85 # 7
Text Pair Classification MTEB GTR-Large AP 85.33 # 3
Text Pair Classification MTEB GTR-XL AP 86.13 # 1
Text Pair Classification MTEB ST5-Base AP 85.17 # 4
Text Pair Classification MTEB ST5-Large AP 84.97 # 6
Text Pair Classification MTEB ST5-XL AP 86.06 # 2
Text Pair Classification MTEB ST5-XXL AP 85.06 # 5

Text Reranking MTEB Glove mAP 43.29 # 26
Text Reranking MTEB Komninos mAP 44.75 # 24
Text Reranking MTEB BERT mAP 43.44 # 25
Text Reranking MTEB SimCSE-BERT-unsup mAP 46.47 # 23
Text Reranking MTEB SimCSE-BERT-sup mAP 47.54 # 22
Text Reranking MTEB coCondenser-msmarco mAP 51.84 # 16
Text Reranking MTEB Contriever mAP 53.09 # 14
Text Reranking MTEB SPECTER mAP 48.1 # 20
Text Reranking MTEB LaBSE mAP 48.42 # 19
Text Reranking MTEB LASER2 mAP 41.44 # 27
Text Reranking MTEB MiniLM-L6 mAP 58.04 # 3
Text Reranking MTEB MiniLM-L12 mAP 58.44 # 2
Text Reranking MTEB MiniLM-L12-multilingual mAP 53.62 # 13
Text Reranking MTEB MPNet mAP 59.36 # 1
Text Reranking MTEB MPNet-multilingual mAP 53.8 # 12
Text Reranking MTEB Ada Similarity mAP 49.02 # 18
Text Reranking MTEB SGPT-125M-nli mAP 47.56 # 21
Text Reranking MTEB SGPT-5.8B-nli mAP 52.33 # 15
Text Reranking MTEB SGPT-125M-msmarco mAP 50.58 # 17
Text Reranking MTEB SGPT-5.8B-msmarco mAP 56.65 # 4
Text Reranking MTEB SGPT-BLOOM-7.1B-msmarco mAP 55.65 # 7
Text Reranking MTEB GTR-Base mAP 54.23 # 10
Text Reranking MTEB GTR-Large mAP 55.36 # 8
Text Reranking MTEB GTR-XL mAP 55.96 # 6
Text Reranking MTEB ST5-Large mAP 54 # 11
Text Reranking MTEB ST5-XL mAP 54.71 # 9
Text Reranking MTEB ST5-XXL mAP 56.43 # 5

Text Retrieval MTEB Glove nDCG@10 21.62 # 23
Text Retrieval MTEB Komninos nDCG@10 21.22 # 24
Text Retrieval MTEB BERT nDCG@10 10.59 # 29
Text Retrieval MTEB SimCSE-BERT-unsup nDCG@10 20.29 # 26
Text Retrieval MTEB SimCSE-BERT-sup nDCG@10 21.82 # 22
Text Retrieval MTEB coCondenser-msmarco nDCG@10 32.96 # 19
Text Retrieval MTEB Contriever nDCG@10 41.88 # 13
Text Retrieval MTEB SPECTER nDCG@10 15.88 # 28
Text Retrieval MTEB LaBSE nDCG@10 18.99 # 27
Text Retrieval MTEB LASER2 nDCG@10 7.93 # 30
Text Retrieval MTEB MiniLM-L6 nDCG@10 41.95 # 12
Text Retrieval MTEB MiniLM-L12 nDCG@10 42.69 # 10
Text Retrieval MTEB MiniLM-L12-multilingual nDCG@10 32.45 # 20
Text Retrieval MTEB MPNet nDCG@10 43.81 # 9
Text Retrieval MTEB MPNet-multilingual nDCG@10 35.34 # 17
Text Retrieval MTEB SGPT-125M-nli nDCG@10 20.9 # 25
Text Retrieval MTEB SGPT-5.8B-nli nDCG@10 32.34 # 21
Text Retrieval MTEB SGPT-125M-msmarco nDCG@10 37.04 # 15
Text Retrieval MTEB SGPT-1.3B-msmarco nDCG@10 44.49 # 8
Text Retrieval MTEB SGPT-2.7B-msmarco nDCG@10 46.54 # 6
Text Retrieval MTEB SGPT-5.8B-msmarco nDCG@10 50.25 # 1
Text Retrieval MTEB SGPT-BLOOM-7.1B-msmarco nDCG@10 48.21 # 3
Text Retrieval MTEB GTR-Base nDCG@10 44.67 # 7
Text Retrieval MTEB GTR-Large nDCG@10 47.42 # 5
Text Retrieval MTEB GTR-XL nDCG@10 47.96 # 4
Text Retrieval MTEB GTR-XXL nDCG@10 48.48 # 2
Text Retrieval MTEB ST5-Base nDCG@10 33.63 # 18
Text Retrieval MTEB ST5-Large nDCG@10 36.71 # 16
Text Retrieval MTEB ST5-XL nDCG@10 38.47 # 14
Text Retrieval MTEB ST5-XXL nDCG@10 42.24 # 11

Semantic Textual Similarity MTEB Glove Spearman Correlation 61.85 # 26
Semantic Textual Similarity MTEB Komninos Spearman Correlation 62.47 # 25
Semantic Textual Similarity MTEB BERT Spearman Correlation 54.36 # 29
Semantic Textual Similarity MTEB SimCSE-BERT-unsup Spearman Correlation 74.33 # 22
Semantic Textual Similarity MTEB SimCSE-BERT-sup Spearman Correlation 79.12 # 9
Semantic Textual Similarity MTEB coCondenser-msmarco Spearman Correlation 76.47 # 19
Semantic Textual Similarity MTEB SPECTER Spearman Correlation 61.02 # 27
Semantic Textual Similarity MTEB LaBSE Spearman Correlation 70.8 # 24
Semantic Textual Similarity MTEB LASER2 Spearman Correlation 55.32

Text Summarization MTEB Glove Spearman Correlation 28.87 # 17
Text Summarization MTEB Komninos Spearman Correlation 30.49 # 7
Text Summarization MTEB BERT Spearman Correlation 29.82 # 13
Text Summarization MTEB SimCSE-BERT-unsup Spearman Correlation 31.15 # 3
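The Text Retrieval rows above report nDCG@10: discounted cumulative gain over the top 10 ranked documents, normalized by the gain of an ideal ranking. A minimal sketch of one common (linear-gain) formulation, with invented relevance grades for a single query:

```python
import math

def dcg_at_k(relevances, k):
    # relevances: graded relevance of the ranked results, best-ranked first.
    # Each result's gain is discounted by log2 of its 1-based rank + 1.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    # Normalize by the DCG of the ideal (relevance-sorted) ranking.
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0
    return dcg_at_k(relevances, k) / ideal_dcg

# Hypothetical relevance grades of the documents a model ranked 1st..4th:
# it placed a grade-0 document above a grade-1 one, so nDCG@10 < 1.
ranked_rels = [3, 2, 0, 1]
print(ndcg_at_k(ranked_rels, 10))
```

In the benchmark setting this per-query score is averaged over all queries of a dataset, and the table reports it scaled to 0–100.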