TASK
DATASET
MODEL
METRIC NAME
METRIC VALUE
GLOBAL RANK
Knowledge Probing
AminoProbe
GAL 120B (zero-shot)
Accuracy
21
# 1
Knowledge Probing
AminoProbe
OPT (zero-shot)
Accuracy
12
# 7
Knowledge Probing
AminoProbe
GAL 30B (zero-shot)
Accuracy
21
# 1
Knowledge Probing
AminoProbe
BLOOM (zero-shot)
Accuracy
14
# 5
Knowledge Probing
AminoProbe
GPT-3 (text-davinci-002) (zero-shot)
Accuracy
14
# 5
Knowledge Probing
AminoProbe
GAL 125M (zero-shot)
Accuracy
12
# 7
Knowledge Probing
AminoProbe
GAL 6.7B (zero-shot)
Accuracy
17
# 3
Knowledge Probing
AminoProbe
GAL 1.3B (zero-shot)
Accuracy
16
# 4
Common Sense Reasoning
ARC (Challenge)
BLOOM (few-shot, k=5)
Accuracy
32.9
# 44
Common Sense Reasoning
ARC (Challenge)
OPT (few-shot, k=5)
Accuracy
31.1
# 46
Common Sense Reasoning
ARC (Challenge)
GPT-3 (zero-shot)
Accuracy
51.4
# 27
Common Sense Reasoning
ARC (Challenge)
GAL 120B (zero-shot)
Accuracy
67.9
# 15
Common Sense Reasoning
ARC (Easy)
OPT (5-shot)
Accuracy
37.4
# 41
Common Sense Reasoning
ARC (Easy)
GAL 120B (0-shot)
Accuracy
83.8
# 7
Common Sense Reasoning
ARC (Easy)
GPT-3 (zero-shot)
Accuracy
68.8
# 33
Common Sense Reasoning
ARC (Easy)
BLOOM (5-shot)
Accuracy
40.7
# 39
Molecular Property Prediction
BACE
GAL 1.3B
ROC-AUC
57.6
# 15
Molecular Property Prediction
BACE
GAL 30B
ROC-AUC
72.7
# 12
Molecular Property Prediction
BACE
GAL 125M
ROC-AUC
56.1
# 16
Molecular Property Prediction
BACE
GAL 6.7B
ROC-AUC
58.4
# 14
Molecular Property Prediction
BACE
GAL 120B
ROC-AUC
61.7
# 13
Molecular Property Prediction
BBBP
GAL 1.3B
ROC-AUC
60.4
# 14
Molecular Property Prediction
BBBP
Uni-Mol
ROC-AUC
72.9
# 3
Molecular Property Prediction
BBBP
GAL 30B
ROC-AUC
59.6
# 15
Molecular Property Prediction
BBBP
GAL 120B
ROC-AUC
66.1
# 13
Molecular Property Prediction
BBBP
GAL 6.7B
ROC-AUC
53.5
# 16
Molecular Property Prediction
BBBP
GAL 125M
ROC-AUC
39.3
# 17
Word Sense Disambiguation
BIG-bench (Anachronisms)
OPT 175B
Accuracy
49.1
# 3
Word Sense Disambiguation
BIG-bench (Anachronisms)
GAL 120B (few-shot, k=5)
Accuracy
48.7
# 4
Word Sense Disambiguation
BIG-bench (Anachronisms)
GAL 30B (few-shot, k=5)
Accuracy
47.0
# 5
Word Sense Disambiguation
BIG-bench (Anachronisms)
BLOOM 176B
Accuracy
1.3
# 6
Question Answering
BioASQ
BLOOM (zero-shot)
Accuracy
91.4
# 3
Question Answering
BioASQ
GAL 120B (zero-shot)
Accuracy
94.3
# 2
Question Answering
BioASQ
OPT (zero-shot)
Accuracy
81.4
# 6
Knowledge Probing
BioLAMA
GAL 125M
Accuracy
3.1
# 8
Knowledge Probing
BioLAMA
GPT-3 (text-davinci-002) (zero-shot)
Accuracy
8.4
# 2
Knowledge Probing
BioLAMA
BLOOM (zero-shot)
Accuracy
9.7
# 1
Knowledge Probing
BioLAMA
GAL 1.3B
Accuracy
7.2
# 5
Knowledge Probing
BioLAMA
OPT (zero-shot)
Accuracy
7.1
# 6
Knowledge Probing
BioLAMA
GAL 30B
Accuracy
6.9
# 7
Knowledge Probing
BioLAMA
GAL 6.7B
Accuracy
7.9
# 4
Knowledge Probing
BioLAMA
GAL 120B
Accuracy
8.0
# 3
Protein Structure Prediction
CASPSeq
GAL 125M
Validation perplexity
20.62
# 5
Protein Structure Prediction
CASPSeq
GAL 120B
Validation perplexity
17.26
# 1
Protein Structure Prediction
CASPSeq
GAL 6.7B
Validation perplexity
17.29
# 3
Protein Structure Prediction
CASPSeq
GAL 30B
Validation perplexity
17.27
# 2
Protein Structure Prediction
CASPSeq
GAL 1.3B
Validation perplexity
17.58
# 4
Protein Structure Prediction
CASPSimSeq
GAL 30B
Validation perplexity
15.42
# 2
Protein Structure Prediction
CASPSimSeq
GAL 1.3B
Validation perplexity
17.04
# 4
Protein Function Prediction
CASPSimSeq
GAL 120B
ROUGE-L
0.252
# 1
Protein Structure Prediction
CASPSimSeq
GAL 125M
Validation perplexity
19.18
# 5
Protein Annotation
CASPSimSeq
GAL 125M
F1 score
0.105
# 5
Protein Function Prediction
CASPSimSeq
GAL 1.3B
ROUGE-L
0.069
# 4
Protein Function Prediction
CASPSimSeq
GAL 125M
ROUGE-L
0.062
# 5
Protein Function Prediction
CASPSimSeq
GAL 30B
ROUGE-L
0.137
# 2
Protein Function Prediction
CASPSimSeq
GAL 6.7B
ROUGE-L
0.109
# 3
Protein Structure Prediction
CASPSimSeq
GAL 6.7B
Validation perplexity
16.35
# 3
Protein Structure Prediction
CASPSimSeq
GAL 120B
Validation perplexity
12.77
# 1
Protein Annotation
CASPSimSeq
GAL 1.3B
F1 score
0.174
# 4
Protein Annotation
CASPSimSeq
GAL 6.7B
F1 score
0.184
# 3
Protein Annotation
CASPSimSeq
GAL 120B
F1 score
0.219
# 2
Protein Annotation
CASPSimSeq
GAL 30B
F1 score
0.22
# 1
Knowledge Probing
Chemical Reactions
OPT (zero-shot)
Accuracy
12.7
# 7
Knowledge Probing
Chemical Reactions
GAL 125M
Accuracy
0.3
# 8
Knowledge Probing
Chemical Reactions
GPT-3 (text-davinci-002) (zero-shot)
Accuracy
35.1
# 3
Knowledge Probing
Chemical Reactions
BLOOM (zero-shot)
Accuracy
22.4
# 5
Knowledge Probing
Chemical Reactions
GAL 1.3B
Accuracy
14.4
# 6
Knowledge Probing
Chemical Reactions
GAL 6.7B
Accuracy
26.4
# 4
Knowledge Probing
Chemical Reactions
GAL 120B
Accuracy
43.1
# 1
Knowledge Probing
Chemical Reactions
GAL 30B
Accuracy
36.5
# 2
Molecular Property Prediction
ClinTox
GAL 30B
ROC-AUC
82.2
# 9
Molecular Property Prediction
ClinTox
GAL 30B
Molecules (M)
2
# 6
Molecular Property Prediction
ClinTox
GAL 125M
ROC-AUC
51.8
# 17
Molecular Property Prediction
ClinTox
GAL 125M
Molecules (M)
2
# 6
Molecular Property Prediction
ClinTox
GAL 6.7B
ROC-AUC
78.4
# 11
Molecular Property Prediction
ClinTox
GAL 6.7B
Molecules (M)
2
# 6
Molecular Property Prediction
ClinTox
GAL 1.3B
ROC-AUC
58.9
# 15
Molecular Property Prediction
ClinTox
GAL 1.3B
Molecules (M)
2
# 6
Molecular Property Prediction
ClinTox
GAL 120B
ROC-AUC
82.6
# 8
Molecular Property Prediction
ClinTox
GAL 120B
Molecules (M)
2
# 6
Citation Prediction
Contextual Citations
GAL 125M
Accuracy
7.1
# 6
Citation Prediction
Contextual Citations
Dense Retriever (fine-tuned)
Accuracy
8.2
# 5
Citation Prediction
Contextual Citations
GAL 1.3B
Accuracy
15.9
# 4
Citation Prediction
Contextual Citations
GAL 6.7B
Accuracy
23
# 3
Citation Prediction
Contextual Citations
GAL 30B
Accuracy
31.5
# 2
Citation Prediction
Contextual Citations
GAL 120B
Accuracy
36.6
# 1
Citation Prediction
Contextual Citations
Sparse Retriever
Accuracy
5.3
# 7
Citation Prediction
Contextual Citations
Dense Retriever
Accuracy
1.6
# 8
Stereotypical Bias Analysis
CrowS-Pairs
GAL 120B
Gender
51.9
# 1
Stereotypical Bias Analysis
CrowS-Pairs
GAL 120B
Religion
51.9
# 1
Stereotypical Bias Analysis
CrowS-Pairs
GAL 120B
Race/Color
59.9
# 2
Stereotypical Bias Analysis
CrowS-Pairs
GAL 120B
Sexual Orientation
77.4
# 2
Stereotypical Bias Analysis
CrowS-Pairs
GAL 120B
Age
69
# 3
Stereotypical Bias Analysis
CrowS-Pairs
GAL 120B
Nationality
51.6
# 1
Stereotypical Bias Analysis
CrowS-Pairs
GAL 120B
Disability
66.7
# 1
Stereotypical Bias Analysis
CrowS-Pairs
GAL 120B
Physical Appearance
58.7
# 1
Stereotypical Bias Analysis
CrowS-Pairs
GAL 120B
Socioeconomic status
65.7
# 1
Stereotypical Bias Analysis
CrowS-Pairs
GAL 120B
Overall
60.5
# 4
Citation Prediction
Extended Citations
Dense Retriever (fine-tuned)
Accuracy
11.8
# 6
Citation Prediction
Extended Citations
GAL 1.3B
Accuracy
45.5
# 4
Citation Prediction
Extended Citations
GAL 6.7B
Accuracy
60
# 3
Citation Prediction
Extended Citations
GAL 30B
Accuracy
66.4
# 2
Citation Prediction
Extended Citations
GAL 125M
Accuracy
6.4
# 8
Citation Prediction
Extended Citations
GAL 120B
Accuracy
69.1
# 1
Citation Prediction
Extended Citations
Sparse Retriever
Accuracy
17.3
# 5
Citation Prediction
Extended Citations
Dense Retriever
Accuracy
8.8
# 7
Knowledge Probing
Galaxy Clusters
GAL 125M
Accuracy
6.7
# 8
Knowledge Probing
Galaxy Clusters
GAL 120B
Accuracy
24.2
# 1
Knowledge Probing
Galaxy Clusters
GAL 6.7B
Accuracy
17.5
# 5
Knowledge Probing
Galaxy Clusters
OPT (zero-shot)
Accuracy
21.7
# 2
Knowledge Probing
Galaxy Clusters
GAL 1.3B
Accuracy
14.2
# 7
Knowledge Probing
Galaxy Clusters
BLOOM (zero-shot)
Accuracy
15
# 6
Knowledge Probing
Galaxy Clusters
GPT-3 (text-davinci-002) (zero-shot)
Accuracy
20.8
# 3
Knowledge Probing
Galaxy Clusters
GAL 30B
Accuracy
20
# 4
Molecular Property Prediction
HIV dataset
GAL 30B
AUC
0.759
# 5
Molecular Property Prediction
HIV dataset
Uni-Mol
AUC
0.808
# 2
Molecular Property Prediction
HIV dataset
GAL 120B
AUC
0.745
# 6
Molecular Property Prediction
HIV dataset
GAL 6.7B
AUC
0.722
# 8
Molecular Property Prediction
HIV dataset
GAL 1.3B
AUC
0.724
# 7
Molecular Property Prediction
HIV dataset
GAL 125M
AUC
0.702
# 9
IUPAC Name Prediction
IUPAC
GAL 125M
Accuracy
0
# 5
IUPAC Name Prediction
IUPAC
GAL 6.7B
Accuracy
10.7
# 3
IUPAC Name Prediction
IUPAC
GAL 30B
Accuracy
15.4
# 2
IUPAC Name Prediction
IUPAC
GAL 1.3B
Accuracy
2.5
# 4
IUPAC Name Prediction
IUPAC
GAL 120B
Accuracy
39.2
# 1
Knowledge Probing
Latex Equations
GAL 120B (zero-shot)
Accuracy
68.2
# 1
Knowledge Probing
Latex Equations
OPT (zero-shot)
Accuracy
8.9
# 7
Knowledge Probing
Latex Equations
BLOOM (zero-shot)
Accuracy
21.4
# 5
Knowledge Probing
Latex Equations
GPT-3 (text-davinci-002) (zero-shot)
Accuracy
49
# 3
Knowledge Probing
Latex Equations
GAL 125M (zero-shot)
Accuracy
0.5
# 8
Knowledge Probing
Latex Equations
GAL 1.3B (zero-shot)
Accuracy
20.5
# 6
Knowledge Probing
Latex Equations
GAL 6.7B (zero-shot)
Accuracy
41.7
# 4
Knowledge Probing
Latex Equations
GAL 30B (zero-shot)
Accuracy
51.5
# 2
Math Word Problem Solving
MATH
GAL 120B <work>
Accuracy
16.6
# 81
Math Word Problem Solving
MATH
GAL 120B <work>
Parameters (Billions)
120
# 8
Math Word Problem Solving
MATH
PaLM 540B (5-shot) mCoT
Accuracy
8.8
# 91
Math Word Problem Solving
MATH
PaLM 540B (5-shot) mCoT
Parameters (Billions)
540
# 1
Math Word Problem Solving
MATH
GAL 30B <work>
Accuracy
11.4
# 88
Math Word Problem Solving
MATH
GAL 30B <work>
Parameters (Billions)
30
# 36
Math Word Problem Solving
MATH
GPT-3 175B (8-shot)
Accuracy
5.2
# 102
Math Word Problem Solving
MATH
GPT-3 175B (8-shot)
Parameters (Billions)
175
# 5
Math Word Problem Solving
MATH
GAL 30B (5-shot) mCoT
Accuracy
12.7
# 86
Math Word Problem Solving
MATH
GAL 30B (5-shot) mCoT
Parameters (Billions)
30
# 36
Math Word Problem Solving
MATH
Minerva 540B (5-shot) mCoT
Accuracy
33.6
# 56
Math Word Problem Solving
MATH
Minerva 540B (5-shot) mCoT
Parameters (Billions)
540
# 1
Math Word Problem Solving
MATH
GAL 120B (5-shot) mCoT
Accuracy
20.4
# 77
Math Word Problem Solving
MATH
GAL 120B (5-shot) mCoT
Parameters (Billions)
120
# 8
Multiple Choice Question Answering (MCQA)
MedMCQA
GAL 120B (zero-shot)
Dev Set (Acc-%)
52.9
# 8
Multiple Choice Question Answering (MCQA)
MedMCQA
BLOOM (few-shot, k=5)
Dev Set (Acc-%)
32.5
# 16
Multiple Choice Question Answering (MCQA)
MedMCQA
OPT (few-shot, k=5)
Dev Set (Acc-%)
29.6
# 17
Question Answering
MedQA
GAL 120B (zero-shot)
Accuracy
44.4
# 15
Question Answering
MedQA
BLOOM (few-shot, k=5)
Accuracy
23.3
# 21
Question Answering
MedQA
OPT (few-shot, k=5)
Accuracy
22.8
# 22
Knowledge Probing
Mineral Groups
GAL 1.3B
Accuracy
10.3
# 4
Knowledge Probing
Mineral Groups
OPT (zero-shot)
Accuracy
1.6
# 7
Knowledge Probing
Mineral Groups
GAL 6.7B
Accuracy
8.7
# 6
Knowledge Probing
Mineral Groups
GAL 30B
Accuracy
17.5
# 3
Knowledge Probing
Mineral Groups
BLOOM (zero-shot)
Accuracy
10.3
# 4
Knowledge Probing
Mineral Groups
GPT-3 (text-davinci-002) (zero-shot)
Accuracy
18.3
# 2
Knowledge Probing
Mineral Groups
GAL 125M
Accuracy
0.0
# 8
Knowledge Probing
Mineral Groups
GAL 120B
Accuracy
29.4
# 1
Multi-task Language Understanding
MMLU
GAL 120B (zero-shot)
Average (%)
52.6
# 62
Multiple Choice Question Answering (MCQA)
MMLU (Abstract Algebra)
Gopher (few-shot, k=5)
Accuracy
25
# 4
Multiple Choice Question Answering (MCQA)
MMLU (Abstract Algebra)
Chinchilla (few-shot, k=5)
Accuracy
31
# 2
Multiple Choice Question Answering (MCQA)
MMLU (Abstract Algebra)
OPT (few-shot, k=5)
Accuracy
21
# 5
Multiple Choice Question Answering (MCQA)
MMLU (Abstract Algebra)
GAL 120B (zero-shot)
Accuracy
27
# 3
Multiple Choice Question Answering (MCQA)
MMLU (Abstract Algebra)
GAL 30B (zero-shot)
Accuracy
33.3
# 1
Multiple Choice Question Answering (MCQA)
MMLU (Astronomy)
Gopher (few-shot, k=5)
Accuracy
65.8
# 2
Multiple Choice Question Answering (MCQA)
MMLU (Astronomy)
OPT (few-shot, k=5)
Accuracy
23.0
# 5
Multiple Choice Question Answering (MCQA)
MMLU (Astronomy)
BLOOM (few-shot, k=5)
Accuracy
25.7
# 4
Multiple Choice Question Answering (MCQA)
MMLU (Astronomy)
Chinchilla (few-shot, k=5)
Accuracy
73.0
# 1
Multiple Choice Question Answering (MCQA)
MMLU (Astronomy)
GAL 120B (zero-shot)
Accuracy
65.1
# 3
Multiple Choice Question Answering (MCQA)
MMLU (College Biology)
BLOOM (few-shot, k=5)
Accuracy
28.5
# 8
Multiple Choice Question Answering (MCQA)
MMLU (College Biology)
Gopher (few-shot, k=5)
Accuracy
70.8
# 5
Multiple Choice Question Answering (MCQA)
MMLU (College Biology)
GAL 120B (zero-shot)
Accuracy
68.8
# 6
Multiple Choice Question Answering (MCQA)
MMLU (College Biology)
Chinchilla (few-shot, k=5)
Accuracy
79.9
# 4
Multiple Choice Question Answering (MCQA)
MMLU (College Biology)
OPT (few-shot, k=5)
Accuracy
30.6
# 7
Multiple Choice Question Answering (MCQA)
MMLU (College Chemistry)
Chinchilla (few-shot, k=5)
Accuracy
51
# 1
Multiple Choice Question Answering (MCQA)
MMLU (College Chemistry)
Gopher (few-shot, k=5)
Accuracy
45
# 3
Multiple Choice Question Answering (MCQA)
MMLU (College Chemistry)
BLOOM (few-shot, k=5)
Accuracy
19
# 5
Multiple Choice Question Answering (MCQA)
MMLU (College Chemistry)
GAL 120B (zero-shot)
Accuracy
46
# 2
Multiple Choice Question Answering (MCQA)
MMLU (College Chemistry)
OPT (few-shot, k=5)
Accuracy
30
# 4
Multiple Choice Question Answering (MCQA)
MMLU (College Computer Science)
OPT (few-shot, k=5)
Accuracy
17.0
# 3
Multiple Choice Question Answering (MCQA)
MMLU (College Computer Science)
BLOOM (few-shot, k=5)
Accuracy
6.0
# 4
Multiple Choice Question Answering (MCQA)
MMLU (College Computer Science)
Chinchilla (few-shot, k=5)
Accuracy
51.0
# 1
Multiple Choice Question Answering (MCQA)
MMLU (College Computer Science)
GAL 120B (zero-shot)
Accuracy
49
# 2
Multiple Choice Question Answering (MCQA)
MMLU (College Mathematics)
OPT (few-shot, k=5)
Accuracy
33
# 3
Multiple Choice Question Answering (MCQA)
MMLU (College Mathematics)
BLOOM (few-shot, k=5)
Accuracy
25
# 5
Multiple Choice Question Answering (MCQA)
MMLU (College Mathematics)
GAL 120B (zero-shot)
Accuracy
43
# 1
Multiple Choice Question Answering (MCQA)
MMLU (College Mathematics)
Chinchilla (few-shot, k=5)
Accuracy
32
# 4
Multiple Choice Question Answering (MCQA)
MMLU (College Mathematics)
Gopher (few-shot, k=5)
Accuracy
37
# 2
Multiple Choice Question Answering (MCQA)
MMLU (College Physics)
OPT (few-shot, k=5)
Accuracy
21.6
# 4
Multiple Choice Question Answering (MCQA)
MMLU (College Physics)
GAL 120B (zero-shot)
Accuracy
42.2
# 2
Multiple Choice Question Answering (MCQA)
MMLU (College Physics)
Chinchilla (few-shot, k=5)
Accuracy
46.1
# 1
Multiple Choice Question Answering (MCQA)
MMLU (College Physics)
BLOOM (few-shot, k=5)
Accuracy
18.6
# 5
Multiple Choice Question Answering (MCQA)
MMLU (College Physics)
Gopher (few-shot, k=5)
Accuracy
34.3
# 3
Multiple Choice Question Answering (MCQA)
MMLU (Econometrics)
Chinchilla (few-shot, k=5)
Accuracy
38.6
# 3
Multiple Choice Question Answering (MCQA)
MMLU (Econometrics)
Gopher (few-shot, k=5)
Accuracy
43
# 1
Multiple Choice Question Answering (MCQA)
MMLU (Econometrics)
GAL 120B (zero-shot)
Accuracy
42.1
# 2
Multiple Choice Question Answering (MCQA)
MMLU (Econometrics)
BLOOM (few-shot, k=5)
Accuracy
23.7
# 4
Multiple Choice Question Answering (MCQA)
MMLU (Econometrics)
OPT (few-shot, k=5)
Accuracy
21
# 5
Multiple Choice Question Answering (MCQA)
MMLU (Electrical Engineer)
Gopher (few-shot, k=5)
Accuracy
60
# 3
Multiple Choice Question Answering (MCQA)
MMLU (Electrical Engineer)
BLOOM (few-shot, k=5)
Accuracy
32.4
# 5
Multiple Choice Question Answering (MCQA)
MMLU (Electrical Engineer)
Chinchilla (few-shot, k=5)
Accuracy
62.1
# 2
Multiple Choice Question Answering (MCQA)
MMLU (Electrical Engineer)
GAL 120B (zero-shot)
Accuracy
62.8
# 1
Multiple Choice Question Answering (MCQA)
MMLU (Electrical Engineer)
OPT (few-shot, k=5)
Accuracy
36.6
# 4
Multiple Choice Question Answering (MCQA)
MMLU (Elementary Mathematics)
Gopher (few-shot, k=5)
Accuracy
33.6
# 3
Multiple Choice Question Answering (MCQA)
MMLU (Elementary Mathematics)
BLOOM (few-shot, k=5)
Accuracy
27.6
# 4
Multiple Choice Question Answering (MCQA)
MMLU (Elementary Mathematics)
Chinchilla (few-shot, k=5)
Accuracy
41.5
# 1
Multiple Choice Question Answering (MCQA)
MMLU (Elementary Mathematics)
OPT (few-shot, k=5)
Accuracy
25.7
# 5
Multiple Choice Question Answering (MCQA)
MMLU (Elementary Mathematics)
GAL 120B (zero-shot)
Accuracy
38.1
# 2
Multiple Choice Question Answering (MCQA)
MMLU (Formal Logic)
BLOOM (few-shot, k=5)
Accuracy
26.2
# 5
Multiple Choice Question Answering (MCQA)
MMLU (Formal Logic)
Gopher (few-shot, k=5)
Accuracy
35.7
# 1
Multiple Choice Question Answering (MCQA)
MMLU (Formal Logic)
GAL 120B (zero-shot)
Accuracy
32.5
# 3
Multiple Choice Question Answering (MCQA)
MMLU (Formal Logic)
OPT (few-shot, k=5)
Accuracy
29.4
# 4
Multiple Choice Question Answering (MCQA)
MMLU (Formal Logic)
Chinchilla (few-shot, k=5)
Accuracy
33.3
# 2
Multiple Choice Question Answering (MCQA)
MMLU (High School Biology)
OPT (few-shot, k=5)
Accuracy
27.7
# 5
Multiple Choice Question Answering (MCQA)
MMLU (High School Biology)
BLOOM (few-shot, k=5)
Accuracy
29.4
# 4
Multiple Choice Question Answering (MCQA)
MMLU (High School Biology)
Gopher (few-shot, k=5)
Accuracy
71.3
# 2
Multiple Choice Question Answering (MCQA)
MMLU (High School Biology)
Chinchilla (few-shot, k=5)
Accuracy
80.3
# 1
Multiple Choice Question Answering (MCQA)
MMLU (High School Biology)
GAL 120B (zero-shot)
Accuracy
69.4
# 3
Multiple Choice Question Answering (MCQA)
MMLU (High School Chemistry)
OPT (few-shot, k=5)
Accuracy
21.7
# 4
Multiple Choice Question Answering (MCQA)
MMLU (High School Chemistry)
BLOOM (few-shot, k=5)
Accuracy
23.2
# 3
Multiple Choice Question Answering (MCQA)
MMLU (High School Chemistry)
Chinchilla (few-shot, k=5)
Accuracy
58.1
# 1
Multiple Choice Question Answering (MCQA)
MMLU (High School Chemistry)
GAL 120B (zero-shot)
Accuracy
47.8
# 2
Multiple Choice Question Answering (MCQA)
MMLU (High School Computer Science)
GAL 120B (zero-shot)
Accuracy
70
# 1
Multiple Choice Question Answering (MCQA)
MMLU (High School Computer Science)
BLOOM (few-shot, k=5)
Accuracy
25
# 5
Multiple Choice Question Answering (MCQA)
MMLU (High School Computer Science)
Gopher (few-shot, k=5)
Accuracy
54
# 3
Multiple Choice Question Answering (MCQA)
MMLU (High School Computer Science)
OPT (few-shot, k=5)
Accuracy
30
# 4
Multiple Choice Question Answering (MCQA)
MMLU (High School Computer Science)
Chinchilla (few-shot, k=5)
Accuracy
58
# 2
Multiple Choice Question Answering (MCQA)
MMLU (High School Mathematics)
Chinchilla (few-shot, k=5)
Accuracy
31.9
# 2
Multiple Choice Question Answering (MCQA)
MMLU (High School Mathematics)
GAL 120B (zero-shot)
Accuracy
32.6
# 1
Multiple Choice Question Answering (MCQA)
MMLU (High School Mathematics)
Gopher (few-shot, k=5)
Accuracy
23.7
# 5
Multiple Choice Question Answering (MCQA)
MMLU (High School Mathematics)
BLOOM (few-shot, k=5)
Accuracy
27
# 3
Multiple Choice Question Answering (MCQA)
MMLU (High School Mathematics)
OPT (few-shot, k=5)
Accuracy
24.4
# 4
Multiple Choice Question Answering (MCQA)
MMLU (High School Physics)
OPT (few-shot, k=5)
Accuracy
29.8
# 3
Multiple Choice Question Answering (MCQA)
MMLU (High School Physics)
Chinchilla (few-shot, k=5)
Accuracy
36.4
# 1
Multiple Choice Question Answering (MCQA)
MMLU (High School Physics)
BLOOM (few-shot, k=5)
Accuracy
25.2
# 4
Multiple Choice Question Answering (MCQA)
MMLU (High School Physics)
GAL 120B (zero-shot)
Accuracy
33.8
# 2
Multiple Choice Question Answering (MCQA)
MMLU (High School Statistics)
Chinchilla (few-shot, k=5)
Accuracy
58.8
# 1
Multiple Choice Question Answering (MCQA)
MMLU (High School Statistics)
BLOOM (few-shot, k=5)
Accuracy
19.4
# 5
Multiple Choice Question Answering (MCQA)
MMLU (High School Statistics)
GAL 120B (zero-shot)
Accuracy
41.2
# 4
Multiple Choice Question Answering (MCQA)
MMLU (High School Statistics)
OPT (few-shot, k=5)
Accuracy
43.5
# 3
Multiple Choice Question Answering (MCQA)
MMLU (High School Statistics)
Gopher (few-shot, k=5)
Accuracy
50
# 2
Multiple Choice Question Answering (MCQA)
MMLU (Machine Learning)
Chinchilla (few-shot, k=5)
Accuracy
41.1
# 1
Multiple Choice Question Answering (MCQA)
MMLU (Machine Learning)
GAL 120B (zero-shot)
Accuracy
38.4
# 2
Multiple Choice Question Answering (MCQA)
MMLU (Machine Learning)
BLOOM (few-shot, k=5)
Accuracy
25
# 4
Multiple Choice Question Answering (MCQA)
MMLU (Machine Learning)
OPT (few-shot, k=5)
Accuracy
28.6
# 3
Mathematical Reasoning
MMLU (Mathematics)
GAL 1.3B
Accuracy
27.1
# 10
Mathematical Reasoning
MMLU (Mathematics)
Chinchilla (5-shot)
Accuracy
35.7
# 4
Mathematical Reasoning
MMLU (Mathematics)
GAL 30B <work>
Accuracy
37.1
# 2
Mathematical Reasoning
MMLU (Mathematics)
GAL 6.7B <work>
Accuracy
28
# 9
Mathematical Reasoning
MMLU (Mathematics)
GAL 1.3B <work>
Accuracy
24.6
# 13
Mathematical Reasoning
MMLU (Mathematics)
GAL 120B
Accuracy
35.8
# 3
Mathematical Reasoning
MMLU (Mathematics)
GAL 30B
Accuracy
29.9
# 7
Mathematical Reasoning
MMLU (Mathematics)
BLOOM (5-shot)
Accuracy
26.4
# 12
Mathematical Reasoning
MMLU (Mathematics)
OPT (5-shot)
Accuracy
26.7
# 11
Mathematical Reasoning
MMLU (Mathematics)
GAL 6.7B
Accuracy
29.2
# 8
Mathematical Reasoning
MMLU (Mathematics)
GAL 120B <work>
Accuracy
41.3
# 1
Mathematical Reasoning
MMLU (Mathematics)
Gopher (5-shot)
Accuracy
30.6
# 6
Multiple Choice Question Answering (MCQA)
MMLU (Medical Genetics)
BLOOM (few-shot, k=5)
Accuracy
36
# 7
Multiple Choice Question Answering (MCQA)
MMLU (Medical Genetics)
Chinchilla (few-shot, k=5)
Accuracy
69
# 5
Multiple Choice Question Answering (MCQA)
MMLU (Medical Genetics)
GAL 30B (zero-shot)
Accuracy
70
# 4
Multiple Choice Question Answering (MCQA)
MMLU (Medical Genetics)
GAL 120B (zero-shot)
Accuracy
68
# 6
Multiple Choice Question Answering (MCQA)
MMLU (Medical Genetics)
OPT (few-shot, k=5)
Accuracy
35
# 8
Molecular Property Prediction
MoleculeNet
GAL 125M
AUC
0.581
# 5
Molecular Property Prediction
MoleculeNet
GAL 30B
AUC
0.69
# 2
Molecular Property Prediction
MoleculeNet
GAL 1.3B
AUC
0.619
# 4
Molecular Property Prediction
MoleculeNet
Uni-Mol
AUC
0.77
# 1
Molecular Property Prediction
MoleculeNet
GAL 6.7B
AUC
0.64
# 3
Protein Function Prediction
PaenSeq
GAL 120B
ROUGE-L
0.272
# 1
Protein Structure Prediction
PaenSeq
GAL 6.7B
Validation perplexity
7.76
# 3
Protein Function Prediction
PaenSeq
GAL 125M
ROUGE-L
0.073
# 5
Protein Function Prediction
PaenSeq
GAL 1.3B
ROUGE-L
0.084
# 4
Protein Structure Prediction
PaenSeq
GAL 30B
Validation perplexity
4.28
# 2
Protein Function Prediction
PaenSeq
GAL 6.7B
ROUGE-L
0.137
# 3
Protein Structure Prediction
PaenSeq
GAL 125M
Validation perplexity
16.35
# 5
Protein Function Prediction
PaenSeq
GAL 30B
ROUGE-L
0.196
# 2
Protein Annotation
PaenSeq
GAL 1.3B
F1 score
0.26
# 4
Protein Annotation
PaenSeq
GAL 125M
F1 score
0.093
# 5
Protein Structure Prediction
PaenSeq
GAL 1.3B
Validation perplexity
12.53
# 4
Protein Annotation
PaenSeq
GAL 120B
F1 score
0.545
# 1
Protein Annotation
PaenSeq
GAL 6.7B
F1 score
0.333
# 3
Protein Structure Prediction
PaenSeq
GAL 120B
Validation perplexity
3.14
# 1
Protein Annotation
PaenSeq
GAL 30B
F1 score
0.426
# 2
Question Answering
PubMedQA
GAL 120B (zero-shot)
Accuracy
77.6
# 8
Question Answering
PubMedQA
OPT (zero-shot)
Accuracy
70.2
# 18
Question Answering
PubMedQA
BLOOM (zero-shot)
Accuracy
73.6
# 15
Citation Prediction
PWC Citations
Sparse Retriever
Accuracy
30.9
# 4
Citation Prediction
PWC Citations
Dense Retriever (fine-tuned)
Accuracy
27.6
# 5
Citation Prediction
PWC Citations
Dense Retriever
Accuracy
16.4
# 7
Citation Prediction
PWC Citations
GAL 120B
Accuracy
51.9
# 1
Citation Prediction
PWC Citations
GAL 30B
Accuracy
44.7
# 2
Citation Prediction
PWC Citations
GAL 6.7B
Accuracy
32
# 3
Citation Prediction
PWC Citations
GAL 1.3B
Accuracy
18.5
# 6
Citation Prediction
PWC Citations
GAL 125M
Accuracy
7
# 8
Molecular Property Prediction
SIDER
GAL 6.7B
ROC-AUC
55.9
# 15
Molecular Property Prediction
SIDER
GAL 1.3B
ROC-AUC
54.0
# 17
Molecular Property Prediction
SIDER
GAL 120B
ROC-AUC
63.2
# 11
Molecular Property Prediction
SIDER
GAL 30B
ROC-AUC
61.3
# 13
Molecular Property Prediction
SIDER
GAL 125M
ROC-AUC
55.9
# 15
Bias Detection
StereoSet
GPT-3 (text-davinci-002)
ICAT Score
60.8
# 10
Bias Detection
StereoSet
GPT-3 (text-davinci-002)
LMS
77.6
# 1
Bias Detection
StereoSet
GPT-3 (text-davinci-002)
SS
60.8
# 1
Bias Detection
StereoSet
OPT 175B
ICAT Score
60
# 11
Bias Detection
StereoSet
OPT 175B
LMS
74.8
# 3
Bias Detection
StereoSet
OPT 175B
SS
59.9
# 2
Bias Detection
StereoSet
GAL 120B
ICAT Score
65.6
# 8
Bias Detection
StereoSet
GAL 120B
LMS
75
# 2
Bias Detection
StereoSet
GAL 120B
SS
56.2
# 3
Molecular Property Prediction
Tox21
GAL 125M
ROC-AUC
54.3
# 17
Molecular Property Prediction
Tox21
GAL 6.7B
ROC-AUC
63.9
# 15
Molecular Property Prediction
Tox21
GAL 1.3B
ROC-AUC
60.6
# 16
Molecular Property Prediction
Tox21
GAL 120B
ROC-AUC
68.9
# 13
Molecular Property Prediction
Tox21
GAL 30B
ROC-AUC
68.5
# 14
Molecular Property Prediction
Tox21
Uni-Mol
ROC-AUC
79.6
# 3
Question Answering
TruthfulQA
GAL 125M
MC1
0.19
# 20
Question Answering
TruthfulQA
OPT 175B
MC1
0.21
# 17
Question Answering
TruthfulQA
GAL 6.7B
MC1
0.19
# 20
Question Answering
TruthfulQA
GAL 30B
MC1
0.24
# 12
Question Answering
TruthfulQA
GAL 1.3B
MC1
0.19
# 20
Question Answering
TruthfulQA
GAL 120B
MC1
0.26
# 10
Protein Annotation
UniProtSeq
GAL 1.3B
F1 score
0.219
# 4
Protein Annotation
UniProtSeq
GAL 120B
F1 score
0.487
# 1
Protein Annotation
UniProtSeq
GAL 125M
F1 score
0.152
# 5
Protein Structure Prediction
UniProtSeq
GAL 120B
Validation perplexity
5.54
# 1
Protein Annotation
UniProtSeq
GAL 30B
F1 score
0.408
# 2
Protein Structure Prediction
UniProtSeq
GAL 30B
Validation perplexity
8.23
# 2
Protein Structure Prediction
UniProtSeq
GAL 6.7B
Validation perplexity
11.58
# 3
Protein Structure Prediction
UniProtSeq
GAL 1.3B
Validation perplexity
15.82
# 4
Protein Function Prediction
UniProtSeq
GAL 120B
ROUGE-L
0.252
# 1
Protein Structure Prediction
UniProtSeq
GAL 125M
Validation perplexity
19.05
# 5
Protein Function Prediction
UniProtSeq
GAL 125M
ROUGE-L
0.061
# 5
Protein Function Prediction
UniProtSeq
GAL 1.3B
ROUGE-L
0.079
# 4
Protein Function Prediction
UniProtSeq
GAL 6.7B
ROUGE-L
0.111
# 3
Protein Function Prediction
UniProtSeq
GAL 30B
ROUGE-L
0.186
# 2
Protein Annotation
UniProtSeq
GAL 6.7B
F1 score
0.251
# 3