| TASK | DATASET | MODEL | METRIC NAME | METRIC VALUE | GLOBAL RANK |
|---|---|---|---|---|---|
| Knowledge Probing | AminoProbe | GAL 120B (zero-shot) | Accuracy | 21 | #1 |
| Knowledge Probing | AminoProbe | GAL 125M (zero-shot) | Accuracy | 12 | #7 |
| Knowledge Probing | AminoProbe | GAL 1.3B (zero-shot) | Accuracy | 16 | #4 |
| Knowledge Probing | AminoProbe | GPT-3 (text-davinci-002) (zero-shot) | Accuracy | 14 | #5 |
| Knowledge Probing | AminoProbe | BLOOM (zero-shot) | Accuracy | 14 | #5 |
| Knowledge Probing | AminoProbe | GAL 30B (zero-shot) | Accuracy | 21 | #1 |
| Knowledge Probing | AminoProbe | OPT (zero-shot) | Accuracy | 12 | #7 |
| Knowledge Probing | AminoProbe | GAL 6.7B (zero-shot) | Accuracy | 17 | #3 |
| Common Sense Reasoning | ARC (Challenge) | GPT-3 (zero-shot) | Accuracy | 51.4 | #31 |
| Common Sense Reasoning | ARC (Challenge) | GAL 120B (zero-shot) | Accuracy | 67.9 | #18 |
| Common Sense Reasoning | ARC (Challenge) | BLOOM (few-shot, k=5) | Accuracy | 32.9 | #48 |
| Common Sense Reasoning | ARC (Challenge) | OPT (few-shot, k=5) | Accuracy | 31.1 | #50 |
| Common Sense Reasoning | ARC (Easy) | BLOOM (5-shot) | Accuracy | 40.7 | #43 |
| Common Sense Reasoning | ARC (Easy) | GPT-3 (zero-shot) | Accuracy | 68.8 | #37 |
| Common Sense Reasoning | ARC (Easy) | GAL 120B (0-shot) | Accuracy | 83.8 | #9 |
| Common Sense Reasoning | ARC (Easy) | OPT (5-shot) | Accuracy | 37.4 | #45 |
| Molecular Property Prediction | BACE | GAL 1.3B | ROC-AUC | 57.6 | #17 |
| Molecular Property Prediction | BACE | GAL 120B | ROC-AUC | 61.7 | #15 |
| Molecular Property Prediction | BACE | GAL 6.7B | ROC-AUC | 58.4 | #16 |
| Molecular Property Prediction | BACE | GAL 30B | ROC-AUC | 72.7 | #14 |
| Molecular Property Prediction | BACE | GAL 125M | ROC-AUC | 56.1 | #18 |
| Molecular Property Prediction | BBBP | GAL 125M | ROC-AUC | 39.3 | #27 |
| Molecular Property Prediction | BBBP | GAL 30B | ROC-AUC | 59.6 | #25 |
| Molecular Property Prediction | BBBP | GAL 6.7B | ROC-AUC | 53.5 | #26 |
| Molecular Property Prediction | BBBP | GAL 1.3B | ROC-AUC | 60.4 | #24 |
| Molecular Property Prediction | BBBP | Uni-Mol | ROC-AUC | 72.9 | #13 |
| Molecular Property Prediction | BBBP | GAL 120B | ROC-AUC | 66.1 | #23 |
| Word Sense Disambiguation | BIG-bench (Anachronisms) | OPT 175B | Accuracy | 49.1 | #3 |
| Word Sense Disambiguation | BIG-bench (Anachronisms) | GAL 120B (few-shot, k=5) | Accuracy | 48.7 | #4 |
| Word Sense Disambiguation | BIG-bench (Anachronisms) | GAL 30B (few-shot, k=5) | Accuracy | 47.0 | #5 |
| Word Sense Disambiguation | BIG-bench (Anachronisms) | BLOOM 176B | Accuracy | 1.3 | #6 |
| Question Answering | BioASQ | OPT (zero-shot) | Accuracy | 81.4 | #7 |
| Question Answering | BioASQ | BLOOM (zero-shot) | Accuracy | 91.4 | #3 |
| Question Answering | BioASQ | GAL 120B (zero-shot) | Accuracy | 94.3 | #2 |
| Knowledge Probing | BioLAMA | BLOOM (zero-shot) | Accuracy | 9.7 | #1 |
| Knowledge Probing | BioLAMA | GAL 6.7B | Accuracy | 7.9 | #4 |
| Knowledge Probing | BioLAMA | GAL 30B | Accuracy | 6.9 | #7 |
| Knowledge Probing | BioLAMA | OPT (zero-shot) | Accuracy | 7.1 | #6 |
| Knowledge Probing | BioLAMA | GAL 120B | Accuracy | 8.0 | #3 |
| Knowledge Probing | BioLAMA | GAL 1.3B | Accuracy | 7.2 | #5 |
| Knowledge Probing | BioLAMA | GPT-3 (text-davinci-002) (zero-shot) | Accuracy | 8.4 | #2 |
| Knowledge Probing | BioLAMA | GAL 125M | Accuracy | 3.1 | #8 |
| Protein Structure Prediction | CASPSeq | GAL 1.3B | Validation perplexity | 17.58 | #4 |
| Protein Structure Prediction | CASPSeq | GAL 120B | Validation perplexity | 17.26 | #1 |
| Protein Structure Prediction | CASPSeq | GAL 125M | Validation perplexity | 20.62 | #5 |
| Protein Structure Prediction | CASPSeq | GAL 30B | Validation perplexity | 17.27 | #2 |
| Protein Structure Prediction | CASPSeq | GAL 6.7B | Validation perplexity | 17.29 | #3 |
| Protein Annotation | CASPSimSeq | GAL 30B | F1 score | 0.22 | #1 |
| Protein Annotation | CASPSimSeq | GAL 120B | F1 score | 0.219 | #2 |
| Protein Annotation | CASPSimSeq | GAL 6.7B | F1 score | 0.184 | #3 |
| Protein Annotation | CASPSimSeq | GAL 1.3B | F1 score | 0.174 | #4 |
| Protein Structure Prediction | CASPSimSeq | GAL 120B | Validation perplexity | 12.77 | #1 |
| Protein Structure Prediction | CASPSimSeq | GAL 30B | Validation perplexity | 15.42 | #2 |
| Protein Structure Prediction | CASPSimSeq | GAL 1.3B | Validation perplexity | 17.04 | #4 |
| Protein Structure Prediction | CASPSimSeq | GAL 6.7B | Validation perplexity | 16.35 | #3 |
| Protein Structure Prediction | CASPSimSeq | GAL 125M | Validation perplexity | 19.18 | #5 |
| Protein Annotation | CASPSimSeq | GAL 125M | F1 score | 0.105 | #5 |
| Protein Function Prediction | CASPSimSeq | GAL 120B | ROUGE-L | 0.252 | #1 |
| Protein Function Prediction | CASPSimSeq | GAL 6.7B | ROUGE-L | 0.109 | #3 |
| Protein Function Prediction | CASPSimSeq | GAL 30B | ROUGE-L | 0.137 | #2 |
| Protein Function Prediction | CASPSimSeq | GAL 125M | ROUGE-L | 0.062 | #5 |
| Protein Function Prediction | CASPSimSeq | GAL 1.3B | ROUGE-L | 0.069 | #4 |
| Knowledge Probing | Chemical Reactions | GAL 1.3B | Accuracy | 14.4 | #6 |
| Knowledge Probing | Chemical Reactions | BLOOM (zero-shot) | Accuracy | 22.4 | #5 |
| Knowledge Probing | Chemical Reactions | OPT (zero-shot) | Accuracy | 12.7 | #7 |
| Knowledge Probing | Chemical Reactions | GAL 6.7B | Accuracy | 26.4 | #4 |
| Knowledge Probing | Chemical Reactions | GAL 125M | Accuracy | 0.3 | #8 |
| Knowledge Probing | Chemical Reactions | GPT-3 (text-davinci-002) (zero-shot) | Accuracy | 35.1 | #3 |
| Knowledge Probing | Chemical Reactions | GAL 30B | Accuracy | 36.5 | #2 |
| Knowledge Probing | Chemical Reactions | GAL 120B | Accuracy | 43.1 | #1 |
| Molecular Property Prediction | ClinTox | GAL 120B | ROC-AUC | 82.6 | #9 |
| Molecular Property Prediction | ClinTox | GAL 120B | Molecules (M) | 2 | #6 |
| Molecular Property Prediction | ClinTox | GAL 1.3B | ROC-AUC | 58.9 | #16 |
| Molecular Property Prediction | ClinTox | GAL 1.3B | Molecules (M) | 2 | #6 |
| Molecular Property Prediction | ClinTox | GAL 6.7B | ROC-AUC | 78.4 | #12 |
| Molecular Property Prediction | ClinTox | GAL 6.7B | Molecules (M) | 2 | #6 |
| Molecular Property Prediction | ClinTox | GAL 30B | ROC-AUC | 82.2 | #10 |
| Molecular Property Prediction | ClinTox | GAL 30B | Molecules (M) | 2 | #6 |
| Molecular Property Prediction | ClinTox | GAL 125M | ROC-AUC | 51.8 | #18 |
| Molecular Property Prediction | ClinTox | GAL 125M | Molecules (M) | 2 | #6 |
| Citation Prediction | Contextual Citations | GAL 30B | Accuracy | 31.5 | #2 |
| Citation Prediction | Contextual Citations | GAL 6.7B | Accuracy | 23 | #3 |
| Citation Prediction | Contextual Citations | GAL 1.3B | Accuracy | 15.9 | #4 |
| Citation Prediction | Contextual Citations | GAL 125M | Accuracy | 7.1 | #6 |
| Citation Prediction | Contextual Citations | Dense Retriever (fine-tuned) | Accuracy | 8.2 | #5 |
| Citation Prediction | Contextual Citations | Dense Retriever | Accuracy | 1.6 | #8 |
| Citation Prediction | Contextual Citations | Sparse Retriever | Accuracy | 5.3 | #7 |
| Citation Prediction | Contextual Citations | GAL 120B | Accuracy | 36.6 | #1 |
| Stereotypical Bias Analysis | CrowS-Pairs | GAL 120B | Gender | 51.9 | #1 |
| Stereotypical Bias Analysis | CrowS-Pairs | GAL 120B | Religion | 51.9 | #1 |
| Stereotypical Bias Analysis | CrowS-Pairs | GAL 120B | Race/Color | 59.9 | #2 |
| Stereotypical Bias Analysis | CrowS-Pairs | GAL 120B | Sexual Orientation | 77.4 | #2 |
| Stereotypical Bias Analysis | CrowS-Pairs | GAL 120B | Age | 69 | #3 |
| Stereotypical Bias Analysis | CrowS-Pairs | GAL 120B | Nationality | 51.6 | #1 |
| Stereotypical Bias Analysis | CrowS-Pairs | GAL 120B | Disability | 66.7 | #1 |
| Stereotypical Bias Analysis | CrowS-Pairs | GAL 120B | Physical Appearance | 58.7 | #1 |
| Stereotypical Bias Analysis | CrowS-Pairs | GAL 120B | Socioeconomic Status | 65.7 | #1 |
| Stereotypical Bias Analysis | CrowS-Pairs | GAL 120B | Overall | 60.5 | #4 |
| Citation Prediction | Extended Citations | GAL 120B | Accuracy | 69.1 | #1 |
| Citation Prediction | Extended Citations | Dense Retriever | Accuracy | 8.8 | #7 |
| Citation Prediction | Extended Citations | Dense Retriever (fine-tuned) | Accuracy | 11.8 | #6 |
| Citation Prediction | Extended Citations | GAL 30B | Accuracy | 66.4 | #2 |
| Citation Prediction | Extended Citations | GAL 1.3B | Accuracy | 45.5 | #4 |
| Citation Prediction | Extended Citations | GAL 6.7B | Accuracy | 60 | #3 |
| Citation Prediction | Extended Citations | Sparse Retriever | Accuracy | 17.3 | #5 |
| Citation Prediction | Extended Citations | GAL 125M | Accuracy | 6.4 | #8 |
| Knowledge Probing | Galaxy Clusters | GPT-3 (text-davinci-002) (zero-shot) | Accuracy | 20.8 | #3 |
| Knowledge Probing | Galaxy Clusters | GAL 125M | Accuracy | 6.7 | #8 |
| Knowledge Probing | Galaxy Clusters | GAL 1.3B | Accuracy | 14.2 | #7 |
| Knowledge Probing | Galaxy Clusters | GAL 120B | Accuracy | 24.2 | #1 |
| Knowledge Probing | Galaxy Clusters | GAL 30B | Accuracy | 20 | #4 |
| Knowledge Probing | Galaxy Clusters | GAL 6.7B | Accuracy | 17.5 | #5 |
| Knowledge Probing | Galaxy Clusters | OPT (zero-shot) | Accuracy | 21.7 | #2 |
| Knowledge Probing | Galaxy Clusters | BLOOM (zero-shot) | Accuracy | 15 | #6 |
| Molecular Property Prediction | HIV dataset | GAL 1.3B | AUC | 0.724 | #9 |
| Molecular Property Prediction | HIV dataset | GAL 125M | AUC | 0.702 | #11 |
| Molecular Property Prediction | HIV dataset | GAL 30B | AUC | 0.759 | #7 |
| Molecular Property Prediction | HIV dataset | GAL 120B | AUC | 0.745 | #8 |
| Molecular Property Prediction | HIV dataset | GAL 6.7B | AUC | 0.722 | #10 |
| Molecular Property Prediction | HIV dataset | Uni-Mol | AUC | 0.808 | #2 |
| IUPAC Name Prediction | IUPAC | GAL 1.3B | Accuracy | 2.5 | #4 |
| IUPAC Name Prediction | IUPAC | GAL 120B | Accuracy | 39.2 | #1 |
| IUPAC Name Prediction | IUPAC | GAL 125M | Accuracy | 0 | #5 |
| IUPAC Name Prediction | IUPAC | GAL 30B | Accuracy | 15.4 | #2 |
| IUPAC Name Prediction | IUPAC | GAL 6.7B | Accuracy | 10.7 | #3 |
| Knowledge Probing | LaTeX Equations | BLOOM (zero-shot) | Accuracy | 21.4 | #5 |
| Knowledge Probing | LaTeX Equations | GAL 6.7B (zero-shot) | Accuracy | 41.7 | #4 |
| Knowledge Probing | LaTeX Equations | GAL 1.3B (zero-shot) | Accuracy | 20.5 | #6 |
| Knowledge Probing | LaTeX Equations | GAL 30B (zero-shot) | Accuracy | 51.5 | #2 |
| Knowledge Probing | LaTeX Equations | GAL 120B (zero-shot) | Accuracy | 68.2 | #1 |
| Knowledge Probing | LaTeX Equations | GAL 125M (zero-shot) | Accuracy | 0.5 | #8 |
| Knowledge Probing | LaTeX Equations | OPT (zero-shot) | Accuracy | 8.9 | #7 |
| Knowledge Probing | LaTeX Equations | GPT-3 (text-davinci-002) (zero-shot) | Accuracy | 49 | #3 |
| Math Word Problem Solving | MATH | GAL 120B (5-shot) mCoT | Accuracy | 20.4 | #101 |
| Math Word Problem Solving | MATH | GAL 120B (5-shot) mCoT | Parameters (Billions) | 120 | #8 |
| Math Word Problem Solving | MATH | GAL 30B `<work>` | Accuracy | 11.4 | #112 |
| Math Word Problem Solving | MATH | GAL 30B `<work>` | Parameters (Billions) | 30 | #42 |
| Math Word Problem Solving | MATH | GAL 120B `<work>` | Accuracy | 16.6 | #105 |
| Math Word Problem Solving | MATH | GAL 120B `<work>` | Parameters (Billions) | 120 | #8 |
| Math Word Problem Solving | MATH | GPT-3 175B (8-shot) | Accuracy | 5.2 | #126 |
| Math Word Problem Solving | MATH | GPT-3 175B (8-shot) | Parameters (Billions) | 175 | #5 |
| Math Word Problem Solving | MATH | GAL 30B (5-shot) mCoT | Accuracy | 12.7 | #110 |
| Math Word Problem Solving | MATH | GAL 30B (5-shot) mCoT | Parameters (Billions) | 30 | #42 |
| Math Word Problem Solving | MATH | PaLM 540B (5-shot) mCoT | Accuracy | 8.8 | #115 |
| Math Word Problem Solving | MATH | PaLM 540B (5-shot) mCoT | Parameters (Billions) | 540 | #1 |
| Math Word Problem Solving | MATH | Minerva 540B (5-shot) mCoT | Accuracy | 33.6 | #80 |
| Math Word Problem Solving | MATH | Minerva 540B (5-shot) mCoT | Parameters (Billions) | 540 | #1 |
| Multiple Choice Question Answering (MCQA) | MedMCQA | GAL 120B (zero-shot) | Dev Set (Acc-%) | 0.529 | #8 |
| Multiple Choice Question Answering (MCQA) | MedMCQA | BLOOM (few-shot, k=5) | Dev Set (Acc-%) | 0.325 | #16 |
| Multiple Choice Question Answering (MCQA) | MedMCQA | OPT (few-shot, k=5) | Dev Set (Acc-%) | 0.296 | #17 |
| Question Answering | MedQA | OPT (few-shot, k=5) | Accuracy | 22.8 | #26 |
| Question Answering | MedQA | GAL 120B (zero-shot) | Accuracy | 44.4 | #19 |
| Question Answering | MedQA | BLOOM (few-shot, k=5) | Accuracy | 23.3 | #25 |
| Knowledge Probing | Mineral Groups | GAL 30B | Accuracy | 17.5 | #3 |
| Knowledge Probing | Mineral Groups | GAL 1.3B | Accuracy | 10.3 | #4 |
| Knowledge Probing | Mineral Groups | BLOOM (zero-shot) | Accuracy | 10.3 | #4 |
| Knowledge Probing | Mineral Groups | GPT-3 (text-davinci-002) (zero-shot) | Accuracy | 18.3 | #2 |
| Knowledge Probing | Mineral Groups | OPT (zero-shot) | Accuracy | 1.6 | #7 |
| Knowledge Probing | Mineral Groups | GAL 120B | Accuracy | 29.4 | #1 |
| Knowledge Probing | Mineral Groups | GAL 125M | Accuracy | 0.0 | #8 |
| Knowledge Probing | Mineral Groups | GAL 6.7B | Accuracy | 8.7 | #6 |
| Multi-task Language Understanding | MMLU | GAL 120B (zero-shot) | Average (%) | 52.6 | #69 |
| Multiple Choice Question Answering (MCQA) | MMLU (Abstract Algebra) | Gopher (few-shot, k=5) | Accuracy | 25 | #4 |
| Multiple Choice Question Answering (MCQA) | MMLU (Abstract Algebra) | GAL 30B (zero-shot) | Accuracy | 33.3 | #1 |
| Multiple Choice Question Answering (MCQA) | MMLU (Abstract Algebra) | GAL 120B (zero-shot) | Accuracy | 27 | #3 |
| Multiple Choice Question Answering (MCQA) | MMLU (Abstract Algebra) | OPT (few-shot, k=5) | Accuracy | 21 | #5 |
| Multiple Choice Question Answering (MCQA) | MMLU (Abstract Algebra) | Chinchilla (few-shot, k=5) | Accuracy | 31 | #2 |
| Multiple Choice Question Answering (MCQA) | MMLU (Astronomy) | Chinchilla (few-shot, k=5) | Accuracy | 73.0 | #1 |
| Multiple Choice Question Answering (MCQA) | MMLU (Astronomy) | Gopher (few-shot, k=5) | Accuracy | 65.8 | #2 |
| Multiple Choice Question Answering (MCQA) | MMLU (Astronomy) | BLOOM (few-shot, k=5) | Accuracy | 25.7 | #4 |
| Multiple Choice Question Answering (MCQA) | MMLU (Astronomy) | GAL 120B (zero-shot) | Accuracy | 65.1 | #3 |
| Multiple Choice Question Answering (MCQA) | MMLU (Astronomy) | OPT (few-shot, k=5) | Accuracy | 23.0 | #5 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Biology) | Gopher (few-shot, k=5) | Accuracy | 70.8 | #5 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Biology) | OPT (few-shot, k=5) | Accuracy | 30.6 | #7 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Biology) | GAL 120B (zero-shot) | Accuracy | 68.8 | #6 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Biology) | Chinchilla (few-shot, k=5) | Accuracy | 79.9 | #4 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Biology) | BLOOM (few-shot, k=5) | Accuracy | 28.5 | #8 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Chemistry) | Gopher (few-shot, k=5) | Accuracy | 45 | #3 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Chemistry) | Chinchilla (few-shot, k=5) | Accuracy | 51 | #1 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Chemistry) | BLOOM (few-shot, k=5) | Accuracy | 19 | #5 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Chemistry) | GAL 120B (zero-shot) | Accuracy | 46 | #2 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Chemistry) | OPT (few-shot, k=5) | Accuracy | 30 | #4 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Computer Science) | BLOOM (few-shot, k=5) | Accuracy | 6.0 | #4 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Computer Science) | Chinchilla (few-shot, k=5) | Accuracy | 51.0 | #1 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Computer Science) | GAL 120B (zero-shot) | Accuracy | 49 | #2 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Computer Science) | OPT (few-shot, k=5) | Accuracy | 17.0 | #3 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Mathematics) | Chinchilla (few-shot, k=5) | Accuracy | 32 | #4 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Mathematics) | GAL 120B (zero-shot) | Accuracy | 43 | #1 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Mathematics) | BLOOM (few-shot, k=5) | Accuracy | 25 | #5 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Mathematics) | Gopher (few-shot, k=5) | Accuracy | 37 | #2 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Mathematics) | OPT (few-shot, k=5) | Accuracy | 33 | #3 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Physics) | OPT (few-shot, k=5) | Accuracy | 21.6 | #4 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Physics) | Chinchilla (few-shot, k=5) | Accuracy | 46.1 | #1 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Physics) | BLOOM (few-shot, k=5) | Accuracy | 18.6 | #5 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Physics) | Gopher (few-shot, k=5) | Accuracy | 34.3 | #3 |
| Multiple Choice Question Answering (MCQA) | MMLU (College Physics) | GAL 120B (zero-shot) | Accuracy | 42.2 | #2 |
| Multiple Choice Question Answering (MCQA) | MMLU (Econometrics) | OPT (few-shot, k=5) | Accuracy | 21 | #5 |
| Multiple Choice Question Answering (MCQA) | MMLU (Econometrics) | Chinchilla (few-shot, k=5) | Accuracy | 38.6 | #3 |
| Multiple Choice Question Answering (MCQA) | MMLU (Econometrics) | GAL 120B (zero-shot) | Accuracy | 42.1 | #2 |
| Multiple Choice Question Answering (MCQA) | MMLU (Econometrics) | BLOOM (few-shot, k=5) | Accuracy | 23.7 | #4 |
| Multiple Choice Question Answering (MCQA) | MMLU (Econometrics) | Gopher (few-shot, k=5) | Accuracy | 43 | #1 |
| Multiple Choice Question Answering (MCQA) | MMLU (Electrical Engineer) | GAL 120B (zero-shot) | Accuracy | 62.8 | #1 |
| Multiple Choice Question Answering (MCQA) | MMLU (Electrical Engineer) | OPT (few-shot, k=5) | Accuracy | 36.6 | #4 |
| Multiple Choice Question Answering (MCQA) | MMLU (Electrical Engineer) | BLOOM (few-shot, k=5) | Accuracy | 32.4 | #5 |
| Multiple Choice Question Answering (MCQA) | MMLU (Electrical Engineer) | Gopher (few-shot, k=5) | Accuracy | 60 | #3 |
| Multiple Choice Question Answering (MCQA) | MMLU (Electrical Engineer) | Chinchilla (few-shot, k=5) | Accuracy | 62.1 | #2 |
| Multiple Choice Question Answering (MCQA) | MMLU (Elementary Mathematics) | Gopher (few-shot, k=5) | Accuracy | 33.6 | #3 |
| Multiple Choice Question Answering (MCQA) | MMLU (Elementary Mathematics) | GAL 120B (zero-shot) | Accuracy | 38.1 | #2 |
| Multiple Choice Question Answering (MCQA) | MMLU (Elementary Mathematics) | Chinchilla (few-shot, k=5) | Accuracy | 41.5 | #1 |
| Multiple Choice Question Answering (MCQA) | MMLU (Elementary Mathematics) | OPT (few-shot, k=5) | Accuracy | 25.7 | #5 |
| Multiple Choice Question Answering (MCQA) | MMLU (Elementary Mathematics) | BLOOM (few-shot, k=5) | Accuracy | 27.6 | #4 |
| Multiple Choice Question Answering (MCQA) | MMLU (Formal Logic) | GAL 120B (zero-shot) | Accuracy | 32.5 | #3 |
| Multiple Choice Question Answering (MCQA) | MMLU (Formal Logic) | Gopher (few-shot, k=5) | Accuracy | 35.7 | #1 |
| Multiple Choice Question Answering (MCQA) | MMLU (Formal Logic) | Chinchilla (few-shot, k=5) | Accuracy | 33.3 | #2 |
| Multiple Choice Question Answering (MCQA) | MMLU (Formal Logic) | BLOOM (few-shot, k=5) | Accuracy | 26.2 | #5 |
| Multiple Choice Question Answering (MCQA) | MMLU (Formal Logic) | OPT (few-shot, k=5) | Accuracy | 29.4 | #4 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Biology) | BLOOM (few-shot, k=5) | Accuracy | 29.4 | #4 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Biology) | Chinchilla (few-shot, k=5) | Accuracy | 80.3 | #1 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Biology) | GAL 120B (zero-shot) | Accuracy | 69.4 | #3 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Biology) | Gopher (few-shot, k=5) | Accuracy | 71.3 | #2 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Biology) | OPT (few-shot, k=5) | Accuracy | 27.7 | #5 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Chemistry) | OPT (few-shot, k=5) | Accuracy | 21.7 | #4 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Chemistry) | BLOOM (few-shot, k=5) | Accuracy | 23.2 | #3 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Chemistry) | Chinchilla (few-shot, k=5) | Accuracy | 58.1 | #1 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Chemistry) | GAL 120B (zero-shot) | Accuracy | 47.8 | #2 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Computer Science) | Gopher (few-shot, k=5) | Accuracy | 54 | #3 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Computer Science) | BLOOM (few-shot, k=5) | Accuracy | 25 | #5 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Computer Science) | Chinchilla (few-shot, k=5) | Accuracy | 58 | #2 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Computer Science) | GAL 120B (zero-shot) | Accuracy | 70 | #1 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Computer Science) | OPT (few-shot, k=5) | Accuracy | 30 | #4 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Mathematics) | Gopher (few-shot, k=5) | Accuracy | 23.7 | #5 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Mathematics) | Chinchilla (few-shot, k=5) | Accuracy | 31.9 | #2 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Mathematics) | BLOOM (few-shot, k=5) | Accuracy | 27 | #3 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Mathematics) | OPT (few-shot, k=5) | Accuracy | 24.4 | #4 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Mathematics) | GAL 120B (zero-shot) | Accuracy | 32.6 | #1 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Physics) | BLOOM (few-shot, k=5) | Accuracy | 25.2 | #4 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Physics) | Chinchilla (few-shot, k=5) | Accuracy | 36.4 | #1 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Physics) | OPT (few-shot, k=5) | Accuracy | 29.8 | #3 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Physics) | GAL 120B (zero-shot) | Accuracy | 33.8 | #2 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Statistics) | GAL 120B (zero-shot) | Accuracy | 41.2 | #4 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Statistics) | Gopher (few-shot, k=5) | Accuracy | 50 | #2 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Statistics) | BLOOM (few-shot, k=5) | Accuracy | 19.4 | #5 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Statistics) | OPT (few-shot, k=5) | Accuracy | 43.5 | #3 |
| Multiple Choice Question Answering (MCQA) | MMLU (High School Statistics) | Chinchilla (few-shot, k=5) | Accuracy | 58.8 | #1 |
| Multiple Choice Question Answering (MCQA) | MMLU (Machine Learning) | Chinchilla (few-shot, k=5) | Accuracy | 41.1 | #1 |
| Multiple Choice Question Answering (MCQA) | MMLU (Machine Learning) | BLOOM (few-shot, k=5) | Accuracy | 25 | #4 |
| Multiple Choice Question Answering (MCQA) | MMLU (Machine Learning) | OPT (few-shot, k=5) | Accuracy | 28.6 | #3 |
| Multiple Choice Question Answering (MCQA) | MMLU (Machine Learning) | GAL 120B (zero-shot) | Accuracy | 38.4 | #2 |
| Mathematical Reasoning | MMLU (Mathematics) | OPT (5-shot) | Accuracy | 26.7 | #11 |
| Mathematical Reasoning | MMLU (Mathematics) | BLOOM (5-shot) | Accuracy | 26.4 | #12 |
| Mathematical Reasoning | MMLU (Mathematics) | GAL 1.3B | Accuracy | 27.1 | #10 |
| Mathematical Reasoning | MMLU (Mathematics) | GAL 6.7B `<work>` | Accuracy | 28 | #9 |
| Mathematical Reasoning | MMLU (Mathematics) | GAL 30B | Accuracy | 29.9 | #7 |
| Mathematical Reasoning | MMLU (Mathematics) | GAL 120B | Accuracy | 35.8 | #3 |
| Mathematical Reasoning | MMLU (Mathematics) | GAL 30B `<work>` | Accuracy | 37.1 | #2 |
| Mathematical Reasoning | MMLU (Mathematics) | GAL 1.3B `<work>` | Accuracy | 24.6 | #13 |
| Mathematical Reasoning | MMLU (Mathematics) | Gopher (5-shot) | Accuracy | 30.6 | #6 |
| Mathematical Reasoning | MMLU (Mathematics) | GAL 120B `<work>` | Accuracy | 41.3 | #1 |
| Mathematical Reasoning | MMLU (Mathematics) | Chinchilla (5-shot) | Accuracy | 35.7 | #4 |
| Mathematical Reasoning | MMLU (Mathematics) | GAL 6.7B | Accuracy | 29.2 | #8 |
| Multiple Choice Question Answering (MCQA) | MMLU (Medical Genetics) | BLOOM (few-shot, k=5) | Accuracy | 36 | #7 |
| Multiple Choice Question Answering (MCQA) | MMLU (Medical Genetics) | GAL 120B (zero-shot) | Accuracy | 68 | #6 |
| Multiple Choice Question Answering (MCQA) | MMLU (Medical Genetics) | GAL 30B (zero-shot) | Accuracy | 70 | #4 |
| Multiple Choice Question Answering (MCQA) | MMLU (Medical Genetics) | Chinchilla (few-shot, k=5) | Accuracy | 69 | #5 |
| Multiple Choice Question Answering (MCQA) | MMLU (Medical Genetics) | OPT (few-shot, k=5) | Accuracy | 35 | #8 |
| Molecular Property Prediction | MoleculeNet | Uni-Mol | AUC | 0.77 | #1 |
| Molecular Property Prediction | MoleculeNet | GAL 6.7B | AUC | 0.64 | #3 |
| Molecular Property Prediction | MoleculeNet | GAL 30B | AUC | 0.69 | #2 |
| Molecular Property Prediction | MoleculeNet | GAL 1.3B | AUC | 0.619 | #4 |
| Molecular Property Prediction | MoleculeNet | GAL 125M | AUC | 0.581 | #5 |
| Protein Structure Prediction | PaenSeq | GAL 1.3B | Validation perplexity | 12.53 | #4 |
| Protein Function Prediction | PaenSeq | GAL 6.7B | ROUGE-L | 0.137 | #3 |
| Protein Function Prediction | PaenSeq | GAL 1.3B | ROUGE-L | 0.084 | #4 |
| Protein Function Prediction | PaenSeq | GAL 125M | ROUGE-L | 0.073 | #5 |
| Protein Function Prediction | PaenSeq | GAL 30B | ROUGE-L | 0.196 | #2 |
| Protein Function Prediction | PaenSeq | GAL 120B | ROUGE-L | 0.272 | #1 |
| Protein Structure Prediction | PaenSeq | GAL 125M | Validation perplexity | 16.35 | #5 |
| Protein Structure Prediction | PaenSeq | GAL 6.7B | Validation perplexity | 7.76 | #3 |
| Protein Structure Prediction | PaenSeq | GAL 30B | Validation perplexity | 4.28 | #2 |
| Protein Structure Prediction | PaenSeq | GAL 120B | Validation perplexity | 3.14 | #1 |
| Protein Annotation | PaenSeq | GAL 125M | F1 score | 0.093 | #5 |
| Protein Annotation | PaenSeq | GAL 1.3B | F1 score | 0.26 | #4 |
| Protein Annotation | PaenSeq | GAL 6.7B | F1 score | 0.333 | #3 |
| Protein Annotation | PaenSeq | GAL 30B | F1 score | 0.426 | #2 |
| Protein Annotation | PaenSeq | GAL 120B | F1 score | 0.545 | #1 |
| Question Answering | PubMedQA | BLOOM (zero-shot) | Accuracy | 73.6 | #19 |
| Question Answering | PubMedQA | GAL 120B (zero-shot) | Accuracy | 77.6 | #9 |
| Question Answering | PubMedQA | OPT (zero-shot) | Accuracy | 70.2 | #22 |
| Citation Prediction | PWC Citations | Sparse Retriever | Accuracy | 30.9 | #4 |
| Citation Prediction | PWC Citations | GAL 120B | Accuracy | 51.9 | #1 |
| Citation Prediction | PWC Citations | GAL 30B | Accuracy | 44.7 | #2 |
| Citation Prediction | PWC Citations | GAL 1.3B | Accuracy | 18.5 | #6 |
| Citation Prediction | PWC Citations | GAL 125M | Accuracy | 7 | #8 |
| Citation Prediction | PWC Citations | Dense Retriever (fine-tuned) | Accuracy | 27.6 | #5 |
| Citation Prediction | PWC Citations | GAL 6.7B | Accuracy | 32 | #3 |
| Citation Prediction | PWC Citations | Dense Retriever | Accuracy | 16.4 | #7 |
| Molecular Property Prediction | SIDER | GAL 125M | ROC-AUC | 55.9 | #15 |
| Molecular Property Prediction | SIDER | GAL 120B | ROC-AUC | 63.2 | #11 |
| Molecular Property Prediction | SIDER | GAL 30B | ROC-AUC | 61.3 | #13 |
| Molecular Property Prediction | SIDER | GAL 1.3B | ROC-AUC | 54.0 | #17 |
| Molecular Property Prediction | SIDER | GAL 6.7B | ROC-AUC | 55.9 | #15 |
| Bias Detection | StereoSet | OPT 175B | ICAT Score | 60 | #11 |
| Bias Detection | StereoSet | OPT 175B | LMS | 74.8 | #3 |
| Bias Detection | StereoSet | OPT 175B | SS | 59.9 | #2 |
| Bias Detection | StereoSet | GAL 120B | ICAT Score | 65.6 | #8 |
| Bias Detection | StereoSet | GAL 120B | LMS | 75 | #2 |
| Bias Detection | StereoSet | GAL 120B | SS | 56.2 | #3 |
| Bias Detection | StereoSet | GPT-3 (text-davinci-002) | ICAT Score | 60.8 | #10 |
| Bias Detection | StereoSet | GPT-3 (text-davinci-002) | LMS | 77.6 | #1 |
| Bias Detection | StereoSet | GPT-3 (text-davinci-002) | SS | 60.8 | #1 |
| TDC ADMET Benchmarking Group | tdcommons | Galactica-GAL-1.3B | TDC.BBB_Martins | 0.604 | #7 |
| TDC ADMET Benchmarking Group | tdcommons | Galactica-GAL-6.7B | TDC.BBB_Martins | 0.535 | #9 |
| TDC ADMET Benchmarking Group | tdcommons | Galactica-GAL-30B | TDC.BBB_Martins | 0.596 | #8 |
| TDC ADMET Benchmarking Group | tdcommons | Galactica-GAL-120B | TDC.BBB_Martins | 0.661 | #6 |
| TDC ADMET Benchmarking Group | tdcommons | Galactica-GAL-125M | TDC.BBB_Martins | 0.393 | #10 |
| Molecular Property Prediction | Tox21 | Uni-Mol | ROC-AUC | 79.6 | #3 |
| Molecular Property Prediction | Tox21 | GAL 120B | ROC-AUC | 68.9 | #13 |
| Molecular Property Prediction | Tox21 | GAL 6.7B | ROC-AUC | 63.9 | #15 |
| Molecular Property Prediction | Tox21 | GAL 1.3B | ROC-AUC | 60.6 | #16 |
| Molecular Property Prediction | Tox21 | GAL 125M | ROC-AUC | 54.3 | #17 |
| Molecular Property Prediction | Tox21 | GAL 30B | ROC-AUC | 68.5 | #14 |
| Question Answering | TruthfulQA | GAL 120B | MC1 | 0.26 | #10 |
| Question Answering | TruthfulQA | GAL 30B | MC1 | 0.24 | #12 |
| Question Answering | TruthfulQA | GAL 6.7B | MC1 | 0.19 | #20 |
| Question Answering | TruthfulQA | GAL 1.3B | MC1 | 0.19 | #20 |
| Question Answering | TruthfulQA | GAL 125M | MC1 | 0.19 | #20 |
| Question Answering | TruthfulQA | OPT 175B | MC1 | 0.21 | #17 |
| Protein Annotation | UniProtSeq | GAL 30B | F1 score | 0.408 | #2 |
| Protein Annotation | UniProtSeq | GAL 120B | F1 score | 0.487 | #1 |
| Protein Annotation | UniProtSeq | GAL 1.3B | F1 score | 0.219 | #4 |
| Protein Annotation | UniProtSeq | GAL 6.7B | F1 score | 0.251 | #3 |
| Protein Structure Prediction | UniProtSeq | GAL 120B | Validation perplexity | 5.54 | #1 |
| Protein Structure Prediction | UniProtSeq | GAL 125M | Validation perplexity | 19.05 | #5 |
| Protein Function Prediction | UniProtSeq | GAL 30B | ROUGE-L | 0.186 | #2 |
| Protein Annotation | UniProtSeq | GAL 125M | F1 score | 0.152 | #5 |
| Protein Function Prediction | UniProtSeq | GAL 125M | ROUGE-L | 0.061 | #5 |
| Protein Structure Prediction | UniProtSeq | GAL 30B | Validation perplexity | 8.23 | #2 |
| Protein Structure Prediction | UniProtSeq | GAL 6.7B | Validation perplexity | 11.58 | |