| TASK | DATASET | MODEL | METRIC NAME | METRIC VALUE | GLOBAL RANK |
|---|---|---|---|---|---|
| Natural Language Inference | ANLI test | GPT-3 | A1 | 36.8 | # 8 |
| Natural Language Inference | ANLI test | GPT-3 | A2 | 34 | # 16 |
| Natural Language Inference | ANLI test | GPT-3 | A3 | 40.2 | # 14 |
| Common Sense Reasoning | ARC (Challenge) | GPT-3 175B (1 shot) | Accuracy | 53.2 | # 17 |
| Common Sense Reasoning | ARC (Challenge) | GPT-3 175B (0 shot) | Accuracy | 51.4 | # 19 |
| Common Sense Reasoning | ARC (Easy) | GPT-3 175B (0 shot) | Accuracy | 68.8 | # 18 |
| Common Sense Reasoning | ARC (Easy) | GPT-3 175B (1 shot) | Accuracy | 71.2 | # 14 |
| Question Answering | BoolQ | GPT-3 (zero-shot) | Accuracy | 60.5 | # 32 |
| Question Answering | BoolQ | GPT-3 175B (few-shot) | Accuracy | 76.4 | # 20 |
| Natural Language Inference | CommitmentBank | GPT-3 175B (Few-Shot) | F1 | 52 | # 5 |
| Natural Language Inference | CommitmentBank | GPT-3 175B (Few-Shot) | Accuracy | 75.6 | # 7 |
| Zero-Shot Learning | COPA | GPT-3 | Accuracy | 73.0 | # 4 |
| Question Answering | COPA | GPT-3 175B (Few-Shot) | Accuracy | 92 | # 6 |
| Question Answering | CoQA | GPT-3 175B (Few-Shot) | Overall | 85 | # 1 |
| Question Answering | DROP Test | GPT-3 175B (few-Shot) | F1 | 36.5 | # 13 |
| Sentence Completion | HellaSwag | GPT-3 (zero-shot) | Accuracy | 78.9 | # 18 |
| Zero-Shot Learning | HellaSwag | GPT-3 | Accuracy | 51.0 | # 1 |
| Sentence Completion | HellaSwag | GPT-3 175B (Few-Shot) | Accuracy | 79.3 | # 15 |
| Language Modelling | LAMBADA | GPT-3 2.7B (Zero-Shot) | Accuracy | 67.1 | # 23 |
| Language Modelling | LAMBADA | GPT-3 2.7B (Zero-Shot) | Perplexity | 4.60 | # 6 |
| Language Modelling | LAMBADA | GPT-3 6.7B (Zero-Shot) | Accuracy | 70.3 | # 20 |
| Language Modelling | LAMBADA | GPT-3 6.7B (Zero-Shot) | Perplexity | 4.00 | # 5 |
| Language Modelling | LAMBADA | GPT-3 175B (Zero-Shot) | Accuracy | 76.2 | # 16 |
| Language Modelling | LAMBADA | GPT-3 175B (Zero-Shot) | Perplexity | 3.00 | # 2 |
| Language Modelling | LAMBADA | GPT-3 13B (Zero-Shot) | Accuracy | 72.5 | # 18 |
| Language Modelling | LAMBADA | GPT-3 13B (Zero-Shot) | Perplexity | 3.56 | # 3 |
| Language Modelling | LAMBADA | GPT-3 175B (Few-Shot) | Accuracy | 86.4 | # 4 |
| Language Modelling | LAMBADA | GPT-3 175B (Few-Shot) | Perplexity | 1.92 | # 1 |
| Multi-task Language Understanding | MMLU | GPT-3 (fine-tuned) | Humanities | 52.5 | # 9 |
| Multi-task Language Understanding | MMLU | GPT-3 (fine-tuned) | Average (%) | 53.9 | # 26 |
| Multi-task Language Understanding | MMLU | GPT-3 (fine-tuned) | Parameters (Billions) | 175 | # 36 |
| Multi-task Language Understanding | MMLU | GPT-3 (fine-tuned) | STEM | 41.4 | # 15 |
| Multi-task Language Understanding | MMLU | GPT-3 (fine-tuned) | Social Sciences | 63.9 | # 9 |
| Multi-task Language Understanding | MMLU | GPT-3 (fine-tuned) | Other | 57.9 | # 9 |
| Multi-task Language Understanding | MMLU | GPT-3 (fine-tuned) | Tokens (Billions) | 300 | # 7 |
| Multi-task Language Understanding | MMLU | GPT-3 175B (few-shot, k=5) | Humanities | 40.8 | # 14 |
| Multi-task Language Understanding | MMLU | GPT-3 175B (few-shot, k=5) | Parameters (Billions) | 175 | # 36 |
| Multi-task Language Understanding | MMLU | GPT-3 175B (few-shot, k=5) | Social Sciences | 50.4 | # 13 |
| Multi-task Language Understanding | MMLU | GPT-3 2.7B (few-shot, k=5) | Humanities | 24.4 | # 32 |
| Multi-task Language Understanding | MMLU | GPT-3 2.7B (few-shot, k=5) | Average (%) | 25.9 | # 58 |
| Multi-task Language Understanding | MMLU | GPT-3 2.7B (few-shot, k=5) | Parameters (Billions) | 2.7 | # 6 |
| Multi-task Language Understanding | MMLU | GPT-3 2.7B (few-shot, k=5) | STEM | 26.0 | # 35 |
| Multi-task Language Understanding | MMLU | GPT-3 2.7B (few-shot, k=5) | Social Sciences | 30.9 | # 23 |
| Multi-task Language Understanding | MMLU | GPT-3 2.7B (few-shot, k=5) | Other | 24.1 | # 31 |
| Multi-task Language Understanding | MMLU | GPT-3 175B (few-shot, k=5) | Average (%) | 43.9 | # 37 |
| Multi-task Language Understanding | MMLU | GPT-3 175B (few-shot, k=5) | STEM | 36.7 | # 20 |
| Multi-task Language Understanding | MMLU | GPT-3 175B (few-shot, k=5) | Other | 48.8 | # 13 |
| Multi-task Language Understanding | MMLU | GPT-3 6.7B (fine-tuned) | Humanities | 42.1 | # 13 |
| Multi-task Language Understanding | MMLU | GPT-3 6.7B (fine-tuned) | Average (%) | 43.2 | # 38 |
| Multi-task Language Understanding | MMLU | GPT-3 6.7B (fine-tuned) | Parameters (Billions) | 6.7 | # 9 |
| Multi-task Language Understanding | MMLU | GPT-3 6.7B (fine-tuned) | STEM | 35.1 | # 24 |
| Multi-task Language Understanding | MMLU | GPT-3 6.7B (fine-tuned) | Social Sciences | 49.2 | # 14 |
| Multi-task Language Understanding | MMLU | GPT-3 6.7B (fine-tuned) | Other | 46.9 | # 14 |
| Multi-task Language Understanding | MMLU | GPT-3 13B (few-shot, k=5) | Humanities | 27.1 | # 28 |
| Multi-task Language Understanding | MMLU | GPT-3 13B (few-shot, k=5) | Average (%) | 26 | # 57 |
| Multi-task Language Understanding | MMLU | GPT-3 13B (few-shot, k=5) | Parameters (Billions) | 13 | # 20 |
| Multi-task Language Understanding | MMLU | GPT-3 13B (few-shot, k=5) | STEM | 24.3 | # 40 |
| Multi-task Language Understanding | MMLU | GPT-3 13B (few-shot, k=5) | Social Sciences | 25.6 | # 29 |
| Multi-task Language Understanding | MMLU | GPT-3 13B (few-shot, k=5) | Other | 26.5 | # 27 |
| Multi-task Language Understanding | MMLU | GPT-3 6.7B (few-shot, k=5) | Humanities | 26.1 | # 30 |
| Multi-task Language Understanding | MMLU | GPT-3 6.7B (few-shot, k=5) | Average (%) | 24.9 | # 61 |
| Multi-task Language Understanding | MMLU | GPT-3 6.7B (few-shot, k=5) | Parameters (Billions) | 6.7 | # 9 |
| Multi-task Language Understanding | MMLU | GPT-3 6.7B (few-shot, k=5) | STEM | 25.6 | # 38 |
| Multi-task Language Understanding | MMLU | GPT-3 6.7B (few-shot, k=5) | Social Sciences | 21.6 | # 32 |
| Multi-task Language Understanding | MMLU | GPT-3 6.7B (few-shot, k=5) | Other | 25.5 | # 28 |
| Question Answering | MultiRC | GPT-3 175B (Few-Shot) | F1 | 75.4 | # 8 |
| Question Answering | Natural Questions | GPT-3 175B (Few-Shot, k=64) | EM | 29.9 | # 25 |
| Question Answering | OBQA | GPT-3 175B (zero-shot) | Accuracy | 57.6 | # 4 |
| Question Answering | OpenBookQA | GPT-3 175B (Few-Shot) | Accuracy | 65.4 | # 9 |
| Language Modelling | Penn Treebank (Word Level) | GPT-3 (Zero-Shot) | Test perplexity | 20.5 | # 1 |
| Language Modelling | Penn Treebank (Word Level) | GPT-3 (Zero-Shot) | Params | 175000M | # 1 |
| Zero-Shot Learning | PIQA | GPT-3 | Accuracy | 72.9 | # 2 |
| Question Answering | PIQA | GPT-3 175B (zero-shot) | Accuracy | 81.0 | # 10 |
| Question Answering | QuAC | GPT-3 175B (Few-Shot) | F1 | 44.3 | # 2 |
| Question Answering | RACE | GPT-3 175B (Few-Shot) | RACE-m | 58.1 | # 6 |
| Question Answering | RACE | GPT-3 175B (Few-Shot) | RACE-h | 46.8 | # 5 |
| Reading Comprehension | RACE | GPT-3 175B (zero-shot) | Accuracy (High) | 45.5 | # 13 |
| Reading Comprehension | RACE | GPT-3 175B (zero-shot) | Accuracy (Middle) | 58.4 | # 13 |
| Zero-Shot Learning | ReCoRD | GPT-3 | Accuracy | 82.1 | # 1 |
| Natural Language Inference | RTE | GPT-3 175B (Few-Shot) | Accuracy | 69 | # 38 |
| Zero-Shot Learning | Story Cloze | GPT-3 | Accuracy | 72.4 | # 2 |
| Question Answering | Story Cloze | GPT-3 175B (Few-Shot) | Accuracy | 87.7 | # 4 |
| Language Modelling | The Pile | GPT-3 (Zero-Shot) | Bits per byte | 0.7177 | # 3 |
| Question Answering | TriviaQA | GPT-3 175B (Few-Shot) | EM | 71.2 | # 18 |
| Question Answering | WebQuestions | GPT-3-175B (Few-Shot) | EM | 41.5 | # 8 |
| Question Answering | WebQuestions | GPT-3-175B (One-Shot) | EM | 25.3 | # 13 |
| Question Answering | WebQuestions | GPT-3-175B (Zero-Shot) | EM | 14.4 | # 17 |
| Coreference Resolution | Winograd Schema Challenge | GPT-3 175B (Few-Shot) | Accuracy | 80.1 | # 3 |
| Common Sense Reasoning | WinoGrande | GPT-3 175B (zero-shot) | Accuracy | 70.2 | # 13 |
| Zero-Shot Learning | WinoGrande | GPT-3 | Accuracy | 57.4 | # 1 |
| Unsupervised Machine Translation | WMT2014 English-French | GPT-3 175B (Few-Shot) | BLEU | 32.6 | # 5 |
| Unsupervised Machine Translation | WMT2014 French-English | GPT-3 175B (Few-Shot) | BLEU | 39.2 | # 1 |
| Unsupervised Machine Translation | WMT2016 English-German | GPT-3 175B (Few-Shot) | BLEU | 29.7 | # 1 |
| Unsupervised Machine Translation | WMT2016 English-Romanian | GPT-3 175B (Few-Shot) | BLEU | 21 | # 1 |
| Unsupervised Machine Translation | WMT2016 German-English | GPT-3 175B (Few-Shot) | BLEU | 40.6 | # 1 |
| Unsupervised Machine Translation | WMT2016 Romanian-English | GPT-3 175B (Few-Shot) | BLEU | 39.5 | # 1 |
| Word Sense Disambiguation | Words in Context | GPT-3 175B (Few-Shot) | Accuracy | 49.4 | # 11 |
| Coreference Resolution | WSC | GPT-3 175B (Few-Shot) | Accuracy | 80.1 | # 6 |