| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Question Answering | BoolQ | RoBERTa-large 355M + Entailment as Few-shot Learner | Accuracy | 86.0 | # 14 |
| Linguistic Acceptability | CoLA | RoBERTa-large 355M + Entailment as Few-shot Learner | Accuracy | 86.4 | # 5 |
| Sentiment Analysis | CR | RoBERTa-large 355M + Entailment as Few-shot Learner | Accuracy | 92.5 | # 3 |
| Sentiment Analysis | IMDb | RoBERTa-large 355M + Entailment as Few-shot Learner | Accuracy | 96.1 | # 5 |
| Sentiment Analysis | MPQA | RoBERTa-large 355M + Entailment as Few-shot Learner | Accuracy | 90.8 | # 1 |
| Sentiment Analysis | MR | RoBERTa-large 355M + Entailment as Few-shot Learner | Accuracy | 92.5 | # 2 |
| Semantic Textual Similarity | MRPC | RoBERTa-large 355M + Entailment as Few-shot Learner | F1 | 91.0 | # 8 |
| Topic Classification | OS | RoBERTa-large 355M + Entailment as Few-shot Learner | Accuracy | 95.1 | # 1 |
| Natural Language Inference | QNLI | RoBERTa-large 355M + Entailment as Few-shot Learner | Accuracy | 94.5 | # 15 |
| Paraphrase Identification | Quora Question Pairs | RoBERTa-large 355M + Entailment as Few-shot Learner | F1 | 89.2 | # 2 |
| Natural Language Inference | RTE | RoBERTa-large 355M + EFL + UCA | Accuracy | 87.2 | # 21 |
| Natural Language Inference | RTE | RoBERTa-large 355M + Entailment as Few-shot Learner | Accuracy | 90.5 | # 15 |
| Natural Language Inference | SNLI | EFL (Entailment as Few-shot Learner) + RoBERTa-large | % Test Accuracy | 93.1 | # 1 |
| Natural Language Inference | SNLI | EFL (Entailment as Few-shot Learner) + RoBERTa-large | % Train Accuracy | ? | # 74 |
| Natural Language Inference | SNLI | EFL (Entailment as Few-shot Learner) + RoBERTa-large | Parameters | 355M | # 4 |
| Natural Language Inference | SNLI | RoBERTa-large 355M + Entailment as Few-shot Learner | % Test Accuracy | 93.1 | # 1 |
| Natural Language Inference | SNLI | RoBERTa-large 355M + Entailment as Few-shot Learner | Parameters | 355M | # 1 |
| Sentiment Analysis | SST-2 Binary classification | RoBERTa-large 355M + Entailment as Few-shot Learner | Accuracy | 96.9 | # 8 |
| Semantic Textual Similarity | STS Benchmark | RoBERTa-large 355M + Entailment as Few-shot Learner | Pearson Correlation | 0.918 | # 11 |
| Subjectivity Analysis | SUBJ | RoBERTa-large 355M + Entailment as Few-shot Learner | Accuracy | 97.1 | # 3 |