A BERT Baseline for the Natural Questions

24 Jan 2019 | Chris Alberti, Kenton Lee, Michael Collins

This technical note describes a new baseline for the Natural Questions. Our model is based on BERT and reduces the gap between the model F1 scores reported in the original dataset paper and the human upper bound by 30% and 50% relative for the long and short answer tasks, respectively...
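
One way to read the "relative" gap reduction above is as the fraction of the baseline-to-human F1 gap that the new model closes. The following is a minimal sketch of that arithmetic under this assumed reading. The function name and the input scores are hypothetical placeholders, not figures from the paper.

```python
def relative_gap_reduction(baseline_f1: float, model_f1: float, human_f1: float) -> float:
    """Fraction of the baseline-to-human F1 gap closed by the model.

    Assumed definition (not taken from the paper):
        (model_f1 - baseline_f1) / (human_f1 - baseline_f1)
    """
    gap = human_f1 - baseline_f1
    if gap <= 0:
        raise ValueError("human_f1 must be greater than baseline_f1")
    return (model_f1 - baseline_f1) / gap


# Hypothetical scores chosen only to illustrate the calculation.
example = relative_gap_reduction(baseline_f1=50.0, model_f1=62.0, human_f1=90.0)
print(f"{example:.0%} of the gap closed")
```

With these placeholder inputs, the model gains 12 F1 points out of a 40-point gap to the human score.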

Evaluation results from the paper


Task                Dataset            Model       Metric name  Metric value  Global rank
Question Answering  Natural Questions  BERT-joint  F1 (Long)    66.2          #1
Question Answering  Natural Questions  BERT-joint  F1 (Short)   52.1          #1