Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task

EMNLP 2018 · Tao Yu, Rui Zhang, Kai Yang, Michihiro Yasunaga, Dongxu Wang, Zifan Li, James Ma, Irene Li, Qingning Yao, Shanelle Roman, Zilin Zhang, Dragomir Radev

We present Spider, a large-scale, complex and cross-domain semantic parsing and text-to-SQL dataset annotated by 11 college students. It consists of 10,181 questions and 5,693 unique complex SQL queries on 200 databases with multiple tables, covering 138 different domains...

Evaluation results from the paper

Task              Dataset  Model  Metric name                  Metric value  Global rank
Semantic Parsing  spider   -      Exact Set Matching Accuracy  19.7          # 1