What It Takes to Achieve 100% Condition Accuracy on WikiSQL

EMNLP 2018 · Semih Yavuz, Izzeddin Gur, Yu Su, Xifeng Yan

WikiSQL is a newly released dataset for studying the problem of translating natural language sequences to SQL. The SQL queries in WikiSQL are simple: each involves one relation and has no join operation. Despite this simplicity, none of the publicly reported structured query generation models achieves an accuracy beyond 62%, which is still far from enough for practical use. In this paper, we ask two questions: "Why is the accuracy still low for such simple queries?" and "What does it take to achieve 100% accuracy on WikiSQL?" To limit the scope of our study, we focus on the WHERE clause in SQL. The answers will help us gain insights into the directions we should explore in order to further improve translation accuracy. We then investigate alternative solutions to realize the potential ceiling performance on WikiSQL. Our proposed solution can reach up to 88.6% condition accuracy on the WikiSQL dataset.
