TUR2SQL: A Cross-Domain Turkish Dataset For Text-to-SQL

8th International Conference on Computer Science and Engineering (UBMK) 2023 · Ali Buğra Kanburoğlu, F. Boray Tek

Converting natural language into corresponding SQL queries with deep learning techniques has attracted significant attention in recent years. Existing Text-to-SQL datasets primarily cover English and a few other languages such as Chinese, and resources for Turkish are lacking. In this study, we introduce TUR2SQL, the first publicly available cross-domain Turkish Text-to-SQL dataset. It consists of 10,809 pairs of natural language statements and their corresponding SQL queries. We conducted experiments with SQLNet and ChatGPT on TUR2SQL. The results show that SQLNet achieves limited performance on the dataset, whereas ChatGPT performs better. We believe that TUR2SQL provides a foundation for further exploration and advancement in Turkish Text-to-SQL research.

Code

alibugra/TUR2SQL

Tasks

Text-To-SQL

Datasets

Introduced in the Paper:

TUR2SQL
