KorSTS is a dataset for semantic textural similarity (STS) in Korean. The dataset is constructed by automatically the STS-B dataset. To ensure translation quality, two professional translators with at least seven years of experience who specialize in academic papers/books as well as business contracts post-edited a half of the dataset each and cross-checked each other’s translation afterward. The KorSTS dataset comprises 5,749 training examples translated automatically and 2,879 evaluation examples translated manually.

Source: https://github.com/kakaobrain/KorNLUDatasets

Papers


Paper Code Results Date Stars

Tasks


Similar Datasets


License


  • Unknown

Modalities


Languages