Chinese Spelling Correction Dataset for errors generated by pinyin IME (CSCD-IME), a dataset containing 40,000 annotated sentences from real posts of official media on Sina Weibo. It is designed to detect and correct spelling mistakes in Chinese texts.

Source: CSCD-IME: Correcting Spelling Errors Generated by Pinyin IME

Papers


Paper Code Results Date Stars

Dataset Loaders


No data loaders found. You can submit your data loader here.

Tasks


Similar Datasets


License


Modalities


Languages