The Chinese Spelling Correction Dataset for errors generated by pinyin IME (CSCD-IME) is a dataset of 40,000 annotated sentences drawn from real posts by official media accounts on Sina Weibo. It is designed for detecting and correcting spelling mistakes in Chinese text.
Source: CSCD-IME: Correcting Spelling Errors Generated by Pinyin IME