This dataset, designed for post-OCR text correction, contains around 218K sentences (about 1.5 million words) drawn from 30 different books.