Dataset for Post-OCR text correction in Sanskrit

Introduced by Maheshwari et al. in A Benchmark and Dataset for Post-OCR text correction in Sanskrit

This dataset, designed for post-OCR text correction, contains around 218K sentences (about 1.5 million words) drawn from 30 different books.

Source: https://arxiv.org/pdf/2211.07980v1.pdf