Search Results for author: Charangan Vasantharajan

Found 5 papers, 2 papers with code

Adapting the Tesseract Open-Source OCR Engine for Tamil and Sinhala Legacy Fonts and Creating a Parallel Corpus for Tamil-Sinhala-English

1 code implementation • 13 Sep 2021 • Charangan Vasantharajan, Laksika Tharmalingam, Uthayasanker Thayasivam

Since Tamil and Sinhala are low-resource languages, we improved the performance of Tesseract by employing LSTM-based training on more than 20 legacy fonts to recognize printed characters in these languages.

Optical Character Recognition (OCR)
