Search Results for author: Callum Booth

Found 1 paper, 0 papers with code

A Language Modelling Approach to Quality Assessment of OCR’ed Historical Text

No code implementations • LREC 2022 • Callum Booth, Robert Shoemaker, Robert Gaizauskas

We hypothesise and evaluate a language model-based approach for scoring the quality of OCR transcriptions in the British Library Newspapers (BLN) corpus parts 1 and 2, to identify the best quality OCR for use in further natural language processing tasks, with a wider view to link individual newspaper reports of crime in nineteenth-century London to the Digital Panopticon—a structured repository of criminal lives.

Language Modelling • Optical Character Recognition (OCR)
