no code implementations • 26 Apr 2019 • Jean-Baptiste Camps, Gilles Guilhem Couffignal
At a time when the quantity of - more or less freely - available data is increasing significantly, thanks to digital corpora, editions or libraries, the development of data mining tools or deep learning methods allows researchers to build a corpus of study tailored for their research, to enrich their data and to exploit them. Open optical character recognition (OCR) tools can be adapted to old prints, incunabula or even manuscripts, with usable results, allowing the rapid creation of textual corpora.