no code implementations • RANLP 2017 • Mihaela Colhon, Cătălina Mărănduc, Cătălin Mititelu
The UAIC-RoDia-DepTb is a balanced treebank containing texts in non-standard language: 2,575 chat sentences, old Romanian texts (a Gospel printed in 1648, a codex of laws printed in 1818, a novel written in 1910), regional popular poetry, legal texts, Romanian and foreign fiction, and quotations.
no code implementations • RANLP 2017 • Victoria Bobicev, Cătălina Mărănduc, Cenel Augusto Perez
We try to preserve data about endangered idioms such as Aromanian, Meglenoromanian and Istroromanian dialects, and calculate the distance between different regional variants, including the language spoken in the Republic of Moldova.
no code implementations • RANLP 2017 • Ludmila Malahov, Cătălina Mărănduc, Alexandru Colesnicov
But the first edition of the New Testament is written in Cyrillic letters.
Information Retrieval • Optical Character Recognition (OCR)