no code implementations • EACL 2021 • Jonne Sälevä, Constantine Lignos
This paper evaluates the performance of several modern subword segmentation methods in a low-resource neural machine translation setting.
no code implementations • Findings (ACL) 2022 • Constantine Lignos, Nolan Holley, Chester Palen-Michel, Jonne Sälevä
We then discuss the importance of creating annotation for lower-resourced languages in a thoughtful and ethical way that includes the languages' speakers as part of the development process.
no code implementations • 4 May 2023 • Jonne Sälevä, Constantine Lignos
We introduce three simple randomized variants of byte pair encoding (BPE) and explore whether randomizing the selection of merge operations substantially affects a downstream machine translation task.
1 code implementation • 1 Apr 2021 • Jonne Sälevä, Constantine Lignos
This work supports further development of language technology for the languages of Africa by providing a Wikidata-derived resource of name lists corresponding to common entity types (person, location, and organization).
1 code implementation • NAACL (SIGTYP) 2022 • Jonne Sälevä, Constantine Lignos
We demonstrate an application of ParaNames by training a multilingual model for canonical name translation to and from English.