no code implementations • LREC 2020 • Arya D. McCarthy, Rachel Wicks, Dylan Lewis, Aaron Mueller, Winston Wu, Oliver Adams, Garrett Nicolai, Matt Post, David Yarowsky
The corpus consists of over 4000 unique translations of the Christian Bible and counting.
no code implementations • LREC 2020 • Garrett Nicolai, Dylan Lewis, Arya D. McCarthy, Aaron Mueller, Winston Wu, David Yarowsky
Exploiting the broad translation of the Bible into the world's languages, we train and distribute morphosyntactic tools for approximately one thousand languages, vastly outstripping previous distributions of tools devoted to the processing of inflectional morphology.
no code implementations • LREC 2020 • Aaron Mueller, Garrett Nicolai, Arya D. McCarthy, Dylan Lewis, Winston Wu, David Yarowsky
We find that best practices in this domain are highly language-specific: adding more languages to a training set is often better, but too many harms performance; the best number depends on the source language.
no code implementations • COLING 2020 • Dylan Lewis, Winston Wu, Arya D. McCarthy, David Yarowsky
We present a method for completing multilingual translation dictionaries.