no code implementations • WS 2019 • Claudia Matos Veliz, Orphee De Clercq, Veronique Hoste
One of the most persistent characteristics of written user-generated content (UGC) is the use of non-standard words.
no code implementations • RANLP 2019 • Claudia Matos Veliz, Orphee De Clercq, Veronique Hoste
Regarding NMT, we find that the translations (or normalizations) produced by this model are far from perfect, and that for a low-resource language like Dutch, adding more training data works better than artificially augmenting the data.