1 code implementation • 20 Oct 2022 • Faheem Kirefu, Vivek Iyer, Pinzhen Chen, Laurie Burchell
For subtask 1 we explored the effects of constrained decoding on English and transliterated subwords in order to produce Hinglish.
2 code implementations • ACL 2020 • Marta Bañón, Pin-zhen Chen, Barry Haddow, Kenneth Heafield, Hieu Hoang, Miquel Esplà-Gomis, Mikel L. Forcada, Amir Kamran, Faheem Kirefu, Philipp Koehn, Sergio Ortiz Rojas, Leopoldo Pla Sempere, Gema Ramírez-Sánchez, Elsa Sarrías, Marek Strelec, Brian Thompson, William Waites, Dion Wiggins, Jaume Zaragoza
We report on methods to create the largest publicly available parallel corpora by crawling the web, using open source software.
1 code implementation • ACL 2020 • Pin-zhen Chen, Nikolay Bogoychev, Kenneth Heafield, Faheem Kirefu
We present a novel method to extract parallel sentences from two monolingual corpora, using neural machine translation.
2 code implementations • 27 Jan 2020 • Barry Haddow, Faheem Kirefu
Parallel text is required for building high-quality machine translation (MT) systems, as well as for other multilingual NLP applications.
no code implementations • WS 2019 • Rachel Bawden, Nikolay Bogoychev, Ulrich Germann, Roman Grundkiewicz, Faheem Kirefu, Antonio Valerio Miceli Barone, Alexandra Birch
For all translation directions, we created or used back-translations of monolingual data in the target language as additional synthetic training data.