no code implementations • WMT (EMNLP) 2021 • Markus Freitag, Ricardo Rei, Nitika Mathur, Chi-kiu Lo, Craig Stewart, George Foster, Alon Lavie, Ondřej Bojar
Contrary to previous years’ editions, this year we acquired our own human ratings based on expert-based human evaluation via Multidimensional Quality Metrics (MQM).
no code implementations • WMT (EMNLP) 2020 • Chi-kiu Lo, Eric Joanis
The National Research Council of Canada’s team submissions to the parallel corpus filtering task at the Fifth Conference on Machine Translation are based on two key components: (1) iteratively refined statistical sentence alignments for extracting sentence pairs from document pairs and (2) a crosslingual semantic textual similarity metric based on a pretrained multilingual language model, XLM-RoBERTa, with bilingual mappings learnt from a minimal amount of clean parallel data for scoring the parallelism of the extracted sentence pairs.
no code implementations • WMT (EMNLP) 2020 • Chi-kiu Lo, Samuel Larkin
We present a study on using YiSi-2 with massive multilingual pretrained language models for machine translation (MT) reference-less evaluation.
no code implementations • WMT (EMNLP) 2020 • Chi-kiu Lo
We present an extended study on using pretrained language models and YiSi-1 for machine translation evaluation.
1 code implementation • 9 Nov 2024 • Hillary Dawkins, Isar Nejadgholi, Chi-kiu Lo
We assess the difficulty of gender resolution in literary-style dialogue settings and the influence of gender stereotypes.
no code implementations • EMNLP 2020 • Loïc Barrault, Magdalena Biesialska, Ondřej Bojar, Marta R. Costa-jussà, Christian Federmann, Yvette Graham, Roman Grundkiewicz, Barry Haddow, Matthias Huck, Eric Joanis, Tom Kocmi, Philipp Koehn, Chi-kiu Lo, Nikola Ljubešić, Christof Monz, Makoto Morishita, Masaaki Nagata, Toshiaki Nakazawa, Santanu Pal, Matt Post, Marcos Zampieri
In the news task, participants were asked to build machine translation systems for any of 11 language pairs, to be evaluated on test sets consisting mainly of news stories.
no code implementations • 11 May 2020 • Lane Schwartz, Francis Tyers, Lori Levin, Christo Kirov, Patrick Littell, Chi-kiu Lo, Emily Prud'hommeaux, Hyunji Hayley Park, Kenneth Steimel, Rebecca Knowles, Jeffrey Micher, Lonny Strunk, Han Liu, Coleman Haley, Katherine J. Zhang, Robbie Jimmerson, Vasilisa Andriyanets, Aldrian Obaja Muis, Naoki Otani, Jong Hyuk Park, Zhisong Zhang
In the literature, languages like Finnish or Turkish are held up as extreme examples of complexity that challenge common modelling assumptions.
no code implementations • LREC 2020 • Eric Joanis, Rebecca Knowles, Roland Kuhn, Samuel Larkin, Patrick Littell, Chi-kiu Lo, Darlene Stewart, Jeffrey Micher
This paper describes a newly released sentence-aligned Inuktitut–English corpus based on the proceedings of the Legislative Assembly of Nunavut, covering sessions from April 1999 to June 2017.
no code implementations • CONLL 2019 • Chi-kiu Lo, Michel Simard
With the advent of massively multilingual context representation models such as BERT, which are trained on the concatenation of non-parallel data from each language, we show that the deadlock around parallel resources can be broken.
no code implementations • WS 2019 • Patrick Littell, Chi-kiu Lo, Samuel Larkin, Darlene Stewart
We describe the neural machine translation (NMT) system developed at the National Research Council of Canada (NRC) for the Kazakh-English news translation task of the Fourth Conference on Machine Translation (WMT19).
no code implementations • WS 2019 • Gabriel Bernier-Colborne, Chi-kiu Lo
We describe the National Research Council Canada team's submissions to the parallel corpus filtering task at the Fourth Conference on Machine Translation.
no code implementations • WS 2019 • Chi-kiu Lo
We present YiSi, a unified automatic semantic machine translation quality evaluation and estimation metric for languages with different levels of available resources.
no code implementations • WS 2018 • Patrick Littell, Samuel Larkin, Darlene Stewart, Michel Simard, Cyril Goutte, Chi-kiu Lo
The WMT18 shared task on parallel corpus filtering (Koehn et al., 2018b) challenged teams to score sentence pairs from a large high-recall, low-precision web-scraped parallel corpus (Koehn et al., 2018a).
no code implementations • WS 2018 • Chi-kiu Lo, Michel Simard, Darlene Stewart, Samuel Larkin, Cyril Goutte, Patrick Littell
We present our semantic textual similarity approach to filtering a noisy web-crawled parallel corpus using YiSi, a novel semantic machine translation evaluation metric.
no code implementations • LREC 2014 • Chi-kiu Lo, Dekai Wu
In this paper we focus on (1) the inter-annotator agreement (IAA) on the semantic role alignment task and (2) the overall IAA of HMEANT.