no code implementations • EACL (VarDial) 2021 • Gabriel Bernier-Colborne, Serge Leger, Cyril Goutte
We describe the systems developed by the National Research Council Canada for the Uralic language identification shared task at the 2021 VarDial evaluation campaign.
no code implementations • VarDial (COLING) 2020 • Gabriel Bernier-Colborne, Cyril Goutte
We describe the systems developed by the National Research Council Canada for the Uralic language identification shared task at the 2020 VarDial evaluation campaign.
no code implementations • VarDial (COLING) 2022 • Gabriel Bernier-Colborne, Serge Leger, Cyril Goutte
We describe the systems developed by the National Research Council Canada for the French Cross-Domain Dialect Identification shared task at the 2022 VarDial evaluation campaign.
no code implementations • AMTA 2022 • Shivendra Bhardwaj, David Alfonso-Hermelo, Philippe Langlais, Gabriel Bernier-Colborne, Cyril Goutte, Michel Simard
While recent studies have been dedicated to cleaning very noisy parallel corpora to improve Machine Translation training, we focus in this work on filtering a large and mostly clean Translation Memory.
no code implementations • COLING 2020 • Shivendra Bhardwaj, David Alfonso Hermelo, Philippe Langlais, Gabriel Bernier-Colborne, Cyril Goutte, Michel Simard
Deep neural models have tremendously improved machine translation.
no code implementations • WS 2019 • Gabriel Bernier-Colborne, Cyril Goutte, Serge Léger
We describe the systems developed by the National Research Council Canada for the Cuneiform Language Identification (CLI) shared task at the 2019 VarDial evaluation campaign.
no code implementations • WS 2018 • Chi-kiu Lo, Michel Simard, Darlene Stewart, Samuel Larkin, Cyril Goutte, Patrick Littell
We present our semantic textual similarity approach to filtering a noisy web-crawled parallel corpus using YiSi, a novel semantic machine translation evaluation metric.
no code implementations • WS 2018 • Patrick Littell, Samuel Larkin, Darlene Stewart, Michel Simard, Cyril Goutte, Chi-kiu Lo
The WMT18 shared task on parallel corpus filtering (Koehn et al., 2018b) challenged teams to score sentence pairs from a large high-recall, low-precision web-scraped parallel corpus (Koehn et al., 2018a).
no code implementations • COLING 2018 • Yunli Wang, Cyril Goutte
Detecting changes within an unfolding event in real time from news articles or social media makes it possible to react promptly to serious issues in public safety, public health, or natural disasters.
no code implementations • WS 2017 • Cyril Goutte, Serge Léger
We mainly explored the use of voting, and various ways to optimize the choice and number of voting systems.
no code implementations • WS 2017 • Yunli Wang, Cyril Goutte
Detecting events from social media data has important applications in public security, political issues, and public health.
no code implementations • COLING 2016 • Yunli Wang, Yong Jin, Xiaodan Zhu, Cyril Goutte
We show that such knowledge can be used to construct better discriminative keyphrase extraction systems that do not assume a static, fixed set of keyphrases for a document.
no code implementations • WS 2016 • Cyril Goutte, Serge Léger
We describe the systems entered by the National Research Council in the 2016 shared task on discriminating similar languages.
no code implementations • LREC 2016 • Cyril Goutte, Serge Léger, Shervin Malmasi, Marcos Zampieri
We present an analysis of the performance of machine learning classifiers on discriminating between similar languages and language varieties.
no code implementations • NeurIPS 2009 • Massih Amini, Nicolas Usunier, Cyril Goutte
We assume the existence of view generating functions which may complete the missing views in an approximate way.