Search Results for author: Michel Simard

Found 12 papers, 0 papers with code

Like Chalk and Cheese? On the Effects of Translationese in MT Training

no code implementations MTSummit 2021 Samuel Larkin, Michel Simard, Rebecca Knowles

We revisit the topic of translation direction in the data used for training neural machine translation systems, focusing on a real-world scenario with known translation direction and imbalances in translation direction: the Canadian Hansard.

Machine Translation, Translation

Refining an Almost Clean Translation Memory Helps Machine Translation

no code implementations AMTA 2022 Shivendra Bhardwaj, David Alfonso-Hermelo, Philippe Langlais, Gabriel Bernier-Colborne, Cyril Goutte, Michel Simard

While recent studies have been dedicated to cleaning very noisy parallel corpora to improve Machine Translation training, we focus in this work on filtering a large and mostly clean Translation Memory.

Machine Translation, Translation

Automatic Text Simplification of News Articles in the Context of Public Broadcasting

no code implementations 26 Dec 2022 Diego Maupomé, Fanny Rancourt, Thomas Soulas, Alexandre Lachance, Marie-Jean Meurs, Desislava Aleksandrova, Olivier Brochu Dufour, Igor Pontes, Rémi Cardon, Michel Simard, Sowmya Vajjala

This report summarizes the work carried out by the authors during the Twelfth Montreal Industrial Problem Solving Workshop, held at Université de Montréal in August 2022.

Text Simplification

Fully Unsupervised Crosslingual Semantic Textual Similarity Metric Based on BERT for Identifying Parallel Data

no code implementations CONLL 2019 Chi-kiu Lo, Michel Simard

With the advent of massively multilingual context representation models such as BERT, which are trained on the concatenation of non-parallel data from each language, we show that the deadlock around parallel resources can be broken.

Machine Translation, Natural Language Understanding, +3

Measuring sentence parallelism using Mahalanobis distances: The NRC unsupervised submissions to the WMT18 Parallel Corpus Filtering shared task

no code implementations WS 2018 Patrick Littell, Samuel Larkin, Darlene Stewart, Michel Simard, Cyril Goutte, Chi-kiu Lo

The WMT18 shared task on parallel corpus filtering (Koehn et al., 2018b) challenged teams to score sentence pairs from a large high-recall, low-precision web-scraped parallel corpus (Koehn et al., 2018a).

Anomaly Detection, Machine Translation, +1
