no code implementations • JEP/TALN/RECITAL 2022 • Simon Gabay, Pedro Ortiz Suarez, Rachel Bawden, Alexandre Bartz, Philippe Gambette, Benoît Sagot
Despite their undoubted quality, the resources and tools available for the analysis of Ancien Régime French are no longer able to meet the challenges of research in linguistics and literature for this period.
1 code implementation • WMT (EMNLP) 2020 • Rachel Bawden, Giorgio Maria Di Nunzio, Cristian Grozea, Inigo Jauregi Unanue, Antonio Jimeno Yepes, Nancy Mah, David Martinez, Aurélie Névéol, Mariana Neves, Maite Oronoz, Olatz Perez-de-Viñaspre, Massimo Piccardi, Roland Roller, Amy Siu, Philippe Thomas, Federica Vezzani, Maika Vicente Navarro, Dina Wiemann, Lana Yeganova
Machine translation of scientific abstracts and terminologies has the potential to support health professionals and biomedical researchers in some of their activities.
no code implementations • WMT (EMNLP) 2020 • Nikita Moghe, Christian Hardmeier, Rachel Bawden
Our baseline systems are transformer-big models that are pre-trained on the WMT’19 News Translation task and fine-tuned on pseudo-in-domain web crawled data and in-domain task data.
no code implementations • WMT (EMNLP) 2020 • Rachel Bawden, Alexandra Birch, Radina Dobreva, Arturo Oncevay, Antonio Valerio Miceli Barone, Philip Williams
We describe the University of Edinburgh’s submissions to the WMT20 news translation shared task for the low resource language pair English-Tamil and the mid-resource language pair English-Inuktitut.
no code implementations • EAMT 2020 • António Lopes, M. Amin Farajian, Rachel Bawden, Michael Zhang, André F. T. Martins
In this paper we provide a systematic comparison of existing and new document-level neural machine translation solutions.
no code implementations • WMT (EMNLP) 2021 • Lana Yeganova, Dina Wiemann, Mariana Neves, Federica Vezzani, Amy Siu, Inigo Jauregi Unanue, Maite Oronoz, Nancy Mah, Aurélie Névéol, David Martinez, Rachel Bawden, Giorgio Maria Di Nunzio, Roland Roller, Philippe Thomas, Cristian Grozea, Olatz Perez-de-Viñaspre, Maika Vicente Navarro, Antonio Jimeno Yepes
In the sixth edition of the WMT Biomedical Task, we addressed a total of eight language pairs, namely English/German, English/French, English/Spanish, English/Portuguese, English/Chinese, English/Russian, English/Italian, and English/Basque.
no code implementations • WMT (EMNLP) 2020 • Rachel Bawden, Biao Zhang, Andre Tättar, Matt Post
We describe parBLEU, parCHRF++, and parESIM, which augment baseline metrics with automatically generated paraphrases produced by PRISM (Thompson and Post, 2020a), a multilingual neural machine translation system.
no code implementations • 24 May 2022 • Yu Lu Liu, Rachel Bawden, Thomas Scialom, Benoît Sagot, Jackie Chi Kit Cheung
In text summarization and simplification, system outputs must be evaluated along multiple dimensions such as relevance, factual consistency, fluency, and grammaticality, and a wide range of possible outputs could be of high quality.
no code implementations • 18 Feb 2022 • Simon Gabay, Pedro Ortiz Suarez, Alexandre Bartz, Alix Chagué, Rachel Bawden, Philippe Gambette, Benoît Sagot
Because these historical states are at the same time more complex to process and more scarce in the corpora available, specific efforts are necessary to train natural language processing (NLP) tools adapted to the data.
4 code implementations • ICLR 2022 • Victor Sanh, Albert Webson, Colin Raffel, Stephen H. Bach, Lintang Sutawika, Zaid Alyafeai, Antoine Chaffin, Arnaud Stiegler, Teven Le Scao, Arun Raja, Manan Dey, M Saiful Bari, Canwen Xu, Urmish Thakker, Shanya Sharma, Eliza Szczechla, Taewoon Kim, Gunjan Chhablani, Nihal Nayak, Debajyoti Datta, Jonathan Chang, Mike Tian-Jian Jiang, Han Wang, Matteo Manica, Sheng Shen, Zheng Xin Yong, Harshit Pandey, Rachel Bawden, Thomas Wang, Trishala Neeraj, Jos Rozen, Abheesht Sharma, Andrea Santilli, Thibault Fevry, Jason Alan Fries, Ryan Teehan, Tali Bers, Stella Biderman, Leo Gao, Thomas Wolf, Alexander M. Rush
Large language models have recently been shown to attain reasonable zero-shot generalization on a diverse set of tasks (Brown et al., 2020).
no code implementations • 1 Sep 2021 • Barry Haddow, Rachel Bawden, Antonio Valerio Miceli Barone, Jindřich Helcl, Alexandra Birch
We present a survey covering the state of the art in low-resource machine translation research.
1 code implementation • EACL 2021 • Farid Arthaud, Rachel Bawden, Alexandra Birch
Machine translation (MT) models used in industries with constantly changing topics, such as translation or news agencies, need to adapt to new data to maintain their performance over time.
no code implementations • LREC 2020 • Susie Coleman, Andrew Secker, Rachel Bawden, Barry Haddow, Alexandra Birch
A growth in news sources makes this increasingly challenging and time-consuming but MT can help automate some aspects of this process.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Rachel Bawden, Biao Zhang, Lisa Yankovskaya, Andre Tättar, Matt Post
We investigate a long-perceived shortcoming in the typical use of BLEU: its reliance on a single reference.
1 code implementation • LREC 2020 • Radina Dobreva, Jie Zhou, Rachel Bawden
Current approaches to machine translation (MT) either translate sentences in isolation, disregarding the context they appear in, or model context at the level of the full document, without a notion of any internal structure the document may have.
no code implementations • WS 2019 • Rachel Bawden, Kevin Bretonnel Cohen, Cristian Grozea, Antonio Jimeno Yepes, Madeleine Kittner, Martin Krallinger, Nancy Mah, Aurélie Névéol, Mariana Neves, Felipe Soares, Amy Siu, Karin Verspoor, Maika Vicente Navarro
In the fourth edition of the WMT Biomedical Translation task, we considered a total of six languages, namely Chinese (zh), English (en), French (fr), German (de), Portuguese (pt), and Spanish (es).
no code implementations • WS 2019 • Alexandra Birch, Barry Haddow, Ivan Titov, Antonio Valerio Miceli Barone, Rachel Bawden, Felipe Sánchez-Martínez, Mikel L. Forcada, Miquel Esplà-Gomis, Víctor Sánchez-Cartagena, Juan Antonio Pérez-Ortiz, Wilker Aziz, Andrew Secker, Peggy van der Kreeft
no code implementations • WS 2019 • Rachel Bawden, Nikolay Bogoychev, Ulrich Germann, Roman Grundkiewicz, Faheem Kirefu, Antonio Valerio Miceli Barone, Alexandra Birch
For all translation directions, we created or used back-translations of monolingual data in the target language as additional synthetic training data.
2 code implementations • 30 May 2019 • Rachel Bawden, Sophie Rosset, Thomas Lavergne, Eric Bilinski
We provide a preliminary analysis of the corpus to confirm that the participants' judgments reveal perceptible differences in MT quality between the two MT systems used.
no code implementations • JEP/TALN/RECITAL 2018 • Rachel Bawden, Thomas Lavergne, Sophie Rosset
In this article, we provide several approaches to the automatic identification of parallel sentences that require sentence-external linguistic context to be correctly translated.
no code implementations • NAACL 2018 • Rachel Bawden, Rico Sennrich, Alexandra Birch, Barry Haddow
Despite gains using BLEU, multi-encoder models give limited improvement in the handling of discourse phenomena: 50% accuracy on our coreference test set and 53.5% for coherence/cohesion (compared to a non-contextual baseline of 50%).
no code implementations • EMNLP 2017 • Rachel Bawden
In this paper, we address the problem of generating English tag questions (TQs) (e.g. it is, isn't it?)
no code implementations • JEP/TALN/RECITAL 2017 • Rachel Bawden
Whilst the focus of Machine Translation (MT) has for a long time been the translation of planned, written texts, more and more research is being dedicated to translating speech-like texts (informal or spontaneous discourse or dialogue).
no code implementations • COLING 2016 • Rachel Bawden, Benoît Crabbé
We present an efficient model selection method using boosting for transition-based constituency parsing.
no code implementations • JEP/TALN/RECITAL 2016 • Rachel Bawden, Guillaume Wisniewski, Hélène Maynard
In this paper we investigate the impact of the integration of context into dialogue translation.
no code implementations • LREC 2014 • Rachel Bawden, Marie-Amélie Botalla, Kim Gerdes, Sylvain Kahane
The micro-syntactic annotation process, presented in this paper, includes a semi-automatic preparation of the transcription, the application of a syntactic dependency parser, transcoding of the parsing results to the Rhapsodie annotation scheme, manual correction by multiple annotators followed by a validation process, and finally the application of coherence rules that check common errors.