no code implementations • EMNLP 2020 • Anna Currey, Prashant Mathur, Georgiana Dinu
Neural machine translation achieves impressive results in high-resource conditions, but performance often suffers when the input domain is low-resource.
1 code implementation • EMNLP 2021 • Prafulla Kumar Choubey, Anna Currey, Prashant Mathur, Georgiana Dinu
Targeted evaluations have found that machine translation systems often output incorrect gender in translations, even when the gender is clear from context.
no code implementations • WMT (EMNLP) 2021 • Md Mahfuz ibn Alam, Ivana Kvapilíková, Antonios Anastasopoulos, Laurent Besacier, Georgiana Dinu, Marcello Federico, Matthias Gallé, Kweonwoo Jung, Philipp Koehn, Vassilina Nikoulina
Language domains that require very careful use of terminology are abundant and reflect a significant part of the translation industry.
1 code implementation • WMT (EMNLP) 2020 • Greg Hanneman, Georgiana Dinu
The ability of machine translation (MT) models to correctly place markup is crucial to generating high-quality translations of formatted input.
no code implementations • IWSLT (ACL) 2022 • Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vĕra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nǎdejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe
The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation.
no code implementations • 26 May 2023 • Gabriele Sarti, Phu Mon Htut, Xing Niu, Benjamin Hsu, Anna Currey, Georgiana Dinu, Maria Nadejde
Attribute-controlled translation (ACT) is a subtask of machine translation that involves controlling stylistic or linguistic attributes (like formality and gender) of translation outputs.
no code implementations • 19 May 2023 • Benjamin Hsu, Anna Currey, Xing Niu, Maria Nădejde, Georgiana Dinu
While the effect of pseudo-label training (PLT) on quality is well-documented, we highlight a lesser-known effect: PLT can enhance a model's stability to model updates and input perturbations, a set of properties we call model inertia.
1 code implementation • 2 Nov 2022 • Anna Currey, Maria Nădejde, Raghavendra Pappagari, Mia Mayer, Stanislas Lauly, Xing Niu, Benjamin Hsu, Georgiana Dinu
As generic machine translation (MT) quality has improved, the need for targeted benchmarks that explore fine-grained aspects of quality has increased.
no code implementations • 19 Oct 2022 • Suvodeep Majumder, Stanislas Lauly, Maria Nadejde, Marcello Federico, Georgiana Dinu
This paper addresses the task of contextual translation using multi-segment models.
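Multi-segment models typically translate a segment jointly with its preceding source context, concatenated with a separator so the encoder can see document-level information. A minimal sketch of this input construction (the `<brk>` separator token is an assumption, not necessarily the paper's exact notation):

```python
def build_multisegment_input(context_segments, current_segment, sep="<brk>"):
    """Concatenate previous source segments with the current one,
    separated by a break token, so the encoder sees document context."""
    return f" {sep} ".join(context_segments + [current_segment])

print(build_multisegment_input(["She bought a bat."], "It was cheap."))
# She bought a bat. <brk> It was cheap.
```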
2 code implementations • Findings (NAACL) 2022 • Maria Nădejde, Anna Currey, Benjamin Hsu, Xing Niu, Marcello Federico, Georgiana Dinu
However, in many cases, multiple different translations are valid and the appropriate translation may depend on the intended target audience, characteristics of the speaker, or even the relationship between speakers.
1 code implementation • 24 Sep 2021 • Xing Niu, Georgiana Dinu, Prashant Mathur, Anna Currey
The training data used in NMT is rarely controlled with respect to specific attributes, such as word casing or gender, which can cause errors in translations.
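A common remedy, sketched here generically rather than as this paper's exact method, is to annotate training examples with control tokens encoding known attribute values, so the model can condition on them at inference time. The tag format below is hypothetical:

```python
def tag_source(source, attributes):
    """Prepend control tokens encoding known attribute values
    (e.g. casing, speaker gender) to the source sentence."""
    tags = [f"<{k}={v}>" for k, v in sorted(attributes.items())]
    return " ".join(tags + [source])

print(tag_source("hello world", {"case": "lower", "gender": "f"}))
# <case=lower> <gender=f> hello world
```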
no code implementations • 15 Apr 2021 • Prafulla Kumar Choubey, Anna Currey, Prashant Mathur, Georgiana Dinu
Targeted evaluations have found that machine translation systems often output incorrect gender, even when the gender is clear from context.
no code implementations • ACL 2020 • Xing Niu, Prashant Mathur, Georgiana Dinu, Yaser Al-Onaizan
Neural Machine Translation (NMT) models are sensitive to small perturbations in the input.
no code implementations • WS 2020 • Georgiana Dinu, Prashant Mathur, Marcello Federico, Stanislas Lauly, Yaser Al-Onaizan
A variety of natural language tasks require processing of textual data which contains a mix of natural language and formal languages such as mathematical expressions.
1 code implementation • ACL 2019 • Georgiana Dinu, Prashant Mathur, Marcello Federico, Yaser Al-Onaizan
This paper proposes a novel method to inject custom terminology into neural machine translation at run time.
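The idea of run-time terminology injection can be sketched as inline annotation: a matched source term is followed by its required target translation, and the model is trained to produce the annotated term. The delimiter tokens and helper name below are illustrative assumptions, not the paper's exact scheme:

```python
def annotate_terminology(source_tokens, term_dict):
    """Insert the required target translation after each matched
    source term, marked with illustrative delimiter tokens."""
    out = []
    for tok in source_tokens:
        out.append(tok)
        if tok in term_dict:
            # inline annotation: source term followed by its forced translation
            out.extend(["<trans>", term_dict[tok], "</trans>"])
    return out

src = "the patient shows signs of hypotension".split()
terms = {"hypotension": "Hypotonie"}  # toy EN->DE terminology entry
print(" ".join(annotate_terminology(src, terms)))
# the patient shows signs of hypotension <trans> Hypotonie </trans>
```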
no code implementations • ACL 2017 • Jian Ni, Georgiana Dinu, Radu Florian
However, annotating NER data by humans is expensive and time-consuming, and can be quite difficult for a new language.
no code implementations • 13 Mar 2017 • Georgiana Dinu, Wael Hamza, Radu Florian
This paper describes an application of reinforcement learning to the mention detection task.
no code implementations • 24 Feb 2016 • Thien Huu Nguyen, Avirup Sil, Georgiana Dinu, Radu Florian
One of the key challenges in natural language processing (NLP) is to yield good performance across application domains and languages.
no code implementations • TACL 2015 • Angeliki Lazaridou, Georgiana Dinu, Adam Liska, Marco Baroni
By building on the recent "zero-shot learning" approach and paying attention to the linguistic nature of attributes as noun modifiers (specifically adjectives), we show that it is possible to tag images with attribute-denoting adjectives even when no training data containing the relevant annotations are available.
4 code implementations • 20 Dec 2014 • Georgiana Dinu, Angeliki Lazaridou, Marco Baroni
The zero-shot paradigm exploits vector-based word representations, extracted from text corpora with unsupervised methods, to learn general mapping functions from other feature spaces onto word space. The words associated with the nearest neighbours of the mapped vectors are then used as their linguistic labels.
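The mapping step can be sketched as learning a least-squares linear map from a source feature space into word-embedding space, then labelling a mapped vector with its nearest word by cosine similarity. This toy NumPy example uses random data and is only an illustration of the pipeline, not the paper's experimental setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# toy word space: 4 labelled word vectors (rows), dimension 3
words = ["cat", "dog", "car", "tree"]
W = rng.normal(size=(4, 3))

# toy source feature space (dimension 5) for the same 4 concepts
X = rng.normal(size=(4, 5))

# learn a linear mapping M (feature space -> word space) by least squares
M, *_ = np.linalg.lstsq(X, W, rcond=None)

def label(x, k=1):
    """Map a feature vector into word space and return the k nearest
    word labels by cosine similarity."""
    y = x @ M
    sims = (W @ y) / (np.linalg.norm(W, axis=1) * np.linalg.norm(y))
    return [words[i] for i in np.argsort(-sims)[:k]]

# on this exactly-fit toy data, each item retrieves its own label
print(label(X[0]))
```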