1 code implementation • 29 Jan 2024 • Nikita Moghe, Arnisa Fazla, Chantal Amrhein, Tom Kocmi, Mark Steedman, Alexandra Birch, Rico Sennrich, Liane Guillou
We benchmark metric performance, assess their incremental performance over successive campaigns, and measure their sensitivity to a range of linguistic phenomena.
no code implementations • 2 Nov 2023 • Chantal Amrhein, Nikita Moghe, Liane Guillou
We benchmark the performance of segment-level metrics submitted to WMT 2023 using the ACES Challenge Set (Amrhein et al., 2022).
no code implementations • 20 Dec 2022 • Nikita Moghe, Evgeniia Razumovskaia, Liane Guillou, Ivan Vulić, Anna Korhonen, Alexandra Birch
We use MULTI3NLU++ to benchmark state-of-the-art multilingual models for the NLU tasks of intent detection and slot labelling for TOD systems in the multilingual setting.
1 code implementation • 27 Oct 2022 • Chantal Amrhein, Nikita Moghe, Liane Guillou
As machine translation (MT) metrics improve their correlation with human judgement every year, it is crucial to understand the limitations of such metrics at the segment level.
Ranked #1 on Machine Translation on ACES
no code implementations • 6 Jun 2022 • Nick Ferguson, Liane Guillou, Kwabena Nuamah, Alan Bundy
Our two main conclusions are that cleaning of LC-QuAD 2.0 is required as the errors present can affect evaluation; and that, due to limitations of FRANK's parser, paraphrase generation is not a method which we can rely on to improve the variety of natural language questions that FRANK can answer.
1 code implementation • Findings (ACL) 2022 • Tianyi Li, Sabine Weber, Mohammad Javad Hosseini, Liane Guillou, Mark Steedman
Predicate entailment detection is a crucial task for question-answering from text, where previous work has explored unsupervised learning of entailment graphs from typed open relation triples.
1 code implementation • EMNLP (insights) 2021 • Liane Guillou, Sander Bijl de Vroe, Mark Johnson, Mark Steedman
Understanding linguistic modality is widely seen as important for downstream tasks such as Question Answering and Knowledge Graph Population.
1 code implementation • COLING (TextGraphs) 2020 • Liane Guillou, Sander Bijl de Vroe, Mohammad Javad Hosseini, Mark Johnson, Mark Steedman
We present a novel method for injecting temporality into entailment graphs to address the problem of spurious entailments, which may arise from similar but temporally distinct events involving the same pair of entities.
1 code implementation • ACL (CASE) 2021 • Sander Bijl de Vroe, Liane Guillou, Miloš Stanojević, Nick McKenna, Mark Steedman
Language provides speakers with a rich system of modality for expressing thoughts about events, without being committed to their actual occurrence.
no code implementations • EMNLP 2021 • Nick McKenna, Liane Guillou, Mohammad Javad Hosseini, Sander Bijl de Vroe, Mark Johnson, Mark Steedman
Drawing inferences between open-domain natural language predicates is a necessity for true language understanding.
no code implementations • WS 2016 • Liane Guillou, Christian Hardmeier, Preslav Nakov, Sara Stymne, Jörg Tiedemann, Yannick Versley, Mauro Cettolo, Bonnie Webber, Andrei Popescu-Belis
We describe the design, the evaluation setup, and the results of the 2016 WMT shared task on cross-lingual pronoun prediction.
no code implementations • WS 2018 • Liane Guillou, Christian Hardmeier, Ekaterina Lapshinova-Koltunski, Sharid Loáiciga
We evaluate the output of 16 English-to-German MT systems with respect to the translation of pronouns in the context of the WMT 2018 competition.
1 code implementation • 30 Aug 2018 • Christian Hardmeier, Liane Guillou
Pronouns are a long-standing challenge in machine translation.
no code implementations • EMNLP 2018 • Liane Guillou, Christian Hardmeier
We compare the performance of the APT and AutoPRF metrics for pronoun translation against a manually annotated dataset comprising human judgements as to the correctness of translations of the PROTEST test suite.
no code implementations • EMNLP 2017 • Sharid Loáiciga, Liane Guillou, Christian Hardmeier
In this paper, we address the problem of predicting one of three functions for the English pronoun 'it': anaphoric, event reference or pleonastic.
no code implementations • WS 2016 • Ondřej Bojar, Christian Buck, Rajen Chatterjee, Christian Federmann, Liane Guillou, Barry Haddow, Matthias Huck, Antonio Jimeno Yepes, Aurélie Névéol, Mariana Neves, Pavel Pecina, Martin Popel, Philipp Koehn, Christof Monz, Matteo Negri, Matt Post, Lucia Specia, Karin Verspoor, Jörg Tiedemann, Marco Turchi
no code implementations • LREC 2016 • Liane Guillou, Christian Hardmeier
We present PROTEST, a test suite for the evaluation of pronoun translation by MT systems.
no code implementations • LREC 2014 • Liane Guillou, Christian Hardmeier, Aaron Smith, Jörg Tiedemann, Bonnie Webber
We present ParCor, a parallel corpus of texts in which pronoun coreference ― reduced coreference in which pronouns are used as referring expressions ― has been annotated.