no code implementations • LREC 2022 • Adem Ajvazi, Christian Hardmeier
Social media are a central part of people’s lives.
1 code implementation • NoDaLiDa 2021 • Chaojun Wang, Christian Hardmeier, Rico Sennrich
They also highlight blind spots in automatic methods for targeted evaluation and demonstrate the need for human assessment to evaluate document-level translation quality reliably.
1 code implementation • LREC 2022 • Ekaterina Lapshinova-Koltunski, Pedro Augusto Ferreira, Elina Lartaud, Christian Hardmeier
Similar to the previous version, this corpus has been created to address translation of coreference across languages, a phenomenon still challenging for machine translation (MT) and other multilingual natural language processing (NLP) applications.
no code implementations • WMT (EMNLP) 2020 • Nikita Moghe, Christian Hardmeier, Rachel Bawden
Our baseline systems are transformer-big models that are pre-trained on the WMT’19 News Translation task and fine-tuned on pseudo-in-domain web crawled data and in-domain task data.
no code implementations • AACL (iwdp) 2020 • Christian Hardmeier
The realisation of those cohesive structures is subject to different constraints and varying preferences in different languages.
no code implementations • ACL (unimplicit) 2021 • Ahmed Ruby, Christian Hardmeier, Sara Stymne
Exploring aspects of sentential meaning that are implicit or underspecified in context is important for sentence understanding.
no code implementations • COLING (CRAC) 2020 • Ekaterina Lapshinova-Koltunski, Marie-Pauline Krielke, Christian Hardmeier
We present a study focusing on variation of coreferential devices in English original TED talks and news texts and their German translations.
no code implementations • 17 Feb 2025 • Marco Antonio Stranisci, Christian Hardmeier
Data filtering strategies are a crucial component in developing safe Large Language Models (LLMs), since they support the removal of harmful content from pretraining datasets.
no code implementations • 27 Jan 2025 • Jorge del Pozo Lérida, Kamil Kojs, János Máté, Mikołaj Antoni Barański, Christian Hardmeier
Large Language Models (LLMs) have become state-of-the-art in Machine Translation (MT). They are often trained on massive bilingual parallel corpora scraped from the web that contain low-quality entries and redundant information, leading to significant computational challenges.
no code implementations • 19 Dec 2024 • Gongbo Tang, Christian Hardmeier
Our proposed model outperforms the baseline Transformer model in terms of APT and BLEU scores. This confirms our hypothesis that we can improve pronoun translation by paying additional attention to source mentions, and shows that the additional modules we introduce do not have a negative effect on general translation quality.
no code implementations • 31 Aug 2024 • Daniel Varab, Christian Hardmeier
Recent work has suggested that end-to-end system designs for cross-lingual summarization are competitive solutions that perform on par with or even better than traditional pipelined designs.
no code implementations • 13 Feb 2024 • Paul Engelmann, Peter Brunsgaard Trolle, Christian Hardmeier
Dehumanization is a mental process that enables the exclusion and ill treatment of a group of people.
no code implementations • 28 May 2023 • Gongbo Tang, Christian Hardmeier
Coreference resolution is the task of finding expressions that refer to the same entity in a text.
1 code implementation • 20 Oct 2022 • Dennis Ulmer, Jes Frellsen, Christian Hardmeier
We investigate the problem of determining the predictive confidence (or, conversely, uncertainty) of a neural classifier through the lens of low-resource languages.
1 code implementation • 14 Apr 2022 • Dennis Ulmer, Christian Hardmeier, Jes Frellsen
A lot of Machine Learning (ML) and Deep Learning (DL) research is of an empirical nature.
1 code implementation • 13 Apr 2022 • Dennis Ulmer, Elisa Bassignana, Max Müller-Eberstein, Daniel Varab, Mike Zhang, Rob van der Goot, Christian Hardmeier, Barbara Plank
The field of Deep Learning (DL) has undergone explosive growth during the last decade, with a substantial impact on Natural Language Processing (NLP) as well.
no code implementations • 1 Nov 2021 • Sharid Loáiciga, Luca Bevacqua, Christian Hardmeier
We present an unsupervised method to detect English unergative and unaccusative verbs.
no code implementations • 6 Oct 2021 • Dennis Ulmer, Christian Hardmeier, Jes Frellsen
Popular approaches for quantifying predictive uncertainty in deep neural networks often involve distributions over weights or multiple models, for instance via Markov Chain sampling, ensembling, or Monte Carlo dropout.
no code implementations • 7 Apr 2021 • Christian Hardmeier, Marta R. Costa-jussà, Kellie Webster, Will Radford, Su Lin Blodgett
At the Workshop on Gender Bias in NLP (GeBNLP), we'd like to encourage authors to give explicit consideration to the wider aspects of bias and its social implications.
no code implementations • 9 Jul 2020 • Ali Basirat, Christian Hardmeier, Joakim Nivre
The effect of these generalizations on the word vectors is intrinsically studied with regard to the spread and the discriminability of the word vectors.
no code implementations • LREC 2020 • Sharid Loáiciga, Christian Hardmeier, Asad Sayeed
Non-nominal coreference is much less studied than nominal coreference, partly because of the lack of annotated corpora.
no code implementations • WS 2016 • Liane Guillou, Christian Hardmeier, Preslav Nakov, Sara Stymne, Jörg Tiedemann, Yannick Versley, Mauro Cettolo, Bonnie Webber, Andrei Popescu-Belis
We describe the design, the evaluation setup, and the results of the 2016 WMT shared task on cross-lingual pronoun prediction.
no code implementations • EMNLP 2018 • Eva Vanmassenhove, Christian Hardmeier, Andy Way
Our contribution is two-fold: (1) the compilation of large datasets with speaker information for 20 language pairs, and (2) a simple set of experiments that incorporate gender information into NMT for multiple language pairs.
no code implementations • WS 2019 • Kellie Webster, Marta R. Costa-jussà, Christian Hardmeier, Will Radford
The 1st ACL workshop on Gender Bias in Natural Language Processing included a shared task on gendered ambiguous pronoun (GAP) resolution.
no code implementations • WS 2019 • Jenny Kunz, Christian Hardmeier
We explore different approaches to explicit entity modelling in language models (LM).
no code implementations • WS 2019 • Ekaterina Lapshinova-Koltunski, Sharid Loáiciga, Christian Hardmeier, Pauline Krielke
In the present paper, we deal with incongruences in English-German multilingual coreference annotation and present automated methods to discover them.
no code implementations • WS 2018 • Margita Šoštarić, Christian Hardmeier, Sara Stymne
We present an analysis of a number of coreference phenomena in English-Croatian human and machine translations.
no code implementations • WS 2018 • Liane Guillou, Christian Hardmeier, Ekaterina Lapshinova-Koltunski, Sharid Loáiciga
We evaluate the output of 16 English-to-German MT systems with respect to the translation of pronouns in the context of the WMT 2018 competition.
1 code implementation • 30 Aug 2018 • Christian Hardmeier, Liane Guillou
Pronouns are a long-standing challenge in machine translation.
no code implementations • EMNLP 2018 • Liane Guillou, Christian Hardmeier
We compare the performance of the APT and AutoPRF metrics for pronoun translation against a manually annotated dataset comprising human judgements as to the correctness of translations of the PROTEST test suite.
1 code implementation • TACL 2018 • Yan Shao, Christian Hardmeier, Joakim Nivre
Word segmentation is a low-level NLP task that is non-trivial for a considerable number of languages.
no code implementations • WS 2018 • Christian Hardmeier, Luca Bevacqua, Sharid Loáiciga, Hannah Rohde
Proper names of organisations are a special case of collective nouns.
no code implementations • WS 2018 • Sharid Loáiciga, Luca Bevacqua, Hannah Rohde, Christian Hardmeier
Anaphora resolution systems require both an enumeration of possible candidate antecedents and an identification process of the antecedent.
no code implementations • IJCNLP 2017 • Yan Shao, Christian Hardmeier, Joakim Nivre
We extensively analyse the correlations and drawbacks of conventionally employed evaluation metrics for word segmentation.
no code implementations • WS 2017 • Ekaterina Lapshinova-Koltunski, Christian Hardmeier
In this paper, we analyse alignment discrepancies for discourse structures in English-German parallel data: sentence pairs in which discourse structures in target or source texts have no alignment in the corresponding parallel sentences.
no code implementations • WS 2017 • Christian Hardmeier
This paper describes the UU-Hardmeier system submitted to the DiscoMT 2017 shared task on cross-lingual pronoun prediction.
no code implementations • WS 2017 • Sharid Loáiciga, Sara Stymne, Preslav Nakov, Christian Hardmeier, Jörg Tiedemann, Mauro Cettolo, Yannick Versley
We describe the design, the setup, and the evaluation results of the DiscoMT 2017 shared task on cross-lingual pronoun prediction.
no code implementations • EMNLP 2017 • Sharid Loáiciga, Liane Guillou, Christian Hardmeier
In this paper, we address the problem of predicting one of three functions for the English pronoun ‘it’: anaphoric, event reference or pleonastic.
1 code implementation • IJCNLP 2017 • Yan Shao, Christian Hardmeier, Jörg Tiedemann, Joakim Nivre
We present a character-based model for joint segmentation and POS tagging for Chinese.
no code implementations • COLING 2016 • Christian Hardmeier
Historical texts are challenging for natural language processing because they differ linguistically from modern texts and because of their lack of orthographical and grammatical standardisation.
no code implementations • LREC 2016 • Liane Guillou, Christian Hardmeier
We present PROTEST, a test suite for the evaluation of pronoun translation by MT systems.
no code implementations • TACL 2015 • Daniel Beck, Trevor Cohn, Christian Hardmeier, Lucia Specia
Structural kernels are a flexible learning paradigm that has been widely used in Natural Language Processing.
no code implementations • LREC 2014 • Liane Guillou, Christian Hardmeier, Aaron Smith, Jörg Tiedemann, Bonnie Webber
We present ParCor, a parallel corpus of texts in which pronoun coreference (reduced coreference in which pronouns are used as referring expressions) has been annotated.