no code implementations • LREC 2012 • Marta Recasens, M. Antònia Martí, Constantin Orasan
We present an extension of the coreference annotation in the English NP4E and the Catalan AnCora-CA corpora with near-identity relations, which are borderline cases of coreference.
no code implementations • LREC 2012 • Irina Temnikova, Constantin Orasan, Ruslan Mitkov
This article presents a new linguistic resource in the form of Controlled Language (CL) guidelines for manual text simplification in the crisis management (CM) domain, which aims to address high text complexity (TC) and produce clear messages to be used in crisis situations.
no code implementations • WS 2018 • Reshmi Gopalakrishna Pillai, Mike Thelwall, Constantin Orasan
Detecting stress from social media gives a non-intrusive and inexpensive alternative to traditional tools such as questionnaires or physiological sensors for monitoring the mental state of individuals.
no code implementations • RANLP 2019 • Tharindu Ranasinghe, Constantin Orasan, Ruslan Mitkov
Calculating Semantic Textual Similarity (STS) plays a significant role in many applications such as question answering, document summarisation, information retrieval and information extraction.
no code implementations • RANLP 2019 • Alistair Plum, Tharindu Ranasinghe, Constantin Orasan
This paper compares how different machine learning classifiers can be used together with simple string matching and named entity recognition to detect locations in texts.
no code implementations • RANLP 2019 • Tharindu Ranasinghe, Constantin Orasan, Ruslan Mitkov
Calculating the Semantic Textual Similarity (STS) is an important research area in natural language processing which plays a significant role in many applications such as question answering, document summarisation, information retrieval and information extraction.
no code implementations • RANLP 2019 • Richard Evans, Constantin Orasan
The paper begins with our observation of challenges in the intrinsic evaluation of sentence simplification systems, which motivates the use of extrinsic evaluation of these systems with respect to other NLP tasks.
no code implementations • RANLP 2019 • Victoria Yaneva, Constantin Orasan, Le An Ha, Natalia Ponomareva
NLP approaches to automatic text adaptation often rely on user-need guidelines which are generic and do not account for the differences between various types of target groups.
no code implementations • EAMT 2020 • Tharindu Ranasinghe, Constantin Orasan, Ruslan Mitkov
Matching and retrieving previously translated segments from a Translation Memory is the key functionality in Translation Memory systems.
1 code implementation • WMT (EMNLP) 2020 • Tharindu Ranasinghe, Constantin Orasan, Ruslan Mitkov
This paper presents the team TransQuest's participation in the Sentence-Level Direct Assessment shared task in WMT 2020.
no code implementations • 13 Oct 2020 • Tharindu Ranasinghe, Alistair Plum, Constantin Orasan, Ruslan Mitkov
This paper presents the RGCL team submission to SemEval 2020 Task 6: DeftEval, subtasks 1 and 2.
no code implementations • COLING (WANLP) 2020 • Hadeel Saadany, Constantin Orasan
We address this problem by fine-tuning an NMT model with respect to sentiment polarity, showing that this approach can significantly help with correcting sentiment errors detected in the online translation of Arabic UGC.
1 code implementation • COLING 2020 • Tharindu Ranasinghe, Constantin Orasan, Ruslan Mitkov
Recent years have seen big advances in the field of sentence-level quality estimation (QE), largely as a result of using neural-based architectures.
1 code implementation • RDSM (COLING) 2020 • Hadeel Saadany, Emad Mohamed, Constantin Orasan
One very common type of fake news is satire, which comes in the form of a news website or an online platform that parodies reputable real news agencies to create a sarcastic version of reality.
no code implementations • SEMEVAL 2020 • Tharindu Ranasinghe, Alistair Plum, Constantin Orasan, Ruslan Mitkov
This paper presents the RGCL team submission to SemEval 2020 Task 6: DeftEval, subtasks 1 and 2.
1 code implementation • ACL 2021 • Tharindu Ranasinghe, Constantin Orasan, Ruslan Mitkov
Most studies on word-level Quality Estimation (QE) of machine translation focus on language-specific models.
no code implementations • 20 Jun 2021 • Hadeel Saadany, Constantin Orasan, Rocio Caro Quintana, Felix Do Carmo, Leonardo Zilio
In this research, we assess whether automatic translation tools can be a successful real-life utility in transferring emotion in user-generated multilingual data such as tweets.
no code implementations • TRITON 2021 • Hadeel Saadany, Constantin Orasan
The adequacy of the whole process relies on the assumption that the evaluation metrics used give a reliable indication of the quality of the translation.
no code implementations • RANLP 2021 • Hadeel Saadany, Constantin Orasan, Emad Mohamed, Ashraf Tantawy
In translating text where sentiment is the main message, human translators give particular attention to sentiment-carrying words.
no code implementations • 2 May 2022 • Alistair Plum, Tharindu Ranasinghe, Spencer Jones, Constantin Orasan, Ruslan Mitkov
The dataset, which is aimed towards digital humanities (DH) and historical research, is automatically compiled by aligning sentences from Wikipedia articles with matching structured data from sources including Pantheon and Wikidata.
no code implementations • 21 Oct 2022 • Hadeel Saadany, Constantin Orasan, Emad Mohamed, Ashraf Tantawy
In the online world, Machine Translation (MT) systems are extensively used to translate User-Generated Text (UGT) such as reviews, tweets, and social media posts, where the main message is often the author's positive or negative attitude towards the topic of the text.
1 code implementation • 20 Jun 2023 • Shenbin Qian, Constantin Orasan, Felix Do Carmo, Qiuliang Li, Diptesh Kanojia
In this paper, we focus on how current Machine Translation (MT) tools perform on the translation of emotion-loaded texts by evaluating outputs from Google Translate according to a framework proposed in this paper.
no code implementations • 1 Dec 2023 • Archchana Sindhujan, Diptesh Kanojia, Constantin Orasan, Tharindu Ranasinghe
Quality Estimation (QE) systems are important in situations where it is necessary to assess the quality of translations, but there is no reference available.
no code implementations • 6 Feb 2024 • Jaleh Delfani, Constantin Orasan, Hadeel Saadany, Ozlem Temizoz, Eleanor Taylor-Stilgoe, Diptesh Kanojia, Sabine Braun, Barbara Schouten
This study explores the use of Google Translate (GT) for translating mental healthcare (MHealth) information and evaluates its accuracy, comprehensibility, and implications for multilingual healthcare communication through analysing GT output in the MHealth domain from English to Persian, Arabic, Turkish, Romanian, and Spanish.
no code implementations • LREC 2022 • Tomasz Korybski, Elena Davitti, Constantin Orasan, Sabine Braun
In this paper, we present a semi-automated workflow for live interlingual speech-to-text communication which seeks to reduce the shortcomings of existing ASR systems: a human respeaker works with speaker-dependent speech recognition software (e.g., Dragon NaturallySpeaking) to deliver punctuated same-language output of higher quality than that obtained using out-of-the-box automatic speech recognition of the original speech.
no code implementations • WASSA (ACL) 2022 • Shenbin Qian, Constantin Orasan, Diptesh Kanojia, Hadeel Saadany, Félix do Carmo
This paper summarises the submissions our team, SURREY-CTS-NLP, has made for the WASSA 2022 Shared Task for the prediction of empathy, distress and emotion.