no code implementations • LREC 2022 • Tomasz Korybski, Elena Davitti, Constantin Orasan, Sabine Braun
In this paper, we present a semi-automated workflow for live interlingual speech-to-text communication which seeks to reduce the shortcomings of existing ASR systems: a human respeaker works with speaker-dependent speech recognition software (e.g., Dragon NaturallySpeaking) to deliver punctuated same-language output of higher quality than that obtained using out-of-the-box automatic speech recognition of the original speech.
Automatic Speech Recognition (ASR)
no code implementations • WASSA (ACL) 2022 • Shenbin Qian, Constantin Orasan, Diptesh Kanojia, Hadeel Saadany, Félix do Carmo
This paper summarises the submissions our team, SURREY-CTS-NLP, has made for the WASSA 2022 Shared Task for the prediction of empathy, distress and emotion.
no code implementations • 21 Oct 2022 • Hadeel Saadany, Constantin Orasan, Emad Mohamed, Ashraf Tantawy
In the online world, Machine Translation (MT) systems are extensively used to translate User-Generated Text (UGT) such as reviews, tweets, and social media posts, where the main message is often the author's positive or negative attitude towards the topic of the text.
no code implementations • 2 May 2022 • Alistair Plum, Tharindu Ranasinghe, Spencer Jones, Constantin Orasan, Ruslan Mitkov
The dataset, which is aimed towards digital humanities (DH) and historical research, is automatically compiled by aligning sentences from Wikipedia articles with matching structured data from sources including Pantheon and Wikidata.
no code implementations • RANLP 2021 • Hadeel Saadany, Constantin Orasan, Emad Mohamed, Ashraf Tantawy
In translating text where sentiment is the main message, human translators give particular attention to sentiment-carrying words.
no code implementations • TRITON 2021 • Hadeel Saadany, Constantin Orasan
The adequacy of the whole process relies on the assumption that the evaluation metrics used give a reliable indication of the quality of the translation.
no code implementations • 20 Jun 2021 • Hadeel Saadany, Constantin Orasan, Rocio Caro Quintana, Félix do Carmo, Leonardo Zilio
In this research, we assess whether automatic translation tools can be a successful real-life utility in transferring emotion in user-generated multilingual data such as tweets.
1 code implementation • ACL 2021 • Tharindu Ranasinghe, Constantin Orasan, Ruslan Mitkov
Most studies on word-level Quality Estimation (QE) of machine translation focus on language-specific models.
no code implementations • SEMEVAL 2020 • Tharindu Ranasinghe, Alistair Plum, Constantin Orasan, Ruslan Mitkov
This paper presents the RGCL team submission to SemEval 2020 Task 6: DeftEval, subtasks 1 and 2.
1 code implementation • COLING 2020 • Tharindu Ranasinghe, Constantin Orasan, Ruslan Mitkov
Recent years have seen big advances in the field of sentence-level quality estimation (QE), largely as a result of using neural-based architectures.
1 code implementation • RDSM (COLING) 2020 • Hadeel Saadany, Emad Mohamed, Constantin Orasan
One very common type of fake news is satire, which comes in the form of a news website or an online platform that parodies reputable real news agencies to create a sarcastic version of reality.
no code implementations • COLING (WANLP) 2020 • Hadeel Saadany, Constantin Orasan
We address this problem by fine-tuning an NMT model with respect to sentiment polarity, showing that this approach can significantly help with correcting sentiment errors detected in the online translation of Arabic UGC.
no code implementations • 13 Oct 2020 • Tharindu Ranasinghe, Alistair Plum, Constantin Orasan, Ruslan Mitkov
This paper presents the RGCL team submission to SemEval 2020 Task 6: DeftEval, subtasks 1 and 2.
1 code implementation • WMT (EMNLP) 2020 • Tharindu Ranasinghe, Constantin Orasan, Ruslan Mitkov
This paper presents the team TransQuest's participation in the Sentence-Level Direct Assessment shared task in WMT 2020.
no code implementations • EAMT 2020 • Tharindu Ranasinghe, Constantin Orasan, Ruslan Mitkov
Matching and retrieving previously translated segments from a Translation Memory is the key functionality in Translation Memory systems.
no code implementations • RANLP 2019 • Victoria Yaneva, Constantin Orasan, Le An Ha, Natalia Ponomareva
NLP approaches to automatic text adaptation often rely on user-need guidelines which are generic and do not account for the differences between various types of target groups.
no code implementations • RANLP 2019 • Tharindu Ranasinghe, Constantin Orasan, Ruslan Mitkov
Calculating the Semantic Textual Similarity (STS) is an important research area in natural language processing which plays a significant role in many applications such as question answering, document summarisation, information retrieval and information extraction.
no code implementations • RANLP 2019 • Richard Evans, Constantin Orasan
The paper begins with our observation of challenges in the intrinsic evaluation of sentence simplification systems, which motivates the use of extrinsic evaluation of these systems with respect to other NLP tasks.
no code implementations • RANLP 2019 • Tharindu Ranasinghe, Constantin Orasan, Ruslan Mitkov
Calculating Semantic Textual Similarity (STS) plays a significant role in many applications such as question answering, document summarisation, information retrieval and information extraction.
Contextualised Word Representations
Information Retrieval
no code implementations • RANLP 2019 • Alistair Plum, Tharindu Ranasinghe, Constantin Orasan
This paper compares how different machine learning classifiers can be used together with simple string matching and named entity recognition to detect locations in texts.
no code implementations • WS 2018 • Reshmi Gopalakrishna Pillai, Mike Thelwall, Constantin Orasan
Detecting stress from social media gives a non-intrusive and inexpensive alternative to traditional tools such as questionnaires or physiological sensors for monitoring the mental state of individuals.
no code implementations • LREC 2012 • Marta Recasens, M. Antònia Martí, Constantin Orasan
We present an extension of the coreference annotation in the English NP4E and the Catalan AnCora-CA corpora with near-identity relations, which are borderline cases of coreference.
no code implementations • LREC 2012 • Irina Temnikova, Constantin Orasan, Ruslan Mitkov
This article presents a new linguistic resource in the form of Controlled Language (CL) guidelines for manual text simplification in the crisis management (CM) domain, which aims to address high text complexity (TC) in that domain and produce clear messages to be used in crisis situations.