no code implementations • EAMT 2020 • Frederic Blain, Nikolaos Aletras, Lucia Specia
However, QE models are often trained on noisy approximations of quality annotations derived from the proportion of post-edited words in translated sentences instead of direct human annotations of translation errors.
no code implementations • 16 Dec 2024 • Anthony Hughes, Nikolaos Aletras, Ning Ma
Language models (LMs) have shown outstanding performance in text summarization including sensitive domains such as medicine and law.
1 code implementation • 16 Dec 2024 • Atsuki Yamaguchi, Terufumi Morishita, Aline Villavicencio, Nikolaos Aletras
In this paper, we investigate the impact of using unlabeled target language data for VE on chat models for the first time.
no code implementations • 22 Oct 2024 • Miles Williams, George Chrysostomou, Nikolaos Aletras
In a post-training setting, state-of-the-art quantization and pruning methods require calibration data, a small set of unlabeled examples.
no code implementations • 4 Oct 2024 • Yida Mu, Mali Jin, Xingyi Song, Nikolaos Aletras
Research in natural language processing (NLP) for Computational Social Science (CSS) heavily relies on data from social media platforms.
2 code implementations • 17 Jun 2024 • Atsuki Yamaguchi, Aline Villavicencio, Nikolaos Aletras
Large language models (LLMs) have shown remarkable capabilities in many languages beyond English.
no code implementations • 25 Mar 2024 • Mali Jin, Daniel Preoţiuc-Pietro, A. Seza Doğruöz, Nikolaos Aletras
Bragging is the act of uttering statements that are likely to be positively viewed by others and it is extensively employed in human communication with the aim to build a positive self-image of oneself.
1 code implementation • 19 Mar 2024 • Zhixue Zhao, Nikolaos Aletras
Previous studies have explored how different factors affect faithfulness, mainly in the context of monolingual English models.
1 code implementation • 16 Feb 2024 • Atsuki Yamaguchi, Aline Villavicencio, Nikolaos Aletras
We also show that adapting LLMs that have been pre-trained on more balanced multilingual data results in downstream performance comparable to the original models.
no code implementations • 8 Jan 2024 • Peter Vickers, Loïc Barrault, Emilio Monti, Nikolaos Aletras
In Natural Language Processing (NLP) classification tasks such as topic categorisation and sentiment analysis, model generalizability is generally measured with standard metrics such as Accuracy, F-Measure, or AUC-ROC.
no code implementations • 16 Nov 2023 • Miles Williams, Nikolaos Aletras
Quantization and pruning form the foundation of compression for neural networks, enabling efficient inference for large language models (LLMs).
1 code implementation • 15 Nov 2023 • George Chrysostomou, Zhixue Zhao, Miles Williams, Nikolaos Aletras
Despite the remarkable performance of generative large language models (LLMs) on abstractive summarization, they face two significant challenges: their considerable size and tendency to hallucinate.
no code implementations • 26 Oct 2023 • Ahmed Alajrami, Katerina Margatina, Nikolaos Aletras
Understanding how and what pre-trained language models (PLMs) learn about language is an open challenge in natural language processing.
no code implementations • 11 Oct 2023 • Huiyin Xue, Nikolaos Aletras
Scaling pre-trained language models has resulted in large performance gains in various natural language processing tasks but comes with a large cost in memory requirements.
no code implementations • 9 Oct 2023 • Catalina Goanta, Nikolaos Aletras, Ilias Chalkidis, Sofia Ranchordas, Gerasimos Spanakis
Regulation studies are a rich source of knowledge on how to systematically deal with risk and uncertainty, as well as with scientific evidence, to evaluate and compare regulatory options.
no code implementations • 20 Sep 2023 • Yida Mu, Xingyi Song, Kalina Bontcheva, Nikolaos Aletras
A crucial aspect of a rumor detection model is its ability to generalize, particularly its ability to detect emerging, previously unknown rumors.
1 code implementation • 15 Sep 2023 • Miles Williams, Nikolaos Aletras
The extensive memory footprint of pre-trained language models (PLMs) can hinder deployment in memory-constrained settings, such as cloud environments or on-device.
1 code implementation • 14 Sep 2023 • Danae Sánchez Villegas, Daniel Preoţiuc-Pietro, Nikolaos Aletras
However, prior work on multimodal classification of social media posts has not yet addressed these challenges.
1 code implementation • 6 Sep 2023 • Danae Sánchez Villegas, Catalina Goanta, Nikolaos Aletras
Influencer marketing involves a wide range of strategies in which brands collaborate with popular content creators (i.e., influencers) to leverage their reach, trust, and impact on their audience to promote and endorse products or services.
1 code implementation • 26 May 2023 • Yue Feng, Yunlong Jiao, Animesh Prasad, Nikolaos Aletras, Emine Yilmaz, Gabriella Kazai
Further, it employs a fulfillment representation layer for learning how many task attributes have been fulfilled in the dialogue, and an importance predictor component for calculating the importance of task attributes.
no code implementations • 23 May 2023 • Katerina Margatina, Timo Schick, Nikolaos Aletras, Jane Dwivedi-Yu
The remarkable advancements in large language models (LLMs) have significantly enhanced the performance in few-shot learning settings.
no code implementations • 23 May 2023 • Yida Mu, Ben P. Wu, William Thorne, Ambrose Robinson, Nikolaos Aletras, Carolina Scarton, Kalina Bontcheva, Xingyi Song
Instruction-tuned Large Language Models (LLMs) have exhibited impressive language understanding and the capacity to generate responses that follow specific prompts.
2 code implementations • 22 May 2023 • Zhengxiang Shi, Francesco Tonolini, Nikolaos Aletras, Emine Yilmaz, Gabriella Kazai, Yunlong Jiao
Semi-supervised learning (SSL) is a popular setting aiming to effectively utilize unlabelled data to improve model performance in downstream natural language processing (NLP) tasks.
no code implementations • 21 May 2023 • Katerina Margatina, Nikolaos Aletras
Active learning (AL) is a human-and-model-in-the-loop paradigm that iteratively selects informative unlabeled data for human annotation, aiming to improve over random sampling.
no code implementations • 18 May 2023 • Samuel Mensah, Kai Sun, Nikolaos Aletras
State-of-the-art target-oriented opinion word extraction (TOWE) models typically use BERT-based text encoders that operate on the word level, along with graph convolutional networks (GCNs) that incorporate syntactic information extracted from syntax trees.
1 code implementation • 17 May 2023 • Zhixue Zhao, Nikolaos Aletras
Widely used faithfulness metrics, such as sufficiency and comprehensiveness, use a hard erasure criterion, i.e. entirely removing or retaining the top most important tokens ranked by a given FA and observing the changes in predictive likelihood.
no code implementations • 28 Feb 2023 • Kai Sun, Richong Zhang, Samuel Mensah, Nikolaos Aletras, Yongyi Mao, Xudong Liu
Inspired by the theoretical foundations in domain adaptation [2], we propose a new SSL approach that opts for selecting target samples whose model output from a domain-specific teacher and student network disagree on the unlabelled target data, in an effort to boost the target domain performance.
1 code implementation • 6 Feb 2023 • Yida Mu, Kalina Bontcheva, Nikolaos Aletras
New events emerge over time influencing the topics of rumors in social media.
1 code implementation • 17 Oct 2022 • Zhixue Zhao, George Chrysostomou, Kalina Bontcheva, Nikolaos Aletras
Explanation faithfulness of model predictions in natural language processing is typically evaluated on held-out data from the same temporal distribution as the training data (i.e. synchronous settings).
no code implementations • 14 Oct 2022 • Huiyin Xue, Nikolaos Aletras
These embeddings are subsequently fed into transformer layers for text classification.
no code implementations • 12 Oct 2022 • Wenzhe Li, Nikolaos Aletras
Graph-based text representation focuses on how text documents are represented as graphs for exploiting dependency information between tokens and documents within a corpus.
no code implementations • COLING 2022 • Tulika Bose, Nikolaos Aletras, Irina Illina, Dominique Fohr
State-of-the-art approaches for hate-speech detection usually exhibit poor performance in out-of-domain settings.
1 code implementation • NAACL 2022 • Xiao Ao, Danae Sánchez Villegas, Daniel Preoţiuc-Pietro, Nikolaos Aletras
Parody is a figurative device used for mimicking entities for comedic or critical purposes.
1 code implementation • 21 Apr 2022 • Yida Mu, Pu Niu, Nikolaos Aletras
The phenomenon of misinformation spreading in social media has developed a new form of active citizens who focus on tackling the problem by refuting posts that might contain misinformation.
1 code implementation • 16 Apr 2022 • Mingchen Li, Junfan Chen, Samuel Mensah, Nikolaos Aletras, Xiulong Yang, Yang Ye
Thus, in this paper, we propose a Hierarchical N-Gram framework for Zero-Shot Link Prediction (HNZSLP), which considers the dependencies among character n-grams of the relation surface name for ZSLP.
1 code implementation • Findings (ACL) 2022 • Tulika Bose, Nikolaos Aletras, Irina Illina, Dominique Fohr
In this paper, we propose to automatically identify and reduce spurious correlations using attribution methods with dynamic refinement of the list of terms that need to be regularized during training.
1 code implementation • ACL 2022 • Ahmed Alajrami, Nikolaos Aletras
Several pre-training objectives, such as masked language modeling (MLM), have been proposed to pre-train language models (e.g. BERT) with the aim of learning better language representations.
no code implementations • ACL 2022 • Mali Jin, Daniel Preoţiuc-Pietro, A. Seza Doğruöz, Nikolaos Aletras
Bragging is a speech act employed with the goal of constructing a favorable self-image through positive statements about oneself.
1 code implementation • ACL 2022 • George Chrysostomou, Nikolaos Aletras
Recent work in Natural Language Processing has focused on developing approaches that extract faithful explanations, either via identifying the most important tokens in the input (i.e. post-hoc explanations) or by designing inherently faithful models that first select the most important tokens and then use them to predict the correct label (i.e. select-then-predict models).
1 code implementation • ACL 2022 • Ilias Chalkidis, Abhik Jana, Dirk Hartung, Michael Bommarito, Ion Androutsopoulos, Daniel Martin Katz, Nikolaos Aletras
Laws and their interpretations, legal arguments and agreements are typically expressed in writing, leading to the production of vast corpora of legal text.
Ranked #1 on Natural Language Understanding on LexGLUE
1 code implementation • EMNLP 2021 • Katerina Margatina, Giorgos Vernikos, Loïc Barrault, Nikolaos Aletras
Common acquisition functions for active learning use either uncertainty or diversity sampling, aiming to select difficult and diverse data points from the pool of unlabeled data, respectively.
1 code implementation • EMNLP 2021 • Atsuki Yamaguchi, George Chrysostomou, Katerina Margatina, Nikolaos Aletras
Masked language modeling (MLM), a self-supervised pretraining objective, is widely used in natural language processing for learning text representations.
1 code implementation • EMNLP 2021 • Samuel Mensah, Kai Sun, Nikolaos Aletras
Target-oriented opinion words extraction (TOWE) (Fan et al., 2019b) is a new subtask of target-oriented sentiment analysis that aims to extract opinion words for a given aspect in text.
1 code implementation • EMNLP 2021 • Danae Sánchez Villegas, Nikolaos Aletras
Point-of-interest (POI) type prediction is the task of inferring the type of a place from where a social media post was shared.
1 code implementation • EMNLP 2021 • George Chrysostomou, Nikolaos Aletras
In this paper, we hypothesize that salient information extracted a priori from the training data can complement the task-specific information learned by the model during fine-tuning on a downstream task.
no code implementations • Findings (ACL) 2022 • Marina Fomicheva, Lucia Specia, Nikolaos Aletras
Recent Quality Estimation (QE) models based on multilingual pre-trained representations have achieved very competitive results when predicting the overall quality of translated sentences.
no code implementations • ACL 2021 • Peter Vickers, Nikolaos Aletras, Emilio Monti, Loïc Barrault
Visual Question Answering (VQA) methods aim at leveraging visual input to answer questions that may require complex reasoning over entities.
1 code implementation • Findings (ACL) 2021 • Amit Gajbhiye, Marina Fomicheva, Fernando Alva-Manchego, Frédéric Blain, Abiola Obamuyide, Nikolaos Aletras, Lucia Specia
Quality Estimation (QE) is the task of automatically predicting Machine Translation quality in the absence of reference translations, making it applicable in real-time settings, such as translating online social media conversations.
no code implementations • Findings (ACL) 2021 • Danae Sánchez Villegas, Saeid Mokaram, Nikolaos Aletras
Finally, we provide an in-depth analysis of the limitations of our best-performing models and linguistic analysis to study the characteristics of political ads discourse.
no code implementations • Findings (ACL) 2021 • Dimitrios Tsarapatsanis, Nikolaos Aletras
Natural language processing (NLP) methods for analyzing legal text offer legal scholars and practitioners a range of tools that allow them to empirically analyze law on a large scale.
1 code implementation • ACL 2021 • George Chrysostomou, Nikolaos Aletras
In this paper, we seek to improve the faithfulness of attention-based explanations for text classification.
1 code implementation • ACL 2022 • Katerina Margatina, Loïc Barrault, Nikolaos Aletras
Recent Active Learning (AL) approaches in Natural Language Processing (NLP) proposed using off-the-shelf pretrained language models (LMs).
1 code implementation • 16 Apr 2021 • George Chrysostomou, Nikolaos Aletras
Recent research on model interpretability in natural language processing extensively uses feature scoring methods for identifying which parts of the input are the most important for a model to make a prediction (i.e. explanation or rationale).
no code implementations • NAACL 2021 • Ilias Chalkidis, Manos Fergadiotis, Dimitrios Tsarapatsanis, Nikolaos Aletras, Ion Androutsopoulos, Prodromos Malakasiotis
We also release a new dataset comprising European Court of Human Rights cases, including annotations for paragraph-level rationales.
1 code implementation • NAACL 2021 • Mali Jin, Nikolaos Aletras
The speech act of complaining is used by humans to communicate a negative mismatch between reality and expectations as a reaction to an unfavorable situation.
no code implementations • COLING 2020 • Mali Jin, Nikolaos Aletras
Complaining is a speech act extensively used by humans to communicate a negative inconsistency between reality and expectations.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Ilias Chalkidis, Manos Fergadiotis, Prodromos Malakasiotis, Nikolaos Aletras, Ion Androutsopoulos
Thus we propose a systematic investigation of the available strategies when applying BERT in specialised domains.
1 code implementation • EMNLP 2020 • Ilias Chalkidis, Manos Fergadiotis, Sotiris Kotitsas, Prodromos Malakasiotis, Nikolaos Aletras, Ion Androutsopoulos
Furthermore, we show that Transformer-based approaches outperform the state-of-the-art in two of the datasets, and we propose a new state-of-the-art method which combines BERT with LWANs.
no code implementations • Asian Chapter of the Association for Computational Linguistics 2020 • Danae Sánchez Villegas, Daniel Preoţiuc-Pietro, Nikolaos Aletras
Physical places help shape how we perceive the experiences we have there.
1 code implementation • 29 May 2020 • Areej Alokaili, Nikolaos Aletras, Mark Stevenson
A topic is usually represented by a list of terms ranked by their probability but, since these can be difficult to interpret, various approaches have been developed to assign descriptive labels to topics.
3 code implementations • 21 May 2020 • Marina Fomicheva, Shuo Sun, Lisa Yankovskaya, Frédéric Blain, Francisco Guzmán, Mark Fishel, Nikolaos Aletras, Vishrav Chaudhary, Lucia Specia
Quality Estimation (QE) is an important component in making Machine Translation (MT) useful in real-world applications, as it is aimed to inform the user on the quality of the MT output at test time.
no code implementations • ACL 2020 • Antonis Maronikolakis, Danae Sanchez Villegas, Daniel Preotiuc-Pietro, Nikolaos Aletras
Parody is a figurative device used to imitate an entity for comedic or critical purposes and represents a widespread phenomenon in social media through many popular parody accounts.
no code implementations • IJCNLP 2019 • Twin Karmakharm, Nikolaos Aletras, Kalina Bontcheva
The system features a rumour annotation service that allows journalists to easily provide feedback for a given social media post through a web-based interface.
1 code implementation • ACL 2019 • Daniel Preotiuc-Pietro, Mihaela Gaman, Nikolaos Aletras
Complaining is a basic speech act regularly used in human and computer mediated communication to express a negative mismatch between reality and expectations in a particular situation.
no code implementations • ACL 2019 • Ilias Chalkidis, Ion Androutsopoulos, Nikolaos Aletras
Legal judgment prediction is the task of automatically predicting the outcome of a court case, given a text describing the case's facts.
Ranked #1 on Binary text classification on ECHR Non-Anonymized
no code implementations • WS 2019 • Ilias Chalkidis, Manos Fergadiotis, Prodromos Malakasiotis, Nikolaos Aletras, Ion Androutsopoulos
We consider the task of Extreme Multi-Label Text Classification (XMTC) in the legal domain.
1 code implementation • WS 2019 • Areej Alokaili, Nikolaos Aletras, Mark Stevenson
Making their output interpretable is an important area of research with applications to areas such as the enhancement of exploratory search interfaces and the development of interpretable machine learning models.
2 code implementations • 30 Nov 2018 • Li Zhang, Heda Song, Nikolaos Aletras, Haiping Lu
Graph convolutional network (GCN) is an emerging neural network approach.
no code implementations • 26 Aug 2018 • Adam Tsakalidis, Nikolaos Aletras, Alexandra I. Cristea, Maria Liakata
Modelling user voting intention in social media is an important research area, with applications in analysing electorate behaviour, online political campaigning and advertising.
1 code implementation • 11 Apr 2018 • Nikolaos Aletras, Benjamin Paul Chamberlain
Inferring socioeconomic attributes of social media users such as occupation and income is an important problem in computational social science.
no code implementations • EACL 2017 • Ionut Sorodoc, Jey Han Lau, Nikolaos Aletras, Timothy Baldwin
Automatic topic labelling is the task of generating a succinct label that summarises the theme or subject of a topic, with the intention of reducing the cognitive load of end-users when interpreting these topics.
no code implementations • 1 Aug 2016 • Nikolaos Aletras, Arpit Mittal
Topics generated by topic models are usually represented by lists of $t$ terms or alternatively using short phrases and images.