Search Results for author: Nikolaos Aletras

Found 77 papers, 35 papers with code

Quality In, Quality Out: Learning from Actual Mistakes

no code implementations EAMT 2020 Frederic Blain, Nikolaos Aletras, Lucia Specia

However, QE models are often trained on noisy approximations of quality annotations derived from the proportion of post-edited words in translated sentences instead of direct human annotations of translation errors.

Machine Translation, Sentence, +2

Who is bragging more online? A large scale analysis of bragging in social media

no code implementations 25 Mar 2024 Mali Jin, Daniel Preoţiuc-Pietro, A. Seza Doğruöz, Nikolaos Aletras

Bragging is the act of uttering statements that are likely to be viewed positively by others; it is extensively employed in human communication with the aim of building a positive self-image.

Comparing Explanation Faithfulness between Multilingual and Monolingual Fine-tuned Language Models

1 code implementation 19 Mar 2024 Zhixue Zhao, Nikolaos Aletras

Previous studies have explored how different factors affect faithfulness, mainly in the context of monolingual English models.

An Empirical Study on Cross-lingual Vocabulary Adaptation for Efficient Generative LLM Inference

1 code implementation 16 Feb 2024 Atsuki Yamaguchi, Aline Villavicencio, Nikolaos Aletras

We also show that adapting LLMs that have been pre-trained on more balanced multilingual data results in downstream performance comparable to the original models.

Natural Language Understanding

We Need to Talk About Classification Evaluation Metrics in NLP

no code implementations 8 Jan 2024 Peter Vickers, Loïc Barrault, Emilio Monti, Nikolaos Aletras

In Natural Language Processing (NLP) classification tasks such as topic categorisation and sentiment analysis, model generalizability is generally measured with standard metrics such as Accuracy, F-Measure, or AUC-ROC.

Machine Translation, Natural Language Understanding, +2
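The metrics named in this abstract are the standard ones; a minimal scikit-learn sketch with made-up predictions for a hypothetical binary sentiment task shows how each is computed:

```python
# Hypothetical gold labels, hard predictions, and positive-class scores
# for a toy binary sentiment task (all values invented for illustration).
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]                   # gold labels
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]                   # hard predictions
y_prob = [0.9, 0.2, 0.8, 0.4, 0.3, 0.6, 0.7, 0.1]   # positive-class scores

accuracy = accuracy_score(y_true, y_pred)    # fraction of exact matches
f_measure = f1_score(y_true, y_pred)         # harmonic mean of precision/recall
auc_roc = roc_auc_score(y_true, y_prob)      # ranking quality of the scores

print(accuracy, f_measure, auc_roc)
```

Note that Accuracy and F-Measure score hard predictions, while AUC-ROC scores the ranking induced by the positive-class probabilities, which is why it is given `y_prob` rather than `y_pred`.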

How Does Calibration Data Affect the Post-training Pruning and Quantization of Large Language Models?

no code implementations 16 Nov 2023 Miles Williams, Nikolaos Aletras

Pruning and quantization form the foundation of model compression for neural networks, enabling efficient inference for large language models (LLMs).

Model Compression, Quantization
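For readers unfamiliar with the two operations, here is a generic NumPy sketch of magnitude pruning and symmetric int8 quantization on a toy weight matrix. This illustrates the general techniques only, not the paper's method; in practice, calibration data would be used, e.g., to estimate activation ranges and quantization scales.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8)).astype(np.float32)   # toy weight matrix

# Magnitude pruning: zero out the 50% of weights with smallest |w|.
threshold = np.quantile(np.abs(W), 0.5)
W_pruned = np.where(np.abs(W) >= threshold, W, 0.0)

# Symmetric int8 quantization: scale so the largest |w| maps to 127.
scale = np.abs(W).max() / 127.0
W_int8 = np.clip(np.round(W / scale), -127, 127).astype(np.int8)
W_dequant = W_int8.astype(np.float32) * scale    # reconstruction

sparsity = float((W_pruned == 0).mean())         # fraction of zeroed weights
max_err = float(np.abs(W - W_dequant).max())     # worst quantization error
print(sparsity, max_err)
```

With symmetric quantization the worst-case reconstruction error is bounded by half the scale, which is the kind of error that careful calibration aims to keep small on the weights and activations that matter.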

Investigating Hallucinations in Pruned Large Language Models for Abstractive Summarization

1 code implementation 15 Nov 2023 George Chrysostomou, Zhixue Zhao, Miles Williams, Nikolaos Aletras

Despite the remarkable performance of generative large language models (LLMs) on abstractive summarization, they face two significant challenges: their considerable size and tendency to hallucinate.

Abstractive Text Summarization, Hallucination, +1

Understanding the Role of Input Token Characters in Language Models: How Does Information Loss Affect Performance?

no code implementations 26 Oct 2023 Ahmed Alajrami, Katerina Margatina, Nikolaos Aletras

Understanding how and what pre-trained language models (PLMs) learn about language is an open challenge in natural language processing.

Pit One Against Many: Leveraging Attention-head Embeddings for Parameter-efficient Multi-head Attention

no code implementations 11 Oct 2023 Huiyin Xue, Nikolaos Aletras

Scaling pre-trained language models has resulted in large performance gains in various natural language processing tasks but comes with a large cost in memory requirements.

Regulation and NLP (RegNLP): Taming Large Language Models

no code implementations 9 Oct 2023 Catalina Goanta, Nikolaos Aletras, Ilias Chalkidis, Sofia Ranchordas, Gerasimos Spanakis

Regulation studies are a rich source of knowledge on how to systematically deal with risk and uncertainty, as well as with scientific evidence, to evaluate and compare regulatory options.

Ethics

Examining the Limitations of Computational Rumor Detection Models Trained on Static Datasets

no code implementations 20 Sep 2023 Yida Mu, Xingyi Song, Kalina Bontcheva, Nikolaos Aletras

A crucial aspect of a rumor detection model is its ability to generalize, particularly its ability to detect emerging, previously unknown rumors.

Frustratingly Simple Memory Efficiency for Pre-trained Language Models via Dynamic Embedding Pruning

1 code implementation 15 Sep 2023 Miles Williams, Nikolaos Aletras

The extensive memory footprint of pre-trained language models (PLMs) can hinder deployment in memory-constrained settings, such as cloud environments or on-device.

A Multimodal Analysis of Influencer Content on Twitter

1 code implementation 6 Sep 2023 Danae Sánchez Villegas, Catalina Goanta, Nikolaos Aletras

Influencer marketing involves a wide range of strategies in which brands collaborate with popular content creators (i.e., influencers) to leverage their reach, trust, and impact on their audience to promote and endorse products or services.

Marketing

Schema-Guided User Satisfaction Modeling for Task-Oriented Dialogues

1 code implementation 26 May 2023 Yue Feng, Yunlong Jiao, Animesh Prasad, Nikolaos Aletras, Emine Yilmaz, Gabriella Kazai

Further, it employs a fulfillment representation layer for learning how many task attributes have been fulfilled in the dialogue, and an importance predictor component for calculating the importance of task attributes.

Attribute, Language Modelling, +1

Navigating Prompt Complexity for Zero-Shot Classification: A Study of Large Language Models in Computational Social Science

no code implementations 23 May 2023 Yida Mu, Ben P. Wu, William Thorne, Ambrose Robinson, Nikolaos Aletras, Carolina Scarton, Kalina Bontcheva, Xingyi Song

Instruction-tuned Large Language Models (LLMs) have exhibited impressive language understanding and the capacity to generate responses that follow specific prompts.

Zero-Shot Learning

Active Learning Principles for In-Context Learning with Large Language Models

no code implementations 23 May 2023 Katerina Margatina, Timo Schick, Nikolaos Aletras, Jane Dwivedi-Yu

The remarkable advancements in large language models (LLMs) have significantly enhanced the performance in few-shot learning settings.

Active Learning, Few-Shot Learning, +1

Rethinking Semi-supervised Learning with Language Models

2 code implementations 22 May 2023 Zhengxiang Shi, Francesco Tonolini, Nikolaos Aletras, Emine Yilmaz, Gabriella Kazai, Yunlong Jiao

Semi-supervised learning (SSL) is a popular setting aiming to effectively utilize unlabelled data to improve model performance in downstream natural language processing (NLP) tasks.

Pseudo Label, Semi-Supervised Text Classification, +1

On the Limitations of Simulating Active Learning

no code implementations 21 May 2023 Katerina Margatina, Nikolaos Aletras

Active learning (AL) is a human-and-model-in-the-loop paradigm that iteratively selects informative unlabeled data for human annotation, aiming to improve over random sampling.

Active Learning, Fairness
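The paradigm described in this abstract is usually simulated as a pool-based loop; here is a minimal sketch with uncertainty sampling, where the synthetic data, query size, and margin-based acquisition are all illustrative assumptions rather than the paper's setup:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(1.0, 1.0, size=(200, 4)),
               rng.normal(-1.0, 1.0, size=(200, 4))])
y = np.array([1] * 200 + [0] * 200)       # oracle ("human annotator") labels

labelled = np.zeros(400, dtype=bool)
labelled[:5] = labelled[200:205] = True   # small seed set, both classes

for _ in range(5):                        # acquisition rounds
    clf = LogisticRegression().fit(X[labelled], y[labelled])
    pool = np.where(~labelled)[0]
    probs = clf.predict_proba(X[pool])[:, 1]
    uncertainty = -np.abs(probs - 0.5)    # closest to 0.5 = most uncertain
    pick = pool[np.argsort(uncertainty)[-10:]]
    labelled[pick] = True                 # "query the annotator" for these

accuracy = clf.score(X, y)
print(int(labelled.sum()), accuracy)
```

In a real simulation the oracle labels already exist in the dataset, which is exactly the simplification whose limitations the paper examines.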

Trading Syntax Trees for Wordpieces: Target-oriented Opinion Words Extraction with Wordpieces and Aspect Enhancement

no code implementations 18 May 2023 Samuel Mensah, Kai Sun, Nikolaos Aletras

State-of-the-art target-oriented opinion word extraction (TOWE) models typically use BERT-based text encoders that operate on the word level, along with graph convolutional networks (GCNs) that incorporate syntactic information extracted from syntax trees.

Sentence, target-oriented opinion words extraction

Incorporating Attribution Importance for Improving Faithfulness Metrics

1 code implementation 17 May 2023 Zhixue Zhao, Nikolaos Aletras

Widely used faithfulness metrics, such as sufficiency and comprehensiveness, use a hard erasure criterion, i.e., entirely removing or retaining the top-ranked tokens from a given feature attribution (FA) method and observing the changes in predictive likelihood.
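The hard erasure criterion can be illustrated with a toy bag-of-words "model" whose token weights double as the feature attribution scores; everything below is a hypothetical stand-in, not the paper's models or data:

```python
import math

def predict(tokens, weights):
    """Toy 'model': positive-class probability from summed token weights."""
    z = sum(weights.get(t, 0.0) for t in tokens)
    return 1.0 / (1.0 + math.exp(-z))

weights = {"great": 2.0, "fun": 1.5, "movie": 0.1, "long": -0.5}
tokens = ["the", "movie", "was", "great", "fun", "but", "long"]

# Feature attribution: here simply each token's weight (a stand-in FA).
attribution = {t: weights.get(t, 0.0) for t in tokens}
top_k = sorted(tokens, key=lambda t: abs(attribution[t]), reverse=True)[:2]

p_full = predict(tokens, weights)
p_top = predict([t for t in tokens if t in top_k], weights)       # retain top-k
p_rest = predict([t for t in tokens if t not in top_k], weights)  # erase top-k

sufficiency = p_full - p_top         # small = top tokens alone suffice
comprehensiveness = p_full - p_rest  # large = top tokens were needed
print(round(sufficiency, 3), round(comprehensiveness, 3))
```

A small sufficiency and a large comprehensiveness both indicate a faithful attribution; the "hard" aspect is that tokens are fully kept or fully removed regardless of how their importance scores are distributed.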

Self-training through Classifier Disagreement for Cross-Domain Opinion Target Extraction

no code implementations 28 Feb 2023 Kai Sun, Richong Zhang, Samuel Mensah, Nikolaos Aletras, Yongyi Mao, Xudong Liu

Inspired by the theoretical foundations in domain adaptation [2], we propose a new SSL approach that opts for selecting target samples whose model output from a domain-specific teacher and student network disagree on the unlabelled target data, in an effort to boost the target domain performance.

Aspect Extraction, Domain Adaptation, +1

On the Impact of Temporal Concept Drift on Model Explanations

1 code implementation 17 Oct 2022 Zhixue Zhao, George Chrysostomou, Kalina Bontcheva, Nikolaos Aletras

Explanation faithfulness of model predictions in natural language processing is typically evaluated on held-out data from the same temporal distribution as the training data (i.e., synchronous settings).

Text Classification

Improving Graph-Based Text Representations with Character and Word Level N-grams

no code implementations 12 Oct 2022 Wenzhe Li, Nikolaos Aletras

Graph-based text representation focuses on how text documents are represented as graphs for exploiting dependency information between tokens and documents within a corpus.

Graph Representation Learning, text-classification, +2

Identifying and Characterizing Active Citizens who Refute Misinformation in Social Media

1 code implementation 21 Apr 2022 Yida Mu, Pu Niu, Nikolaos Aletras

The phenomenon of misinformation spreading in social media has developed a new form of active citizens who focus on tackling the problem by refuting posts that might contain misinformation.

Misinformation

A Hierarchical N-Gram Framework for Zero-Shot Link Prediction

1 code implementation 16 Apr 2022 Mingchen Li, Junfan Chen, Samuel Mensah, Nikolaos Aletras, Xiulong Yang, Yang Ye

Thus, in this paper, we propose a Hierarchical N-Gram framework for Zero-Shot Link Prediction (HNZSLP), which considers the dependencies among character n-grams of the relation surface name for ZSLP.

Knowledge Graphs, Link Prediction, +1

Dynamically Refined Regularization for Improving Cross-corpora Hate Speech Detection

1 code implementation Findings (ACL) 2022 Tulika Bose, Nikolaos Aletras, Irina Illina, Dominique Fohr

In this paper, we propose to automatically identify and reduce spurious correlations using attribution methods with dynamic refinement of the list of terms that need to be regularized during training.

Hate Speech Detection

How does the pre-training objective affect what large language models learn about linguistic properties?

1 code implementation ACL 2022 Ahmed Alajrami, Nikolaos Aletras

Several pre-training objectives, such as masked language modeling (MLM), have been proposed to pre-train language models (e.g., BERT) with the aim of learning better language representations.

Language Modelling, Masked Language Modeling

An Empirical Study on Explanations in Out-of-Domain Settings

1 code implementation ACL 2022 George Chrysostomou, Nikolaos Aletras

Recent work in Natural Language Processing has focused on developing approaches that extract faithful explanations, either via identifying the most important tokens in the input (i.e., post-hoc explanations) or by designing inherently faithful models that first select the most important tokens and then use them to predict the correct label (i.e., select-then-predict models).

Active Learning by Acquiring Contrastive Examples

1 code implementation EMNLP 2021 Katerina Margatina, Giorgos Vernikos, Loïc Barrault, Nikolaos Aletras

Common acquisition functions for active learning use either uncertainty or diversity sampling, aiming to select difficult and diverse data points from the pool of unlabeled data, respectively.

Active Learning, Natural Language Understanding

Frustratingly Simple Pretraining Alternatives to Masked Language Modeling

1 code implementation EMNLP 2021 Atsuki Yamaguchi, George Chrysostomou, Katerina Margatina, Nikolaos Aletras

Masked language modeling (MLM), a self-supervised pretraining objective, is widely used in natural language processing for learning text representations.

Language Modelling, Masked Language Modeling, +1

An Empirical Study on Leveraging Position Embeddings for Target-oriented Opinion Words Extraction

1 code implementation EMNLP 2021 Samuel Mensah, Kai Sun, Nikolaos Aletras

Target-oriented opinion words extraction (TOWE) (Fan et al., 2019b) is a new subtask of target-oriented sentiment analysis that aims to extract opinion words for a given aspect in text.

Position, target-oriented opinion words extraction, +1

Point-of-Interest Type Prediction using Text and Images

1 code implementation EMNLP 2021 Danae Sánchez Villegas, Nikolaos Aletras

Point-of-interest (POI) type prediction is the task of inferring the type of a place from where a social media post was shared.

Type Prediction, Vocal Bursts Type Prediction

Enjoy the Salience: Towards Better Transformer-based Faithful Explanations with Word Salience

1 code implementation EMNLP 2021 George Chrysostomou, Nikolaos Aletras

In this paper, we hypothesize that salient information extracted a priori from the training data can complement the task-specific information learned by the model during fine-tuning on a downstream task.

Translation Error Detection as Rationale Extraction

no code implementations Findings (ACL) 2022 Marina Fomicheva, Lucia Specia, Nikolaos Aletras

Recent Quality Estimation (QE) models based on multilingual pre-trained representations have achieved very competitive results when predicting the overall quality of translated sentences.

Sentence, Translation

Knowledge Distillation for Quality Estimation

1 code implementation Findings (ACL) 2021 Amit Gajbhiye, Marina Fomicheva, Fernando Alva-Manchego, Frédéric Blain, Abiola Obamuyide, Nikolaos Aletras, Lucia Specia

Quality Estimation (QE) is the task of automatically predicting Machine Translation quality in the absence of reference translations, making it applicable in real-time settings, such as translating online social media conversations.

Data Augmentation, Knowledge Distillation, +2

Analyzing Online Political Advertisements

no code implementations Findings (ACL) 2021 Danae Sánchez Villegas, Saeid Mokaram, Nikolaos Aletras

Finally, we provide an in-depth analysis of the limitations of our best-performing models and linguistic analysis to study the characteristics of political ads discourse.

On the Ethical Limits of Natural Language Processing on Legal Text

no code implementations Findings (ACL) 2021 Dimitrios Tsarapatsanis, Nikolaos Aletras

Natural language processing (NLP) methods for analyzing legal text offer legal scholars and practitioners a range of tools allowing to empirically analyze law on a large scale.

Flexible Instance-Specific Rationalization of NLP Models

1 code implementation 16 Apr 2021 George Chrysostomou, Nikolaos Aletras

Recent research on model interpretability in natural language processing extensively uses feature scoring methods for identifying which parts of the input are the most important for a model to make a prediction (i.e., explanation or rationale).

General Classification, text-classification, +1

Modeling the Severity of Complaints in Social Media

1 code implementation NAACL 2021 Mali Jin, Nikolaos Aletras

The speech act of complaining is used by humans to communicate a negative mismatch between reality and expectations as a reaction to an unfavorable situation.

Complaint Identification in Social Media with Transformer Networks

no code implementations COLING 2020 Mali Jin, Nikolaos Aletras

Complaining is a speech act extensively used by humans to communicate a negative inconsistency between reality and expectations.

Automatic Generation of Topic Labels

1 code implementation 29 May 2020 Areej Alokaili, Nikolaos Aletras, Mark Stevenson

A topic is usually represented by a list of terms ranked by their probability but, since these can be difficult to interpret, various approaches have been developed to assign descriptive labels to topics.

Descriptive, Information Retrieval, +3
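The term-list representation described in this abstract can be sketched directly; the topic-word distribution below is a made-up example:

```python
# Hypothetical topic-word probabilities from a topic model
# (values invented for illustration).
topic = {"match": 0.08, "team": 0.07, "season": 0.05, "goal": 0.04,
         "voters": 0.01, "data": 0.005}

# The usual topic representation: terms ranked by probability.
top_terms = sorted(topic, key=topic.get, reverse=True)[:4]
print(top_terms)
```

Labelling approaches then map such a ranked list to a short descriptive label (here, something like "football"), which is the generation task the paper addresses.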

Unsupervised Quality Estimation for Neural Machine Translation

3 code implementations 21 May 2020 Marina Fomicheva, Shuo Sun, Lisa Yankovskaya, Frédéric Blain, Francisco Guzmán, Mark Fishel, Nikolaos Aletras, Vishrav Chaudhary, Lucia Specia

Quality Estimation (QE) is an important component in making Machine Translation (MT) useful in real-world applications, as it is aimed to inform the user on the quality of the MT output at test time.

Machine Translation, Translation, +1

Analyzing Political Parody in Social Media

no code implementations ACL 2020 Antonis Maronikolakis, Danae Sanchez Villegas, Daniel Preotiuc-Pietro, Nikolaos Aletras

Parody is a figurative device used to imitate an entity for comedic or critical purposes and represents a widespread phenomenon in social media through many popular parody accounts.

Fact Checking, Sentiment Analysis

Journalist-in-the-Loop: Continuous Learning as a Service for Rumour Analysis

no code implementations IJCNLP 2019 Twin Karmakharm, Nikolaos Aletras, Kalina Bontcheva

The system features a rumour annotation service that allows journalists to easily provide feedback for a given social media post through a web-based interface.

Rumour Detection

Automatically Identifying Complaints in Social Media

1 code implementation ACL 2019 Daniel Preotiuc-Pietro, Mihaela Gaman, Nikolaos Aletras

Complaining is a basic speech act regularly used in human and computer mediated communication to express a negative mismatch between reality and expectations in a particular situation.

Re-Ranking Words to Improve Interpretability of Automatically Generated Topics

1 code implementation WS 2019 Areej Alokaili, Nikolaos Aletras, Mark Stevenson

Making their output interpretable is an important area of research with applications to areas such as the enhancement of exploratory search interfaces and the development of interpretable machine learning models.

Interpretable Machine Learning, Re-Ranking, +1

Nowcasting the Stance of Social Media Users in a Sudden Vote: The Case of the Greek Referendum

no code implementations 26 Aug 2018 Adam Tsakalidis, Nikolaos Aletras, Alexandra I. Cristea, Maria Liakata

Modelling user voting intention in social media is an important research area, with applications in analysing electorate behaviour, online political campaigning and advertising.

Predicting Twitter User Socioeconomic Attributes with Network and Language Information

1 code implementation 11 Apr 2018 Nikolaos Aletras, Benjamin Paul Chamberlain

Inferring socioeconomic attributes of social media users such as occupation and income is an important problem in computational social science.

Recommendation Systems

Multimodal Topic Labelling

no code implementations EACL 2017 Ionut Sorodoc, Jey Han Lau, Nikolaos Aletras, Timothy Baldwin

Automatic topic labelling is the task of generating a succinct label that summarises the theme or subject of a topic, with the intention of reducing the cognitive load of end-users when interpreting these topics.

Topic Models

Labeling Topics with Images using Neural Networks

no code implementations 1 Aug 2016 Nikolaos Aletras, Arpit Mittal

Topics generated by topic models are usually represented by lists of $t$ terms or alternatively using short phrases and images.

Re-Ranking, Topic Models
