However, QE models are often trained on noisy approximations of quality annotations derived from the proportion of post-edited words in translated sentences instead of direct human annotations of translation errors.
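As a rough illustration of such a noisy label, sentence quality can be approximated as the fraction of machine-translated words altered by the post-editor. The sketch below uses a simple word-level alignment; it is an illustrative assumption, not the exact annotation pipeline (which typically relies on HTER over word-level edit operations).

```python
# A minimal sketch (not the exact annotation pipeline, which typically
# uses HTER): approximate sentence-level quality as the proportion of
# MT words altered during post-editing, via a word-level alignment.
from difflib import SequenceMatcher

def proportion_post_edited(mt_output: str, post_edit: str) -> float:
    mt_words, pe_words = mt_output.split(), post_edit.split()
    matcher = SequenceMatcher(a=mt_words, b=pe_words)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return 1.0 - matched / max(len(mt_words), 1)  # 0 = untouched, 1 = rewritten

print(proportion_post_edited("the cat sit on mat", "the cat sat on the mat"))
```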
Pruning and quantization form the foundation of model compression for neural networks, enabling efficient inference for large language models (LLMs).
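For intuition, here is a hedged NumPy sketch of both techniques in their simplest unstructured forms; production LLM compression uses more sophisticated variants (structured sparsity, calibration-aware quantization schemes).

```python
# A minimal NumPy sketch of magnitude pruning and symmetric int8
# quantization in their simplest forms; illustrative only.
import numpy as np

def magnitude_prune(w: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the fraction `sparsity` of smallest-magnitude weights."""
    threshold = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) < threshold, 0.0, w)

def quantize_int8(w: np.ndarray):
    """Symmetric linear quantization of float weights to int8."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale  # dequantize via q.astype(np.float32) * scale

w = np.random.default_rng(0).normal(size=(4, 4)).astype(np.float32)
q, scale = quantize_int8(magnitude_prune(w, sparsity=0.5))
```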
Surprisingly, we find that pruned LLMs hallucinate less than their full-sized counterparts.
Understanding how and what pre-trained language models (PLMs) learn about language is an open challenge in natural language processing.
Scaling pre-trained language models has resulted in large performance gains in various natural language processing tasks but comes at a large cost in memory requirements.
Regulation studies are a rich source of knowledge on how to systematically deal with risk and uncertainty, as well as with scientific evidence, to evaluate and compare regulatory options.
A crucial aspect of a rumor detection model is its ability to generalize, particularly its ability to detect emerging, previously unknown rumors.
The extensive memory footprint of pre-trained language models (PLMs) can hinder deployment in memory-constrained settings, such as cloud environments or on-device applications.
Effectively leveraging multimodal information from social media posts is essential to various downstream tasks such as sentiment analysis, sarcasm detection and hate speech classification.
Influencer marketing involves a wide range of strategies in which brands collaborate with popular content creators (i.e., influencers) to leverage their reach, trust, and impact on their audience to promote and endorse products or services.
Further, it employs a fulfillment representation layer for learning how many task attributes have been fulfilled in the dialogue, and an importance predictor component for calculating the importance of task attributes.
The remarkable advancements in large language models (LLMs) have significantly enhanced performance in few-shot learning settings.
Instruction-tuned Large Language Models (LLMs) have exhibited impressive language understanding and the capacity to generate responses that follow specific prompts.
Semi-supervised learning (SSL) is a popular setting aiming to effectively utilize unlabelled data to improve model performance in downstream natural language processing (NLP) tasks.
Active learning (AL) is a human-and-model-in-the-loop paradigm that iteratively selects informative unlabeled data for human annotation, aiming to improve over random sampling.
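As a concrete illustration of that loop, the runnable sketch below uses scikit-learn on synthetic data, with predictive entropy as the informativeness score and simulated (rather than human) annotation; all names and data here are illustrative, not from any specific paper.

```python
# A runnable sketch of the AL loop on synthetic data; in practice the
# selected batch goes to a human annotator rather than an oracle array.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # oracle labels (simulated human)
labelled, pool = list(range(10)), list(range(10, 500))

for _ in range(5):                         # AL iterations
    model = LogisticRegression().fit(X[labelled], y[labelled])
    probs = model.predict_proba(X[pool])
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    picked = set(np.argsort(entropy)[-20:])   # most uncertain batch
    labelled += [pool[i] for i in picked]     # "annotate" the batch
    pool = [p for i, p in enumerate(pool) if i not in picked]
```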
State-of-the-art target-oriented opinion word extraction (TOWE) models typically use BERT-based text encoders that operate on the word level, along with graph convolutional networks (GCNs) that incorporate syntactic information extracted from syntax trees.
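For readers unfamiliar with GCNs, a single layer's propagation over a (toy) dependency adjacency matrix can be sketched in a few lines of NumPy; this is a generic illustration of the layer, not the specific TOWE architecture.

```python
# A generic single GCN layer, H' = ReLU(D^-1/2 (A+I) D^-1/2 · H · W),
# over a toy syntactic adjacency matrix; not the specific TOWE model.
import numpy as np

def gcn_layer(H, A, W):
    A_hat = A + np.eye(A.shape[0])                  # add self-loops
    d_inv_sqrt = np.diag(A_hat.sum(axis=1) ** -0.5)
    A_norm = d_inv_sqrt @ A_hat @ d_inv_sqrt        # symmetric normalisation
    return np.maximum(A_norm @ H @ W, 0.0)          # ReLU activation

rng = np.random.default_rng(0)
H = rng.normal(size=(5, 8))                         # 5 token embeddings
A = np.zeros((5, 5)); A[0, 1] = A[1, 0] = A[1, 2] = A[2, 1] = 1  # toy edges
W = rng.normal(size=(8, 4))
print(gcn_layer(H, A, W).shape)                     # (5, 4)
```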
Widely used faithfulness metrics, such as sufficiency and comprehensiveness, use a hard erasure criterion, i.e., entirely removing or retaining the top-ranked tokens from a given feature attribution (FA) and observing the changes in predictive likelihood.
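Concretely, both metrics compare the model's predicted-class probability before and after erasure. The sketch below follows the standard definitions, with `predict_prob` a hypothetical stand-in returning the model's probability for the class it originally predicted.

```python
# A minimal sketch of the hard-erasure metrics; `predict_prob` is a
# hypothetical stand-in for the model's predicted-class probability,
# and `ranking` orders token indices by FA importance score.
def comprehensiveness(tokens, ranking, k, predict_prob):
    top_k = set(ranking[:k])
    without = [t for i, t in enumerate(tokens) if i not in top_k]  # erase top-k
    return predict_prob(tokens) - predict_prob(without)  # large drop = faithful

def sufficiency(tokens, ranking, k, predict_prob):
    only_top = [tokens[i] for i in sorted(ranking[:k])]  # retain only top-k
    return predict_prob(tokens) - predict_prob(only_top)  # small gap = faithful
```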
Inspired by the theoretical foundations of domain adaptation, we propose a new SSL approach that selects unlabelled target samples on which the outputs of a domain-specific teacher network and a student network disagree, in an effort to boost target-domain performance.
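In its simplest form, the selection criterion keeps the unlabelled target samples on which the two networks' predictions differ; the snippet below is an illustrative reduction, with `teacher` and `student` as hypothetical predictors.

```python
# An illustrative reduction of the disagreement-based selection rule;
# `teacher` and `student` are hypothetical callables mapping a sample
# to a predicted label.
def select_disagreements(unlabelled, teacher, student):
    return [x for x in unlabelled if teacher(x) != student(x)]
```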
Explanation faithfulness of model predictions in natural language processing is typically evaluated on held-out data from the same temporal distribution as the training data (i.e., synchronous settings).
Graph-based text representation focuses on how text documents are represented as graphs to exploit dependency information between tokens and documents within a corpus.
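One common instantiation is a sliding-window co-occurrence graph over tokens; the sketch below (using networkx for clarity) is an illustrative example rather than any specific method from the literature.

```python
# A sliding-window word co-occurrence graph, one common instantiation
# of graph-based text representation; networkx is used for clarity.
import networkx as nx

def graph_of_words(tokens: list, window: int = 3) -> nx.Graph:
    g = nx.Graph()
    g.add_nodes_from(tokens)
    for i, t in enumerate(tokens):
        for u in tokens[i + 1 : i + window]:   # neighbours within the window
            if t != u:
                w = g.get_edge_data(t, u, {"weight": 0})["weight"]
                g.add_edge(t, u, weight=w + 1)  # accumulate co-occurrences
    return g

g = graph_of_words("the cat sat on the mat".split())
print(sorted(g.edges(data="weight")))
```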
State-of-the-art approaches for hate-speech detection usually exhibit poor performance in out-of-domain settings.
The spread of misinformation in social media has given rise to a new form of active citizenship, with users focusing on tackling the problem by refuting posts that might contain misinformation.
Thus, in this paper, we propose HNZSLP, a Hierarchical N-Gram framework for Zero-Shot Link Prediction (ZSLP) that considers the dependencies among character n-grams of the relation surface name.
In this paper, we propose to automatically identify and reduce spurious correlations using attribution methods with dynamic refinement of the list of terms that need to be regularized during training.
Several pre-training objectives, such as masked language modeling (MLM), have been proposed to pre-train language models (e.g., BERT) with the aim of learning better language representations.
Bragging is a speech act employed with the goal of constructing a favorable self-image through positive statements about oneself.
Recent work in Natural Language Processing has focused on developing approaches that extract faithful explanations, either by identifying the most important tokens in the input (i.e., post-hoc explanations) or by designing inherently faithful models that first select the most important tokens and then use them to predict the correct label (i.e., select-then-predict models).
Laws and their interpretations, legal arguments, and agreements are typically expressed in writing, leading to the production of vast corpora of legal text.
Common acquisition functions for active learning use either uncertainty or diversity sampling, aiming to select difficult or diverse data points, respectively, from the pool of unlabeled data.
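To complement the uncertainty-based sketch given earlier, diversity sampling can be illustrated by clustering the unlabelled pool and taking the point nearest each centroid; the names and data below are illustrative, and many variants exist.

```python
# Diversity sampling sketched via k-means: pick the pool point closest
# to each cluster centre. Illustrative only, not any specific method.
import numpy as np
from sklearn.cluster import KMeans

def diversity_sample(pool_vecs: np.ndarray, k: int) -> list:
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pool_vecs)
    picks = []
    for centre in km.cluster_centers_:
        dists = np.linalg.norm(pool_vecs - centre, axis=1)
        picks.append(int(dists.argmin()))   # representative of the cluster
    return picks

pool_vecs = np.random.default_rng(1).normal(size=(200, 8))
print(diversity_sample(pool_vecs, k=5))
```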
Masked language modeling (MLM), a self-supervised pretraining objective, is widely used in natural language processing for learning text representations.
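The corruption step at the heart of MLM can be sketched in a few lines; the 80/10/10 split follows BERT's recipe, and the token IDs below are purely illustrative.

```python
# BERT-style MLM corruption: select ~15% of tokens, replacing 80% of them
# with [MASK], 10% with a random token, and leaving 10% unchanged.
import random

def mask_tokens(ids, mask_id, vocab_size, p=0.15, seed=0):
    rng = random.Random(seed)
    inputs, labels = list(ids), [-100] * len(ids)  # -100 = ignored in the loss
    for i, tid in enumerate(ids):
        if rng.random() < p:
            labels[i] = tid                        # target: original token
            r = rng.random()
            if r < 0.8:
                inputs[i] = mask_id                # 80%: [MASK]
            elif r < 0.9:
                inputs[i] = rng.randrange(vocab_size)  # 10%: random token
            # else: 10% keep the original token
    return inputs, labels

print(mask_tokens([5, 17, 42, 99, 7], mask_id=103, vocab_size=30522))
```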
Target-oriented opinion words extraction (TOWE) (Fan et al., 2019b) is a new subtask of target-oriented sentiment analysis that aims to extract opinion words for a given aspect in text.
Point-of-interest (POI) type prediction is the task of inferring the type of a place from where a social media post was shared.
In this paper, we hypothesize that salient information extracted a priori from the training data can complement the task-specific information learned by the model during fine-tuning on a downstream task.
Recent Quality Estimation (QE) models based on multilingual pre-trained representations have achieved very competitive results when predicting the overall quality of translated sentences.
Visual Question Answering (VQA) methods aim at leveraging visual input to answer questions that may require complex reasoning over entities.
Quality Estimation (QE) is the task of automatically predicting Machine Translation quality in the absence of reference translations, making it applicable in real-time settings, such as translating online social media conversations.
Finally, we provide an in-depth analysis of the limitations of our best-performing models, along with a linguistic analysis of the characteristics of political ad discourse.
Natural language processing (NLP) methods for analyzing legal text offer legal scholars and practitioners a range of tools that allow them to empirically analyze law on a large scale.
In this paper, we seek to improve the faithfulness of attention-based explanations for text classification.
Recent Active Learning (AL) approaches in Natural Language Processing (NLP) have proposed using off-the-shelf pretrained language models (LMs).
Recent research on model interpretability in natural language processing extensively uses feature scoring methods for identifying which parts of the input are the most important for a model to make a prediction (i.e., explanation or rationale).
We also release a new dataset comprising European Court of Human Rights cases, including annotations for paragraph-level rationales.
Thus, we propose a systematic investigation of the available strategies for applying BERT in specialised domains.
Furthermore, we show that Transformer-based approaches outperform the state-of-the-art on two of the datasets, and we propose a new state-of-the-art method that combines BERT with LWANs.
A topic is usually represented by a list of terms ranked by their probability, but since such lists can be difficult to interpret, various approaches have been developed to assign descriptive labels to topics.
Quality Estimation (QE) is an important component in making Machine Translation (MT) useful in real-world applications, as it aims to inform the user about the quality of the MT output at test time.
Parody is a figurative device used to imitate an entity for comedic or critical purposes and represents a widespread phenomenon in social media through many popular parody accounts.
The system features a rumour annotation service that allows journalists to easily provide feedback for a given social media post through a web-based interface.
Complaining is a basic speech act regularly used in human and computer-mediated communication to express a negative mismatch between reality and expectations in a particular situation.
Legal judgment prediction is the task of automatically predicting the outcome of a court case, given a text describing the case's facts.
We consider the task of Extreme Multi-Label Text Classification (XMTC) in the legal domain.
Making the output of topic models interpretable is an important area of research, with applications such as the enhancement of exploratory search interfaces and the development of interpretable machine learning models.
Modelling user voting intention in social media is an important research area, with applications in analysing electorate behaviour, online political campaigning and advertising.
Inferring socioeconomic attributes of social media users such as occupation and income is an important problem in computational social science.
Automatic topic labelling is the task of generating a succinct label that summarises the theme or subject of a topic, with the intention of reducing the cognitive load of end-users when interpreting these topics.