Search Results for author: Isabelle Augenstein

Found 115 papers, 49 papers with code

Seed Selection for Distantly Supervised Web-Based Relation Extraction

no code implementations • WS 2014 • Isabelle Augenstein

Relation Relation Extraction

Extracting Relations between Non-Standard Entities using Distant Supervision and Imitation Learning

no code implementations • EMNLP 2015 • Isabelle Augenstein, Andreas Vlachos, Diana Maynard

Imitation Learning Information Retrieval +2

USFD: Twitter NER with Drift Compensation and Linked Data

no code implementations • WS 2015 • Leon Derczynski, Isabelle Augenstein, Kalina Bontcheva

This paper describes a pilot NER system for Twitter, comprising the USFD system entry to the W-NUT 2015 NER shared task.

Clustering NER

Monolingual Social Media Datasets for Detecting Contradiction and Entailment

no code implementations • LREC 2016 • Piroska Lendvai, Isabelle Augenstein, Kalina Bontcheva, Thierry Declerck

Entailment recognition approaches are useful for application domains such as information extraction, question answering or summarisation, for which evidence from multiple sentences needs to be combined.

Natural Language Inference Question Answering +1

USFD at SemEval-2016 Task 6: Any-Target Stance Detection on Twitter with Autoencoders

no code implementations • SEMEVAL 2016 • Isabelle Augenstein, Andreas Vlachos, Kalina Bontcheva

Natural Language Inference Sentiment Analysis +1

Stance Detection with Bidirectional Conditional Encoding

1 code implementation • EMNLP 2016 • Isabelle Augenstein, Tim Rocktäschel, Andreas Vlachos, Kalina Bontcheva

Stance detection is the task of classifying the attitude expressed in a text towards a target such as Hillary Clinton to be "positive", "negative" or "neutral".

Stance Detection

Numerically Grounded Language Models for Semantic Error Correction

no code implementations • EMNLP 2016 • Georgios P. Spithourakis, Isabelle Augenstein, Sebastian Riedel

Semantic error detection and correction is an important task for applications such as fact checking, speech-to-text or grammatical error correction.

Fact Checking Grammatical Error Correction +1

emoji2vec: Learning Emoji Representations from their Description

7 code implementations • WS 2016 • Ben Eisner, Tim Rocktäschel, Isabelle Augenstein, Matko Bošnjak, Sebastian Riedel

Many current natural language processing applications for social media rely on representation learning and utilize pre-trained word embeddings.

Representation Learning Sentiment Analysis +1

Generalisation in Named Entity Recognition: A Quantitative Analysis

no code implementations • 11 Jan 2017 • Isabelle Augenstein, Leon Derczynski, Kalina Bontcheva

Unseen NEs in particular, which have a higher incidence in diverse genres such as social media than in more regular genres such as newswire, play an important role.

named-entity-recognition Named Entity Recognition +1

Multi-Task Learning of Keyphrase Boundary Classification

no code implementations • ACL 2017 • Isabelle Augenstein, Anders Søgaard

Keyphrase boundary classification (KBC) is the task of detecting keyphrases in scientific articles and labelling them with respect to predefined types.

Classification General Classification +1

SemEval 2017 Task 10: ScienceIE - Extracting Keyphrases and Relations from Scientific Publications

1 code implementation • SEMEVAL 2017 • Isabelle Augenstein, Mrinal Das, Sebastian Riedel, Lakshmi Vikraman, Andrew McCallum

We describe the SemEval task of extracting keyphrases and relations between them from scientific documents, which is crucial for understanding which publications describe which processes, tasks and materials.

Knowledge Base Population

Turing at SemEval-2017 Task 8: Sequential Approach to Rumour Stance Classification with Branch-LSTM

1 code implementation • SEMEVAL 2017 • Elena Kochkina, Maria Liakata, Isabelle Augenstein

This paper describes team Turing's submission to SemEval 2017 RumourEval: Determining rumour veracity and support for rumours (SemEval 2017 Task 8, Subtask A).

Ranked #1 on Stance Detection on RumourEval

General Classification Rumour Detection +2

Latent Multi-task Architecture Learning

2 code implementations • 23 May 2017 • Sebastian Ruder, Joachim Bingel, Isabelle Augenstein, Anders Søgaard

In practice, however, MTL involves searching an enormous space of possible parameter sharing architectures to find (a) the layers or subspaces that benefit from sharing, (b) the appropriate amount of sharing, and (c) the appropriate relative weights of the different task losses.

Multi-Task Learning

A Supervised Approach to Extractive Summarisation of Scientific Papers

2 code implementations • CONLL 2017 • Ed Collins, Isabelle Augenstein, Sebastian Riedel

Automatic summarisation is a popular approach to reduce a document to its main arguments.

Sentence

A simple but tough-to-beat baseline for the Fake News Challenge stance detection task

9 code implementations • 11 Jul 2017 • Benjamin Riedel, Isabelle Augenstein, Georgios P. Spithourakis, Sebastian Riedel

Identifying public misinformation is a complicated and challenging task.

Ranked #5 on Fake News Detection on FNC-1

Fact Checking Misinformation +1

Tracking Typological Traits of Uralic Languages in Distributed Language Representations

no code implementations • WS 2018 • Johannes Bjerva, Isabelle Augenstein

Although linguistic typology has a long history, computational approaches have only recently gained popularity.

Discourse-Aware Rumour Stance Classification in Social Media Using Sequential Classifiers

no code implementations • 6 Dec 2017 • Arkaitz Zubiaga, Elena Kochkina, Maria Liakata, Rob Procter, Michal Lukasik, Kalina Bontcheva, Trevor Cohn, Isabelle Augenstein

We show that sequential classifiers that exploit the use of discourse properties in social media conversations while using only local features, outperform non-sequential classifiers.

General Classification Stance Classification

From Phonology to Syntax: Unsupervised Linguistic Typology at Different Levels with Language Embeddings

no code implementations • NAACL 2018 • Johannes Bjerva, Isabelle Augenstein

A core part of linguistic typology is the classification of languages according to linguistic properties, such as those detailed in the World Atlas of Language Structure (WALS).

Morphological Inflection Part-Of-Speech Tagging

Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate Label Spaces

1 code implementation • NAACL 2018 • Isabelle Augenstein, Sebastian Ruder, Anders Søgaard

We combine multi-task learning and semi-supervised learning by inducing a joint embedding space between disparate label spaces and learning transfer functions between label embeddings, enabling us to jointly leverage unlabelled data and auxiliary, annotated datasets.

General Classification Multi-Task Learning +1

KU-MTL at SemEval-2018 Task 1: Multi-task Identification of Affect in Tweets

no code implementations • SEMEVAL 2018 • Thomas Nyegaard-Signori, Casper Veistrup Helms, Johannes Bjerva, Isabelle Augenstein

We take a multi-task learning approach to the shared Task 1 at SemEval-2018.

Emotion Classification General Classification +2

Jack the Reader - A Machine Reading Framework

2 code implementations • 20 Jun 2018 • Dirk Weissenborn, Pasquale Minervini, Tim Dettmers, Isabelle Augenstein, Johannes Welbl, Tim Rocktäschel, Matko Bošnjak, Jeff Mitchell, Thomas Demeester, Pontus Stenetorp, Sebastian Riedel

For example, in Question Answering, the supporting text can be newswire or Wikipedia articles; in Natural Language Inference, premises can be seen as the supporting text and hypotheses as questions.

Link Prediction Natural Language Inference +3

Character-level Supervision for Low-resource POS Tagging

no code implementations • WS 2018 • Katharina Kann, Johannes Bjerva, Isabelle Augenstein, Barbara Plank, Anders Søgaard

Neural part-of-speech (POS) taggers are known to not perform well with little training data.

Feature Engineering LEMMA +4

Jack the Reader - A Machine Reading Framework

1 code implementation • ACL 2018 • Dirk Weissenborn, Pasquale Minervini, Isabelle Augenstein, Johannes Welbl, Tim Rocktäschel, Matko Bošnjak, Jeff Mitchell, Thomas Demeester, Tim Dettmers, Pontus Stenetorp, Sebastian Riedel

For example, in Question Answering, the supporting text can be newswire or Wikipedia articles; in Natural Language Inference, premises can be seen as the supporting text and hypotheses as questions.

Information Retrieval Link Prediction +4

A strong baseline for question relevancy ranking

no code implementations • EMNLP 2018 • Ana V. González-Garduño, Isabelle Augenstein, Anders Søgaard

The best systems at the SemEval-16 and SemEval-17 community question answering shared tasks – a task that amounts to question relevancy ranking – involve complex pipelines and manual feature engineering.

Community Question Answering Feature Engineering

Parameter sharing between dependency parsers for related languages

1 code implementation • EMNLP 2018 • Miryam de Lhoneux, Johannes Bjerva, Isabelle Augenstein, Anders Søgaard

We find that sharing transition classifier parameters always helps, whereas the usefulness of sharing word and/or character LSTM parameters varies.

Nightmare at test time: How punctuation prevents parsers from generalizing

no code implementations • WS 2018 • Anders Søgaard, Miryam de Lhoneux, Isabelle Augenstein

Punctuation is a strong indicator of syntactic structure, and parsers trained on text with punctuation often rely heavily on this signal.

Copenhagen at CoNLL–SIGMORPHON 2018: Multilingual Inflection in Context with Explicit Morphosyntactic Decoding

no code implementations • CONLL 2018 • Yova Kementchedjhieva, Johannes Bjerva, Isabelle Augenstein

This paper documents the Team Copenhagen system, which placed first in the CoNLL–SIGMORPHON 2018 shared task on universal morphological reinflection, Task 2, with an overall accuracy of 49.87.

LEMMA Morphological Inflection +2

What do Language Representations Really Represent?

no code implementations • CL 2019 • Johannes Bjerva, Robert Östling, Maria Han Veiga, Jörg Tiedemann, Isabelle Augenstein

If the corpus is multilingual, the same model can be used to learn distributed representations of languages, such that similar languages end up with similar representations.

Language Modelling Translation

A Probabilistic Generative Model of Linguistic Typology

1 code implementation • NAACL 2019 • Johannes Bjerva, Yova Kementchedjhieva, Ryan Cotterell, Isabelle Augenstein

In the principles-and-parameters framework, the structural features of languages depend on parameters that may be toggled on or off, with a single parameter often dictating the status of multiple features.

Combining Sentiment Lexica with a Multi-View Variational Autoencoder

1 code implementation • NAACL 2019 • Alexander Hoyle, Lawrence Wolf-Sonkin, Hanna Wallach, Ryan Cotterell, Isabelle Augenstein

When assigning quantitative labels to a dataset, different methodologies may rely on different scales.

General Classification Sentiment Analysis +2

Issue Framing in Online Discussion Fora

no code implementations • NAACL 2019 • Mareike Hartmann, Tallulah Jansen, Isabelle Augenstein, Anders Søgaard

In online discussion fora, speakers often make arguments for or against something, say birth control, by highlighting certain aspects of the topic.

Unsupervised Discovery of Gendered Language through Latent-Variable Modeling

no code implementations • ACL 2019 • Alexander Hoyle, Lawrence Wolf-Sonkin, Hanna Wallach, Isabelle Augenstein, Ryan Cotterell

Studying the ways in which language is gendered has long been an area of interest in sociolinguistics.

Uncovering Probabilistic Implications in Typological Knowledge Bases

no code implementations • ACL 2019 • Johannes Bjerva, Yova Kementchedjhieva, Ryan Cotterell, Isabelle Augenstein

The study of linguistic typology is rooted in the implications we find between linguistic features, such as the fact that languages with object-verb word ordering tend to have post-positions.

Knowledge Base Population

X-WikiRE: A Large, Multilingual Resource for Relation Extraction as Machine Comprehension

1 code implementation • WS 2019 • Mostafa Abdou, Cezar Sas, Rahul Aralikatte, Isabelle Augenstein, Anders Søgaard

Although the vast majority of knowledge bases (KBs) are heavily biased towards English, Wikipedias do cover very different topics in different languages.

Reading Comprehension Relation +1

Transductive Auxiliary Task Self-Training for Neural Multi-Task Models

no code implementations • WS 2019 • Johannes Bjerva, Katharina Kann, Isabelle Augenstein

Multi-task learning and self-training are two common ways to improve a machine learning model's performance in settings with limited training data.

Multi-Task Learning

MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact Checking of Claims

no code implementations • IJCNLP 2019 • Isabelle Augenstein, Christina Lioma, Dongsheng Wang, Lucas Chaves Lima, Casper Hansen, Christian Hansen, Jakob Grue Simonsen

We contribute the largest publicly available dataset of naturally occurring factual claims for the purpose of automatic claim verification.

Claim Verification Fact Checking

Back to the Future - Sequential Alignment of Text Representations

1 code implementation • 8 Sep 2019 • Johannes Bjerva, Wouter Kouw, Isabelle Augenstein

In particular, language evolution causes data drift between time-steps in sequential decision-making tasks.

Decision Making Rumour Detection

Domain Transfer in Dialogue Systems without Turn-Level Supervision

1 code implementation • 16 Sep 2019 • Joachim Bingel, Victor Petrén Bach Hansen, Ana Valeria Gonzalez, Paweł Budzianowski, Isabelle Augenstein, Anders Søgaard

Task oriented dialogue systems rely heavily on specialized dialogue state tracking (DST) modules for dynamically predicting user intent throughout the conversation.

Dialogue State Tracking Task-Oriented Dialogue Systems

Retrieval-based Goal-Oriented Dialogue Generation

no code implementations • 30 Sep 2019 • Ana Valeria Gonzalez, Isabelle Augenstein, Anders Søgaard

Most research on dialogue has focused either on dialogue generation for open-ended chit chat or on state tracking for goal-directed dialogue.

Dialogue Generation Retrieval

Mapping (Dis-)Information Flow about the MH17 Plane Crash

1 code implementation • WS 2019 • Mareike Hartmann, Yevgeniy Golovchenko, Isabelle Augenstein

In this work, we examine to what extent text classifiers can be used to label data for subsequent content analysis; in particular, we focus on predicting pro-Russian and pro-Ukrainian Twitter content related to the MH17 plane crash.

Joint Emotion Label Space Modelling for Affect Lexica

no code implementations • 20 Nov 2019 • Luna De Bruyne, Pepa Atanasova, Isabelle Augenstein

Emotion lexica are commonly used resources to combat data poverty in automatic emotion detection.

Emotion Recognition valid

TX-Ray: Quantifying and Explaining Model-Knowledge Transfer in (Un-)Supervised NLP

2 code implementations • 2 Dec 2019 • Nils Rethmeier, Vageesh Kumar Saxena, Isabelle Augenstein

While state-of-the-art NLP explainability (XAI) methods focus on explaining per-sample decisions in supervised end or probing tasks, this is insufficient to explain and quantify model knowledge transfer during (un-)supervised training.

Explainable Artificial Intelligence (XAI) Model Compression +1

Claim Check-Worthiness Detection as Positive Unlabelled Learning

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Dustin Wright, Isabelle Augenstein

In applying this, we out-perform the state of the art in two of the three tasks studied for claim check-worthiness detection in English.

Fact Checking Rumour Detection +1

Zero-Shot Cross-Lingual Transfer with Meta Learning

1 code implementation • EMNLP 2020 • Farhad Nooralahzadeh, Giannis Bekoulis, Johannes Bjerva, Isabelle Augenstein

We show that this challenging setup can be approached using meta-learning, where, in addition to training a source language model, another model learns to select which training instances are the most beneficial to the first.

Few-Shot NLI Language Modelling +5

Generating Fact Checking Explanations

no code implementations • ACL 2020 • Pepa Atanasova, Jakob Grue Simonsen, Christina Lioma, Isabelle Augenstein

Most existing work on automated fact checking is concerned with predicting the veracity of claims based on metadata, social network spread, language used in claims, and, more recently, evidence supporting or denying claims.

Fact Checking Informativeness

SubjQA: A Dataset for Subjectivity and Review Comprehension

1 code implementation • EMNLP 2020 • Johannes Bjerva, Nikita Bhutani, Behzad Golshan, Wang-Chiew Tan, Isabelle Augenstein

We find that subjectivity is also an important feature in the case of QA, albeit with more intricate interactions between subjectivity and QA performance.

Question Answering Sentiment Analysis +1

2kenize: Tying Subword Sequences for Chinese Script Conversion

1 code implementation • ACL 2020 • Pranav A, Isabelle Augenstein

Simplified Chinese to Traditional Chinese character conversion is a common preprocessing step in Chinese NLP.

General Classification Topic Classification

Inducing Language-Agnostic Multilingual Representations

1 code implementation • Joint Conference on Lexical and Computational Semantics 2021 • Wei Zhao, Steffen Eger, Johannes Bjerva, Isabelle Augenstein

Cross-lingual representations have the potential to make NLP techniques available to the vast majority of languages in the world.

Sentence XLM-R +1

Multi-Hop Fact Checking of Political Claims

1 code implementation • 10 Sep 2020 • Wojciech Ostrowski, Arnav Arora, Pepa Atanasova, Isabelle Augenstein

We: 1) construct a small annotated dataset, PolitiHop, of evidence sentences for claim verification; 2) compare it to existing multi-hop datasets; and 3) study how to transfer knowledge from more extensive in- and out-of-domain resources to PolitiHop.

Claim Verification Fact Checking +1

Time-Aware Evidence Ranking for Fact-Checking

no code implementations • 10 Sep 2020 • Liesbeth Allein, Isabelle Augenstein, Marie-Francine Moens

Truth can vary over time.

Clustering Fact Checking +4

Transformer Based Multi-Source Domain Adaptation

1 code implementation • EMNLP 2020 • Dustin Wright, Isabelle Augenstein

Here, we investigate the problem of unsupervised multi-source domain adaptation, where a model is trained on labelled data from multiple source domains and must make predictions on a domain for which no labelled data has been seen.

Domain Adaptation

Generating Label Cohesive and Well-Formed Adversarial Claims

1 code implementation • EMNLP 2020 • Pepa Atanasova, Dustin Wright, Isabelle Augenstein

However, for inference tasks such as fact checking, these triggers often inadvertently invert the meaning of instances they are inserted in.

Fact Checking Language Modelling +2

A Diagnostic Study of Explainability Techniques for Text Classification

1 code implementation • EMNLP 2020 • Pepa Atanasova, Jakob Grue Simonsen, Christina Lioma, Isabelle Augenstein

Recent developments in machine learning have introduced models that approach human performance at the cost of increased architectural complexity.

General Classification text-classification +1

Self-supervised Contrastive Zero to Few-shot Learning from Small, Long-tailed Text data

no code implementations • 28 Sep 2020 • Nils Rethmeier, Isabelle Augenstein

We thus approach pretraining from a miniaturisation perspective, such as not to require massive external data sources and models, or learned translations from continuous input embeddings to discrete labels.

Few-Shot Learning Multi Label Text Classification +2

Data-Efficient Pretraining via Contrastive Self-Supervision

no code implementations • 2 Oct 2020 • Nils Rethmeier, Isabelle Augenstein

For natural language processing 'text-to-text' tasks, the prevailing approaches heavily rely on pretraining large self-supervised models on increasingly larger 'task-external' data.

Fairness Few-Shot Learning +3

Unsupervised Evaluation for Question Answering with Transformers

no code implementations • EMNLP (BlackboxNLP) 2020 • Lukas Muttenthaler, Isabelle Augenstein, Johannes Bjerva

We observe a consistent pattern in the answer representations, which we show can be used to automatically evaluate whether or not a predicted answer span is correct.

Question Answering

What Can We Do to Improve Peer Review in NLP?

no code implementations • Findings of the Association for Computational Linguistics 2020 • Anna Rogers, Isabelle Augenstein

Peer review is our best tool for judging the quality of conference submissions, but it is becoming increasingly spurious.

SIGTYP 2020 Shared Task: Prediction of Typological Features

no code implementations • EMNLP (SIGTYP) 2020 • Johannes Bjerva, Elizabeth Salesky, Sabrina J. Mielke, Aditi Chaudhary, Giuseppe G. A. Celano, Edoardo M. Ponti, Ekaterina Vylomova, Ryan Cotterell, Isabelle Augenstein

Typological knowledge bases (KBs) such as WALS (Dryer and Haspelmath, 2013) contain information about linguistic properties of the world's languages.

Cross-Lingual Transfer Transfer Learning

Multi-Sense Language Modelling

no code implementations • NAACL (DistCurate) 2022 • Andrea Lekkas, Peter Schneider-Kamp, Isabelle Augenstein

The effectiveness of a language model is influenced by its token representations, which must encode contextual information and handle the same word form having a plurality of meanings (polysemy).

Graph Attention Language Modelling +1

Longitudinal Citation Prediction using Temporal Graph Neural Networks

no code implementations • 10 Dec 2020 • Andreas Nugaard Holm, Barbara Plank, Dustin Wright, Isabelle Augenstein

Citation count prediction is the task of predicting the number of citations a paper has gained after a period of time.

Citation Prediction

Disembodied Machine Learning: On the Illusion of Objectivity in NLP

no code implementations • 28 Jan 2021 • Zeerak Waseem, Smarika Lulz, Joachim Bingel, Isabelle Augenstein

In this paper, we contextualise this discourse of bias in the ML community against the subjective choices in the development process.

BIG-bench Machine Learning

Does Typological Blinding Impede Cross-Lingual Sharing?

no code implementations • EACL 2021 • Johannes Bjerva, Isabelle Augenstein

Our hypothesis is that a model trained in a cross-lingual setting will pick up on typological cues from the input data, thus overshadowing the utility of explicitly using such features.

White Paper - Creating a Repository of Objectionable Online Content: Addressing Undesirable Biases and Ethical Considerations

no code implementations • 23 Feb 2021 • Thamar Solorio, Mahsa Shafaei, Christos Smailis, Isabelle Augenstein, Margaret Mitchell, Ingrid Stapf, Ioannis Kakadiaris

This white paper summarizes the authors' structured brainstorming regarding ethical considerations for creating an extensive repository of online content labeled with tags that describe potentially questionable content for young viewers.

A Primer on Contrastive Pretraining in Language Processing: Methods, Lessons Learned and Perspectives

no code implementations • 25 Feb 2021 • Nils Rethmeier, Isabelle Augenstein

Contrastive self-supervised training objectives enabled recent successes in image representation pretraining by learning to contrast input-input pairs of augmented images as either similar or dissimilar.

Contrastive Learning Language Modelling +4

A Survey on Stance Detection for Mis- and Disinformation Identification

no code implementations • Findings (NAACL) 2022 • Momchil Hardalov, Arnav Arora, Preslav Nakov, Isabelle Augenstein

Understanding attitudes expressed in texts, also known as stance detection, plays an important role in systems for detecting false information online, be it misinformation (unintentionally false) or disinformation (intentionally false information).

Fact Checking Misinformation +3

Detecting Harmful Content On Online Platforms: What Platforms Need Vs. Where Research Efforts Go

no code implementations • 27 Feb 2021 • Arnav Arora, Preslav Nakov, Momchil Hardalov, Sheikh Muhammad Sarwar, Vibha Nayak, Yoan Dinkov, Dimitrina Zlatkova, Kyle Dent, Ameya Bhatawdekar, Guillaume Bouchard, Isabelle Augenstein

The proliferation of harmful content on online platforms is a major societal problem, which comes in many different forms including hate speech, offensive language, bullying and harassment, misinformation, spam, violence, graphic content, sexual abuse, self harm, and many others.

Abusive Language Misinformation

University of Copenhagen Participation in TREC Health Misinformation Track 2020

no code implementations • 3 Mar 2021 • Lucas Chaves Lima, Dustin Brandon Wright, Isabelle Augenstein, Maria Maistro

Our approach consists of 3 steps: (1) we create an initial run with BM25 and RM3; (2) we estimate credibility and misinformation scores for the documents in the initial run; (3) we merge the relevance, credibility and misinformation scores to re-rank documents in the initial run.

Language Modelling Misinformation +1
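
The three-step approach in the snippet above can be illustrated with a minimal sketch of the final merge-and-re-rank step. The snippet does not say how the three scores are merged, so the weighted linear combination below is an assumption, as are the names `ScoredDoc` and `rerank`, the weights, and the toy scores (all assumed to be normalised to [0, 1]).

```python
from dataclasses import dataclass


@dataclass
class ScoredDoc:
    doc_id: str
    relevance: float       # hypothetical normalised score from the initial run
    credibility: float     # hypothetical: higher means more credible
    misinformation: float  # hypothetical: higher means more likely misinformation


def rerank(docs, w_rel=0.5, w_cred=0.25, w_mis=0.25):
    """Re-rank an initial run by a merged score (assumed linear combination)."""
    def merged(d):
        # Reward relevance and credibility, penalise misinformation.
        return w_rel * d.relevance + w_cred * d.credibility - w_mis * d.misinformation

    return sorted(docs, key=merged, reverse=True)


# Toy initial run; the values are made up for illustration only.
initial_run = [
    ScoredDoc("d1", relevance=0.9, credibility=0.2, misinformation=0.8),
    ScoredDoc("d2", relevance=0.7, credibility=0.9, misinformation=0.1),
    ScoredDoc("d3", relevance=0.5, credibility=0.5, misinformation=0.5),
]
reranked_ids = [d.doc_id for d in rerank(initial_run)]
```

With these illustrative weights, the most relevant but least credible document (`d1`) drops below `d2`, which is the kind of reordering a merged score is meant to produce.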

A Neighbourhood Framework for Resource-Lean Content Flagging

no code implementations • 31 Mar 2021 • Sheikh Muhammad Sarwar, Dimitrina Zlatkova, Momchil Hardalov, Yoan Dinkov, Isabelle Augenstein, Preslav Nakov

The framework is based on a nearest-neighbour architecture.

Abusive Language

Quantifying Gender Bias Towards Politicians in Cross-Lingual Language Models

1 code implementation • 15 Apr 2021 • Karolina Stańczak, Sagnik Ray Choudhury, Tiago Pimentel, Ryan Cotterell, Isabelle Augenstein

Recent research has demonstrated that large pre-trained language models reflect societal biases expressed in natural language.

Language Modelling Probing Language Models

Cross-Domain Label-Adaptive Stance Detection

1 code implementation • EMNLP 2021 • Momchil Hardalov, Arnav Arora, Preslav Nakov, Isabelle Augenstein

In this paper, we perform an in-depth analysis of 16 stance detection datasets, and we explore the possibility for cross-domain learning from them.

Domain Adaptation Stance Detection

CiteWorth: Cite-Worthiness Detection for Improved Scientific Document Understanding

1 code implementation • Findings (ACL) 2021 • Dustin Wright, Isabelle Augenstein

Scientific document understanding is challenging as the data is highly domain specific and diverse.

document understanding Domain Adaptation +2

Determining the Credibility of Science Communication

no code implementations • NAACL (sdp) 2021 • Isabelle Augenstein

Most work on scholarly document processing assumes that the information processed is trustworthy and factually correct.

Is Sparse Attention more Interpretable?

no code implementations • ACL 2021 • Clara Meister, Stefan Lazov, Isabelle Augenstein, Ryan Cotterell

Sparse attention has been claimed to increase model interpretability under the assumption that it highlights influential inputs.

text-classification Text Classification

QA Dataset Explosion: A Taxonomy of NLP Resources for Question Answering and Reading Comprehension

no code implementations • 27 Jul 2021 • Anna Rogers, Matt Gardner, Isabelle Augenstein

Alongside huge volumes of research on deep learning models in NLP in recent years, there has also been much work on benchmark datasets needed to track modeling progress.

Question Answering Reading Comprehension

Towards Explainable Fact Checking

no code implementations • 23 Aug 2021 • Isabelle Augenstein

This development has spurred research in the area of automatic fact checking, from approaches to detect check-worthy claims and determining the stance of tweets towards claims, to methods to determine the veracity of claims given evidence documents.

Decision Making Fact Checking +2

Semi-Supervised Exaggeration Detection of Health Science Press Releases

1 code implementation • EMNLP 2021 • Dustin Wright, Isabelle Augenstein

Given this, we present a formalization of and study into the problem of exaggeration detection in science communication.

Benchmarking Few-Shot Learning

Diagnostics-Guided Explanation Generation

no code implementations • 8 Sep 2021 • Pepa Atanasova, Jakob Grue Simonsen, Christina Lioma, Isabelle Augenstein

When such annotations are not available, explanations are often selected as those portions of the input that maximise a downstream task's performance, which corresponds to optimising an explanation's Faithfulness to a given model.

Explanation Generation Sentence

Few-Shot Cross-Lingual Stance Detection with Sentiment-Based Pre-Training

1 code implementation • 13 Sep 2021 • Momchil Hardalov, Arnav Arora, Preslav Nakov, Isabelle Augenstein

Most research in stance detection, however, has been limited to working with a single language and on a few limited targets, with little work on cross-lingual stance detection.

Stance Detection

Can Edge Probing Tasks Reveal Linguistic Knowledge in QA Models?

no code implementations • 15 Sep 2021 • Sagnik Ray Choudhury, Nikita Bhutani, Isabelle Augenstein

We find that EP test results do not change significantly when the fine-tuned model performs well or in adversarial situations where the model is forced to learn wrong correlations.

Question Answering

Generating Fluent Fact Checking Explanations with Unsupervised Post-Editing

no code implementations • 13 Dec 2021 • Shailza Jolly, Pepa Atanasova, Isabelle Augenstein

In addition, we show the applicability of our approach in a completely unsupervised setting.

Explanation Generation Extractive Summarization +2

Quantifying Gender Biases Towards Politicians on Reddit

1 code implementation • 22 Dec 2021 • Sara Marjanovic, Karolina Stańczak, Isabelle Augenstein

Rather than overt hostile or benevolent sexism, the results of the nominal and lexical analyses suggest this interest is not as professional or respectful as that expressed about male politicians.

Bias Detection Gender Bias Detection

A Survey on Gender Bias in Natural Language Processing

no code implementations • 28 Dec 2021 • Karolina Stańczak, Isabelle Augenstein

3) Despite a myriad of papers on gender bias in NLP methods, we find that most of the newly developed algorithms do not test their models for bias and disregard possible ethical considerations of their work.

A Latent-Variable Model for Intrinsic Probing

2 code implementations • 20 Jan 2022 • Karolina Stańczak, Lucas Torroba Hennigen, Adina Williams, Ryan Cotterell, Isabelle Augenstein

The success of pre-trained contextualized representations has prompted researchers to analyze them for the presence of linguistic information.

Attribute

Paper
Code

Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings

1 code implementation • 14 Feb 2022 • Malte Ostendorff, Nils Rethmeier, Isabelle Augenstein, Bela Gipp, Georg Rehm

Learning scientific document representations can be substantially improved through contrastive learning objectives, where the challenge lies in creating positive and negative training samples that encode the desired similarity semantics.

Ranked #1 on Document Classification on SciDocs (MeSH)

Citation Prediction Contrastive Learning +3

Paper
Code

Generating Scientific Claims for Zero-Shot Scientific Fact Checking

1 code implementation • ACL 2022 • Dustin Wright, David Wadden, Kyle Lo, Bailey Kuehl, Arman Cohan, Isabelle Augenstein, Lucy Lu Wang

To address this challenge, we propose scientific claim generation, the task of generating one or more atomic and verifiable claims from scientific sentences, and demonstrate its usefulness in zero-shot fact checking for biomedical claims.

Fact Checking Negation

Paper
Code

Probing Pre-Trained Language Models for Cross-Cultural Differences in Values

1 code implementation • 25 Mar 2022 • Arnav Arora, Lucie-Aimée Kaffee, Isabelle Augenstein

In this paper, we introduce probes to study which values across cultures are embedded in these models, and whether they align with existing theories and cross-cultural value surveys.

Paper
Code

Fact Checking with Insufficient Evidence

no code implementations • 5 Apr 2022 • Pepa Atanasova, Jakob Grue Simonsen, Christina Lioma, Isabelle Augenstein

To this end, we are the first to study what information FC models consider sufficient by introducing a novel task and advancing it with three main contributions.

Data Augmentation Fact Checking +2

Paper
Add Code

Same Neurons, Different Languages: Probing Morphosyntax in Multilingual Pre-trained Models

1 code implementation • NAACL 2022 • Karolina Stańczak, Edoardo Ponti, Lucas Torroba Hennigen, Ryan Cotterell, Isabelle Augenstein

The success of multilingual pre-trained models is underpinned by their ability to learn representations shared by multiple languages even in absence of any explicit supervision.

Paper
Code

Counterfactually Augmented Data and Unintended Bias: The Case of Sexism and Hate Speech Detection

no code implementations • NAACL 2022 • Indira Sen, Mattia Samory, Claudia Wagner, Isabelle Augenstein

Especially, construct-driven CAD -- perturbations of core features -- may induce models to ignore the context in which core features are used.

Hate Speech Detection

Paper
Add Code

Machine Reading, Fast and Slow: When Do Models "Understand" Language?

no code implementations • 15 Sep 2022 • Sagnik Ray Choudhury, Anna Rogers, Isabelle Augenstein

Two of the most fundamental challenges in Natural Language Understanding (NLU) at present are: (a) how to establish whether deep learning-based models score highly on NLU benchmarks for the 'right' reasons; and (b) to understand what those reasons would even be.

coreference-resolution counterfactual +2

Paper
Add Code

Modeling Information Change in Science Communication with Semantically Matched Paraphrases

no code implementations • 24 Oct 2022 • Dustin Wright, Jiaxin Pei, David Jurgens, Isabelle Augenstein

Whether the media faithfully communicate scientific information has long been a core issue to the science community.

Fact Checking Retrieval

Paper
Add Code

Revisiting Softmax for Uncertainty Approximation in Text Classification

no code implementations • 25 Oct 2022 • Andreas Nugaard Holm, Dustin Wright, Isabelle Augenstein

A cheaper alternative is to simply use the softmax based on a single forward pass without dropout to estimate model uncertainty.

Domain Adaptation text-classification +1

Paper
Add Code

Multi-View Knowledge Distillation from Crowd Annotations for Out-of-Domain Generalization

no code implementations • 19 Dec 2022 • Dustin Wright, Isabelle Augenstein

Selecting an effective training signal for tasks in natural language processing is difficult: expert annotations are expensive, and crowd-sourced annotations may not be reliable.

Domain Generalization Knowledge Distillation

Paper
Add Code

TempEL: Linking Dynamically Evolving and Newly Emerging Entities

1 code implementation • 5 Feb 2023 • Klim Zaporojets, Lucie-Aimée Kaffee, Johannes Deleu, Thomas Demeester, Chris Develder, Isabelle Augenstein

For that study, we introduce TempEL, an entity linking dataset that consists of time-stratified English Wikipedia snapshots from 2013 to 2022, from which we collect both anchor mentions of entities, and these target entities' descriptions.

Entity Disambiguation Entity Linking

Paper
Code

Measuring Gender Bias in West Slavic Language Models

no code implementations • 12 Apr 2023 • Sandra Martinková, Karolina Stańczak, Isabelle Augenstein

Perhaps surprisingly, Czech, Slovak, and Polish language models produce more hurtful completions with men as subjects, which, upon inspection, we find is due to completions being related to violence, death, and sickness.

Language Modelling

Paper
Add Code

Thorny Roses: Investigating the Dual Use Dilemma in Natural Language Processing

1 code implementation • 17 Apr 2023 • Lucie-Aimée Kaffee, Arnav Arora, Zeerak Talat, Isabelle Augenstein

Dual use, the intentional, harmful reuse of technology and scientific artefacts, is a problem yet to be well-defined within the context of Natural Language Processing (NLP).

Ethics

Paper
Code

Multilingual Event Extraction from Historical Newspaper Adverts

1 code implementation • 18 May 2023 • Nadav Borenstein, Natália da Silva Perez, Isabelle Augenstein

We find that: 1) even with scarce annotated data, it is possible to achieve surprisingly good results by formulating the problem as an extractive QA task and leveraging existing datasets and models for modern languages; and 2) cross-lingual low-resource learning for historical languages is highly challenging, and machine translation of the historical datasets to the considered target languages is, in practice, often the best-performing solution.

Event Extraction Machine Translation

Paper
Code

Measuring Intersectional Biases in Historical Documents

1 code implementation • 21 May 2023 • Nadav Borenstein, Karolina Stańczak, Thea Rolskov, Natália da Silva Perez, Natacha Klein Käfer, Isabelle Augenstein

We find that there is a trade-off between the stability of the word embeddings and their compatibility with the historical dataset.

Optical Character Recognition Optical Character Recognition (OCR) +1

Paper
Code

Faithfulness Tests for Natural Language Explanations

1 code implementation • 29 May 2023 • Pepa Atanasova, Oana-Maria Camburu, Christina Lioma, Thomas Lukasiewicz, Jakob Grue Simonsen, Isabelle Augenstein

Explanations of neural models aim to reveal a model's decision-making process for its predictions.

counterfactual Decision Making

Paper
Code

Topic-Guided Sampling For Data-Efficient Multi-Domain Stance Detection

2 code implementations • 1 Jun 2023 • Erik Arakelyan, Arnav Arora, Isabelle Augenstein

The results show that our method outperforms the state-of-the-art with an average of $3.5$ F1 points increase in-domain, and is more generalizable with an averaged increase of $10.2$ F1 on out-of-domain evaluation while using $\leq10\%$ of the training data.

Ranked #1 on Stance Detection on mtsd

Contrastive Learning Domain Adaptation +1

Paper
Code

Factuality Challenges in the Era of Large Language Models

no code implementations • 8 Oct 2023 • Isabelle Augenstein, Timothy Baldwin, Meeyoung Cha, Tanmoy Chakraborty, Giovanni Luca Ciampaglia, David Corney, Renee DiResta, Emilio Ferrara, Scott Hale, Alon Halevy, Eduard Hovy, Heng Ji, Filippo Menczer, Ruben Miguez, Preslav Nakov, Dietram Scheufele, Shivam Sharma, Giovanni Zagni

The emergence of tools based on Large Language Models (LLMs), such as OpenAI's ChatGPT, Microsoft's Bing Chat, and Google's Bard, has garnered immense public attention.

Text Generation

Paper
Add Code

Why Should This Article Be Deleted? Transparent Stance Detection in Multilingual Wikipedia Editor Discussions

1 code implementation • 9 Oct 2023 • Lucie-Aimée Kaffee, Arnav Arora, Isabelle Augenstein

The moderation of content on online platforms is usually non-transparent.

Decision Making Stance Detection

Paper
Code

Explaining Interactions Between Text Spans

1 code implementation • 20 Oct 2023 • Sagnik Ray Choudhury, Pepa Atanasova, Isabelle Augenstein

Reasoning over spans of tokens from different parts of the input is essential for natural language understanding (NLU) tasks such as fact-checking (FC), machine reading comprehension (MRC) or natural language inference (NLI).

Community Detection Decision Making +6

Paper
Code

PHD: Pixel-Based Language Modeling of Historical Documents

1 code implementation • 22 Oct 2023 • Nadav Borenstein, Phillip Rust, Desmond Elliott, Isabelle Augenstein

We then pre-train our model, PHD, on a combination of synthetic scans and real historical newspapers from the 1700-1900 period.

Language Modelling Optical Character Recognition (OCR)

Paper
Code

People Make Better Edits: Measuring the Efficacy of LLM-Generated Counterfactually Augmented Data for Harmful Language Detection

1 code implementation • 2 Nov 2023 • Indira Sen, Dennis Assenmacher, Mattia Samory, Isabelle Augenstein, Wil van der Aalst, Claudia Wagner

CADs introduce minimal changes to existing training data points and flip their labels; training on them may reduce model dependency on spurious features.

Data Augmentation

Paper
Code

Social Bias Probing: Fairness Benchmarking for Language Models

no code implementations • 15 Nov 2023 • Marta Marchiori Manerba, Karolina Stańczak, Riccardo Guidotti, Isabelle Augenstein

While the impact of these biases has been recognized, prior methods for bias evaluation have been limited to binary association tests on small datasets, offering a constrained view of the nature of societal biases within language models.

Benchmarking Fairness +1

Paper
Add Code

Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers

1 code implementation • 15 Nov 2023 • Yuxia Wang, Revanth Gangi Reddy, Zain Muhammad Mujahid, Arnav Arora, Aleksandr Rubashevskii, Jiahui Geng, Osama Mohammed Afzal, Liangming Pan, Nadav Borenstein, Aditya Pillai, Isabelle Augenstein, Iryna Gurevych, Preslav Nakov

The increased use of large language models (LLMs) across a variety of real-world applications calls for mechanisms to verify the factual accuracy of their outputs.

Fact Checking Sentence

Paper
Code

Grammatical Gender's Influence on Distributional Semantics: A Causal Perspective

no code implementations • 30 Nov 2023 • Karolina Stańczak, Kevin Du, Adina Williams, Isabelle Augenstein, Ryan Cotterell

However, when we control for the meaning of the noun, we find that grammatical gender has a near-zero effect on adjective choice, thereby calling the neo-Whorfian hypothesis into question.

Paper
Add Code

Semantic Sensitivities and Inconsistent Predictions: Measuring the Fragility of NLI Models

no code implementations • 25 Jan 2024 • Erik Arakelyan, Zhaoqi Liu, Isabelle Augenstein

We systematically study the effects of the phenomenon across NLI models for in- and out-of-domain settings.

Conditional Text Generation Natural Language Inference +1

Paper
Add Code

Understanding Fine-grained Distortions in Reports of Scientific Findings

no code implementations • 19 Feb 2024 • Amelie Wührl, Dustin Wright, Roman Klinger, Isabelle Augenstein

Distorted science communication harms individuals and society as it can lead to unhealthy behavior change and decrease trust in scientific institutions.

Paper
Add Code

Investigating the Impact of Model Instability on Explanations and Uncertainty

no code implementations • 20 Feb 2024 • Sara Vera Marjanović, Isabelle Augenstein, Christina Lioma

In this large-scale empirical study, we insert different levels of noise perturbations and measure the effect on the output of pre-trained language models and different uncertainty metrics.

Paper
Add Code

Can Edge Probing Tests Reveal Linguistic Knowledge in QA Models?

no code implementations • COLING 2022 • Sagnik Ray Choudhury, Nikita Bhutani, Isabelle Augenstein

We find that EP test results do not change significantly when the fine-tuned model performs well or in adversarial situations where the model is forced to learn wrong correlations.

Question Answering

Paper
Add Code

Machine Reading, Fast and Slow: When Do Models “Understand” Language?

no code implementations • COLING 2022 • Sagnik Ray Choudhury, Anna Rogers, Isabelle Augenstein

Two of the most fundamental issues in Natural Language Understanding (NLU) at present are: (a) how it can be established whether deep learning-based models score highly on NLU benchmarks for the “right” reasons; and (b) what those reasons would even be.

coreference-resolution counterfactual +2

Paper
Add Code

Multi3Generation: Multitask, Multilingual, Multimodal Language Generation

no code implementations • EAMT 2022 • Anabela Barreiro, José GC de Souza, Albert Gatt, Mehul Bhatt, Elena Lloret, Aykut Erdem, Dimitra Gkatzia, Helena Moniz, Irene Russo, Fabio Kepler, Iacer Calixto, Marcin Paprzycki, François Portet, Isabelle Augenstein, Mirela Alhasani

This paper presents the Multitask, Multilingual, Multimodal Language Generation COST Action – Multi3Generation (CA18231), an interdisciplinary network of research groups working on different aspects of language generation.

Text Generation

Paper
Add Code
