Search Results for author: Kalina Bontcheva

Found 66 papers, 15 papers with code

EUvsDisinfo: a Dataset for Multilingual Detection of Pro-Kremlin Disinformation in News Articles

1 code implementation • 18 Jun 2024 • João A. Leite, Olesya Razuvayevskaya, Kalina Bontcheva, Carolina Scarton

Lastly, we demonstrate the dataset's applicability in training models to effectively distinguish between disinformation and trustworthy content in multilingual settings.

Addressing Topic Granularity and Hallucination in Large Language Models for Topic Modelling

1 code implementation • 1 May 2024 • Yida Mu, Peizhen Bai, Kalina Bontcheva, Xingyi Song

In this paper, we focus on addressing the issues of topic granularity and hallucinations for better LLM-based topic modelling.

Hallucination Topic Classification

Large Language Models Offer an Alternative to the Traditional Approach of Topic Modelling

no code implementations • 24 Mar 2024 • Yida Mu, Chun Dong, Kalina Bontcheva, Xingyi Song

Topic modelling, as a well-established unsupervised technique, has found extensive use in automatically detecting significant topics within a corpus of documents.

Lying Blindly: Bypassing ChatGPT's Safeguards to Generate Hard-to-Detect Disinformation Claims at Scale

no code implementations • 13 Feb 2024 • Freddy Heppell, Mehmet E. Bakir, Kalina Bontcheva

As Large Language Models (LLMs) become more proficient, their misuse in large-scale viral disinformation campaigns is a growing concern.

Don't Waste a Single Annotation: Improving Single-Label Classifiers Through Soft Labels

no code implementations • 9 Nov 2023 • Ben Wu, Yue Li, Yida Mu, Carolina Scarton, Kalina Bontcheva, Xingyi Song

In this paper, we address the limitations of the common data annotation and training methods for objective single-label classification tasks.

Analysing State-Backed Propaganda Websites: a New Dataset and Linguistic Study

1 code implementation • 21 Oct 2023 • Freddy Heppell, Kalina Bontcheva, Carolina Scarton

This paper analyses two hitherto unstudied sites sharing state-backed disinformation, Reliable Recent News (rrn.world) and WarOnFakes (waronfakes.com), which publish content in Arabic, Chinese, English, French, German, and Spanish.

Examining Temporal Bias in Abusive Language Detection

no code implementations • 25 Sep 2023 • Mali Jin, Yida Mu, Diana Maynard, Kalina Bontcheva

The use of abusive language online has become an increasingly pervasive problem that damages both individuals and society, with effects ranging from psychological harm right through to escalation to real-life violence and even death.

Abusive Language

Examining the Limitations of Computational Rumor Detection Models Trained on Static Datasets

no code implementations • 20 Sep 2023 • Yida Mu, Xingyi Song, Kalina Bontcheva, Nikolaos Aletras

A crucial aspect of a rumor detection model is its ability to generalize, particularly its ability to detect emerging, previously unknown rumors.

Detecting Misinformation with LLM-Predicted Credibility Signals and Weak Supervision

no code implementations • 14 Sep 2023 • João A. Leite, Olesya Razuvayevskaya, Kalina Bontcheva, Carolina Scarton

Credibility signals represent a wide range of heuristics that are typically used by journalists and fact-checkers to assess the veracity of online content.


Navigating Prompt Complexity for Zero-Shot Classification: A Study of Large Language Models in Computational Social Science

no code implementations • 23 May 2023 • Yida Mu, Ben P. Wu, William Thorne, Ambrose Robinson, Nikolaos Aletras, Carolina Scarton, Kalina Bontcheva, Xingyi Song

Instruction-tuned Large Language Models (LLMs) have exhibited impressive language understanding and the capacity to generate responses that follow specific prompts.

Zero-Shot Learning

Examining Temporalities on Stance Detection towards COVID-19 Vaccination

no code implementations • 10 Apr 2023 • Yida Mu, Mali Jin, Kalina Bontcheva, Xingyi Song

It is crucial for policymakers to have a comprehensive understanding of the public's stance towards vaccination on a large scale.

Stance Classification Stance Detection

A Large-Scale Comparative Study of Accurate COVID-19 Information versus Misinformation

no code implementations • 10 Apr 2023 • Yida Mu, Ye Jiang, Freddy Heppell, Iknoor Singh, Carolina Scarton, Kalina Bontcheva, Xingyi Song

This motivated us to carry out a comparative study of the characteristics of COVID-19 misinformation versus those of accurate COVID-19 information through a large-scale computational analysis of over 242 million tweets.


SheffieldVeraAI at SemEval-2023 Task 3: Mono and multilingual approaches for news genre, topic and persuasion technique classification

1 code implementation • 16 Mar 2023 • Ben Wu, Olesya Razuvayevskaya, Freddy Heppell, João A. Leite, Carolina Scarton, Kalina Bontcheva, Xingyi Song

For Subtask 2 (Framing), we achieved first place in 3 languages, and the best average rank across all the languages, by using two separate ensembles: a monolingual RoBERTa-MUPPET-LARGE and an ensemble of XLM-RoBERTa-LARGE with adapters and task adaptive pretraining.

VaxxHesitancy: A Dataset for Studying Hesitancy towards COVID-19 Vaccination on Twitter

1 code implementation • 17 Jan 2023 • Yida Mu, Mali Jin, Charlie Grimshaw, Carolina Scarton, Kalina Bontcheva, Xingyi Song

Annotated data is also necessary for training data-driven models for more nuanced analysis of attitudes towards vaccination.

Language Modelling

On the Impact of Temporal Concept Drift on Model Explanations

1 code implementation • 17 Oct 2022 • Zhixue Zhao, George Chrysostomou, Kalina Bontcheva, Nikolaos Aletras

Explanation faithfulness of model predictions in natural language processing is typically evaluated on held-out data from the same temporal distribution as the training data (i.e. synchronous settings).

Text Classification

Classifying COVID-19 vaccine narratives

no code implementations • 18 Jul 2022 • Yue Li, Carolina Scarton, Xingyi Song, Kalina Bontcheva

This paper addresses the need for monitoring and analysing vaccine narratives online by introducing a novel vaccine narrative classification task, which categorises COVID-19 vaccine claims into one of seven categories.

Data Augmentation

Categorising Fine-to-Coarse Grained Misinformation: An Empirical Study of COVID-19 Infodemic

no code implementations • 22 Jun 2021 • Ye Jiang, Xingyi Song, Carolina Scarton, Ahmet Aker, Kalina Bontcheva

In this paper, we introduce a fine-grained annotated misinformation tweets dataset including social behaviours annotation (e.g. comment or question to the misinformation).


MP Twitter Engagement and Abuse Post-first COVID-19 Lockdown in the UK: White Paper

no code implementations • 4 Mar 2021 • Tracie Farrell, Mehmet Bakir, Kalina Bontcheva

This work covers the period of June to December 2020 and analyses Twitter abuse in replies to UK MPs.

Multistage BiCross encoder for multilingual access to COVID-19 health information

1 code implementation • 8 Jan 2021 • Iknoor Singh, Carolina Scarton, Kalina Bontcheva

The Coronavirus (COVID-19) pandemic has led to a rapidly growing 'infodemic' of health information online.


Classification Aware Neural Topic Model and its Application on a New COVID-19 Disinformation Corpus

no code implementations • 5 Jun 2020 • Xingyi Song, Johann Petrak, Ye Jiang, Iknoor Singh, Diana Maynard, Kalina Bontcheva

The explosion of disinformation accompanying the COVID-19 pandemic has overloaded fact-checkers and media worldwide, and brought a new major challenge to government responses worldwide.

Fact Checking General Classification

Using Deep Neural Networks with Intra- and Inter-Sentence Context to Classify Suicidal Behaviour

no code implementations • LREC 2020 • Xingyi Song, Johnny Downs, Sumithra Velupillai, Rachel Holden, Maxim Kikoler, Kalina Bontcheva, Rina Dutta, Angus Roberts

Identifying statements related to suicidal behaviour in psychiatric electronic health records (EHRs) is an important step when modeling that behaviour, and when assessing suicide risk.

Classification General Classification +1

Journalist-in-the-Loop: Continuous Learning as a Service for Rumour Analysis

no code implementations • IJCNLP 2019 • Twin Karmakharm, Nikolaos Aletras, Kalina Bontcheva

The system features a rumour annotation service that allows journalists to easily provide feedback for a given social media post through a web-based interface.

Rumour Detection

The evolution of argumentation mining: From models to social media and emerging tools

no code implementations • 4 Jul 2019 • Anastasios Lytos, Thomas Lagkas, Panagiotis Sarigiannidis, Kalina Bontcheva

In this survey article, we bridge the gap between theoretical approaches of argumentation mining and pragmatic schemes that satisfy the needs of social media generated data, recognizing the need for adapting more flexible and expandable schemes, capable to adjust to the argumentation conditions that exist in social media.

Helping Crisis Responders Find the Informative Needle in the Tweet Haystack

1 code implementation • 29 Jan 2018 • Leon Derczynski, Kenny Meesters, Kalina Bontcheva, Diana Maynard

Messages are filtered for informativeness based on a definition of the concept drawn from prior research and crisis response experts.

General Classification Informativeness

Discourse-Aware Rumour Stance Classification in Social Media Using Sequential Classifiers

no code implementations • 6 Dec 2017 • Arkaitz Zubiaga, Elena Kochkina, Maria Liakata, Rob Procter, Michal Lukasik, Kalina Bontcheva, Trevor Cohn, Isabelle Augenstein

We show that sequential classifiers that exploit the use of discourse properties in social media conversations while using only local features, outperform non-sequential classifiers.

General Classification Stance Classification

Automatic Summarization of Online Debates

no code implementations • RANLP 2017 • Nattapong Sanchan, Ahmet Aker, Kalina Bontcheva

In our work, we investigate two different clustering approaches for the generation of the summaries.

Clustering Text Summarization

Detection and Resolution of Rumours in Social Media: A Survey

no code implementations • 3 Apr 2017 • Arkaitz Zubiaga, Ahmet Aker, Kalina Bontcheva, Maria Liakata, Rob Procter

Despite the increasing use of social media platforms for information and news gathering, its unmoderated nature often leads to the emergence and spread of rumours, i.e. pieces of information that are unverified at the time of posting.

Classification General Classification +3

Generalisation in Named Entity Recognition: A Quantitative Analysis

no code implementations • 11 Jan 2017 • Isabelle Augenstein, Leon Derczynski, Kalina Bontcheva

Unseen NEs, in particular, play an important role, which have a higher incidence in diverse genres such as social media than in more regular genres such as newswire.

Diversity named-entity-recognition +2

Broad Twitter Corpus: A Diverse Named Entity Recognition Resource

no code implementations • COLING 2016 • Leon Derczynski, Kalina Bontcheva, Ian Roberts

One of the main obstacles, hampering method development and comparative evaluation of named entity recognition in social media, is the lack of a sizeable, diverse, high quality annotated corpus, analogous to the CoNLL'2003 news dataset.

Diversity named-entity-recognition +2

Stance Detection with Bidirectional Conditional Encoding

1 code implementation • EMNLP 2016 • Isabelle Augenstein, Tim Rocktäschel, Andreas Vlachos, Kalina Bontcheva

Stance detection is the task of classifying the attitude expressed in a text towards a target such as Hillary Clinton to be "positive", "negative" or "neutral".

Stance Detection

Monolingual Social Media Datasets for Detecting Contradiction and Entailment

no code implementations • LREC 2016 • Piroska Lendvai, Isabelle Augenstein, Kalina Bontcheva, Thierry Declerck

Entailment recognition approaches are useful for application domains such as information extraction, question answering or summarisation, for which evidence from multiple sentences needs to be combined.

Natural Language Inference Question Answering +1

Challenges of Evaluating Sentiment Analysis Tools on Social Media

no code implementations • LREC 2016 • Diana Maynard, Kalina Bontcheva

This paper discusses the challenges in carrying out fair comparative evaluations of sentiment analysis systems.

Sentiment Analysis

USFD: Twitter NER with Drift Compensation and Linked Data

no code implementations • WS 2015 • Leon Derczynski, Isabelle Augenstein, Kalina Bontcheva

This paper describes a pilot NER system for Twitter, comprising the USFD system entry to the W-NUT 2015 NER shared task.

Clustering NER

Analysis of Named Entity Recognition and Linking for Tweets

no code implementations • 27 Oct 2014 • Leon Derczynski, Diana Maynard, Giuseppe Rizzo, Marieke van Erp, Genevieve Gorrell, Raphaël Troncy, Johann Petrak, Kalina Bontcheva

Applying natural language processing for mining and intelligent information access to tweets (a form of microblog) is a challenging, emerging research area.

Entity Disambiguation Language Identification +4

Corpus Annotation through Crowdsourcing: Towards Best Practice Guidelines

no code implementations • LREC 2014 • Marta Sabou, Kalina Bontcheva, Leon Derczynski, Arno Scharl

Crowdsourcing is an emerging collaborative approach that can be used for the acquisition of annotated corpora and a wide range of other linguistic resources.

Domain Adaptation Natural Language Inference +3
