Search Results for author: Preslav Nakov

Found 226 papers, 73 papers with code

COVID-19 in Bulgarian Social Media: Factuality, Harmfulness, Propaganda, and Framing

1 code implementation RANLP 2021 Preslav Nakov, Firoj Alam, Shaden Shaar, Giovanni Da San Martino, Yifan Zhang

With the emergence of the COVID-19 pandemic, the political and the medical aspects of disinformation merged as the problem got elevated to a whole new level to become the first global infodemic.

Large Language Models are Few-Shot Training Example Generators: A Case Study in Fallacy Recognition

no code implementations 16 Nov 2023 Tariq Alhindi, Smaranda Muresan, Preslav Nakov

In this study, we aim to enhance existing models for fallacy recognition by incorporating additional context and by leveraging large language models to generate synthetic data, thus increasing the representation of the infrequent classes.

A Survey of Language Model Confidence Estimation and Calibration

no code implementations 14 Nov 2023 Jiahui Geng, Fengyu Cai, Yuxia Wang, Heinz Koeppl, Preslav Nakov, Iryna Gurevych

In particular, we discuss methods and techniques for LM confidence estimation and calibration, encompassing different LMs and various tasks.

Language Modelling

A Template Is All You Meme

1 code implementation 11 Nov 2023 Luke Bates, Peter Ebert Christensen, Preslav Nakov, Iryna Gurevych

Here, to aid understanding of memes, we release a knowledge base of memes and information found on www.knowyourmeme.com, which we call the Know Your Meme Knowledge Base (KYMKB), composed of more than 54,000 images.

ArAIEval Shared Task: Persuasion Techniques and Disinformation Detection in Arabic Text

no code implementations 6 Nov 2023 Maram Hasanain, Firoj Alam, Hamdy Mubarak, Samir Abdaljalil, Wajdi Zaghouani, Preslav Nakov, Giovanni Da San Martino, Abed Alhakim Freihat

We present an overview of the ArAIEval shared task, organized as part of the first ArabicNLP 2023 conference co-located with EMNLP 2023.

Adapting Fake News Detection to the Era of Large Language Models

no code implementations 2 Nov 2023 Jinyan Su, Claire Cardie, Preslav Nakov

With the proliferation of both human-written and machine-generated real and fake news, robustly and effectively discerning the veracity of news articles has become an intricate challenge.

Fake News Detection

Lost in Translation, Found in Spans: Identifying Claims in Multilingual Social Media

1 code implementation 27 Oct 2023 Shubham Mittal, Megha Sundriyal, Preslav Nakov

Claim span identification (CSI) is an important step in fact-checking pipelines, aiming to identify text segments that contain a checkworthy claim or assertion in a social media post.

Cross-Lingual Transfer Fact Checking +1

From Chaos to Clarity: Claim Normalization to Empower Fact-Checking

1 code implementation 22 Oct 2023 Megha Sundriyal, Tanmoy Chakraborty, Preslav Nakov

To evaluate the effectiveness of our proposed model, we meticulously compile a comprehensive real-world dataset, CLAN, comprising more than 6k instances of social media posts alongside their respective normalized claims.

Fact Checking

QACHECK: A Demonstration System for Question-Guided Multi-Hop Fact-Checking

1 code implementation 11 Oct 2023 Liangming Pan, Xinyuan Lu, Min-Yen Kan, Preslav Nakov

Fact-checking real-world claims often requires complex, multi-step reasoning due to the absence of direct evidence to support or refute them.

Decision Making Fact Checking +1

Rethinking STS and NLI in Large Language Models

no code implementations 16 Sep 2023 Yuxia Wang, Minghan Wang, Preslav Nakov

In this study, we aim to rethink STS and NLI in the era of large language models (LLMs).


Fake News Detectors are Biased against Texts Generated by Large Language Models

no code implementations 15 Sep 2023 Jinyan Su, Terry Yue Zhuo, Jonibek Mansurov, Di Wang, Preslav Nakov

The spread of fake news has emerged as a critical challenge, undermining trust and posing threats to society.


Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs

1 code implementation 25 Aug 2023 Yuxia Wang, Haonan Li, Xudong Han, Preslav Nakov, Timothy Baldwin

With the rapid evolution of large language models (LLMs), new and hard-to-predict harmful capabilities are emerging.

bgGLUE: A Bulgarian General Language Understanding Evaluation Benchmark

2 code implementations 4 Jun 2023 Momchil Hardalov, Pepa Atanasova, Todor Mihaylov, Galia Angelova, Kiril Simov, Petya Osenova, Ves Stoyanov, Ivan Koychev, Preslav Nakov, Dragomir Radev

We run the first systematic evaluation of pre-trained language models for Bulgarian, comparing and contrasting results across the nine tasks in the benchmark.

Fact Checking named-entity-recognition +5

Understanding Breast Cancer Survival: Using Causality and Language Models on Multi-omics Data

no code implementations 28 May 2023 Mugariya Farooq, Shahad Hardan, Aigerim Zhumbhayeva, Yujia Zheng, Preslav Nakov, Kun Zhang

The need for more usable and explainable machine learning models in healthcare increases the importance of developing and utilizing causal discovery algorithms, which aim to discover causal relations by analyzing observational data.

Causal Discovery

DetectLLM: Leveraging Log Rank Information for Zero-Shot Detection of Machine-Generated Text

1 code implementation 23 May 2023 Jinyan Su, Terry Yue Zhuo, Di Wang, Preslav Nakov

One is called DetectLLM-LRR, which is fast and efficient, and the other is called DetectLLM-NPR, which is more accurate, but slower due to the need for perturbations.


On the Risk of Misinformation Pollution with Large Language Models

1 code implementation 23 May 2023 Yikang Pan, Liangming Pan, Wenhu Chen, Preslav Nakov, Min-Yen Kan, William Yang Wang

In this paper, we comprehensively investigate the potential misuse of modern Large Language Models (LLMs) for generating credible-sounding misinformation and its subsequent impact on information-intensive applications, particularly Open-Domain Question Answering (ODQA) systems.

Misinformation Open-Domain Question Answering

Detecting Propaganda Techniques in Code-Switched Social Media Text

1 code implementation 23 May 2023 Muhammad Umar Salman, Asif Hanif, Shady Shehata, Preslav Nakov

Yet, it is common to find a mix of multiple languages in social media communication, a phenomenon known as code-switching.

Propaganda detection

SCITAB: A Challenging Benchmark for Compositional Reasoning and Claim Verification on Scientific Tables

1 code implementation 22 May 2023 Xinyuan Lu, Liangming Pan, Qian Liu, Preslav Nakov, Min-Yen Kan

Current scientific fact-checking benchmarks exhibit several shortcomings, such as biases arising from crowd-sourced claims and an over-reliance on text-based evidence.

Claim Verification Fact Checking

Fact-Checking Complex Claims with Program-Guided Reasoning

1 code implementation 22 May 2023 Liangming Pan, Xiaobao Wu, Xinyuan Lu, Anh Tuan Luu, William Yang Wang, Min-Yen Kan, Preslav Nakov

Fact-checking real-world claims often requires collecting multiple pieces of evidence and applying complex multi-step reasoning.

Fact Checking

Automated Mapping of CVE Vulnerability Records to MITRE CWE Weaknesses

no code implementations 13 Apr 2023 Ashraf Haddad, Najwa Aaraj, Preslav Nakov, Septimiu Fabian Mare

In recent years, cyber-security threats have grown in both number and diversity, culminating in an increase in their reporting and analysis.

Multimodality Representation Learning: A Survey on Evolution, Pretraining and Its Applications

1 code implementation 1 Feb 2023 Muhammad Arslan Manzoor, Sarah Albarri, Ziting Xian, Zaiqiao Meng, Preslav Nakov, Shangsong Liang

This survey presents the comprehensive literature on the evolution and enhancement of deep learning multimodal architectures to deal with textual, visual and audio features for diverse cross-modal and modern multimodal tasks.

Question Answering Representation Learning +3

Characterizing the Entities in Harmful Memes: Who is the Hero, the Villain, the Victim?

no code implementations 26 Jan 2023 Shivam Sharma, Atharva Kulkarni, Tharun Suresh, Himanshi Mathur, Preslav Nakov, Md. Shad Akhtar, Tanmoy Chakraborty

A common problem associated with meme comprehension lies in detecting the entities referenced and characterizing the role of each of these entities.

Semantic Role Labeling

Temporal Dynamics of Coordinated Online Behavior: Stability, Archetypes, and Influence

no code implementations 17 Jan 2023 Serena Tardelli, Leonardo Nizzoli, Maurizio Tesconi, Mauro Conti, Preslav Nakov, Giovanni Da San Martino, Stefano Cresci

Large-scale online campaigns, malicious or otherwise, require a significant degree of coordination among participants, which sparked interest in the study of coordinated online behavior.

Community Detection Dynamic Community Detection

Overview of the WANLP 2022 Shared Task on Propaganda Detection in Arabic

no code implementations 18 Nov 2022 Firoj Alam, Hamdy Mubarak, Wajdi Zaghouani, Giovanni Da San Martino, Preslav Nakov

Thus, there has been a lot of recent research on automatic detection of propaganda techniques in text as well as in memes.

Propaganda detection

GREENER: Graph Neural Networks for News Media Profiling

no code implementations 10 Nov 2022 Panayot Panayotov, Utsav Shukla, Husrev Taha Sencar, Mohamed Nabeel, Preslav Nakov

We study the problem of profiling news media on the Web with respect to their factuality of reporting and bias.

Fake News Detection

PASTA: Table-Operations Aware Fact Verification via Sentence-Table Cloze Pre-training

1 code implementation 5 Nov 2022 Zihui Gu, Ju Fan, Nan Tang, Preslav Nakov, Xiaoman Zhao, Xiaoyong Du

In particular, on the complex set of TabFact, which contains multiple operations, PASTA largely outperforms the previous state of the art by 4.7 points (85.6% vs. 80.9%), and the gap between PASTA and human performance on the small TabFact test set is narrowed to just 1.5 points (90.6% vs. 92.1%).

Fact Checking Fact Verification +4

IITD at the WANLP 2022 Shared Task: Multilingual Multi-Granularity Network for Propaganda Detection

1 code implementation 31 Oct 2022 Shubham Mittal, Preslav Nakov

In addition to finding the techniques, Subtask 2 further asks to identify the textual span for each instance of each technique that is present in the tweet; the task can be modeled as a sequence tagging problem.

Multi-Label Classification Propaganda detection +1

CrowdChecked: Detecting Previously Fact-Checked Claims in Social Media

1 code implementation 10 Oct 2022 Momchil Hardalov, Anton Chernyavskiy, Ivan Koychev, Dmitry Ilvovsky, Preslav Nakov

Thus, an interesting approach has emerged: to perform automatic fact-checking by verifying whether an input claim has been previously fact-checked by professional fact-checkers and to return an article that explains their decision.

Fact Checking

Ten Years after ImageNet: A 360° Perspective on AI

no code implementations 1 Oct 2022 Sanjay Chawla, Preslav Nakov, Ahmed Ali, Wendy Hall, Issa Khalil, Xiaosong Ma, Husrev Taha Sencar, Ingmar Weber, Michael Wooldridge, Ting Yu

The rise of attention networks, self-supervised learning, generative modeling, and graph neural networks has widened the application space of AI.

Decision Making Fairness +1

DISARM: Detecting the Victims Targeted by Harmful Memes

1 code implementation Findings (NAACL) 2022 Shivam Sharma, Md. Shad Akhtar, Preslav Nakov, Tanmoy Chakraborty

Finally, we show that DISARM is interpretable and comparatively more generalizable and that it can reduce the relative error rate for harmful target identification by up to 9 points absolute over several strong multimodal rivals.

Named Entity Recognition (NER) +1

Detecting the Role of an Entity in Harmful Memes: Techniques and Their Limitations

1 code implementation CONSTRAINT (ACL) 2022 Rabindra Nath Nandi, Firoj Alam, Preslav Nakov

The content that is posted and shared online can be textual, visual, or a combination of both, e.g., in a meme.

Detecting and Understanding Harmful Memes: A Survey

1 code implementation 9 May 2022 Shivam Sharma, Firoj Alam, Md. Shad Akhtar, Dimitar Dimitrov, Giovanni Da San Martino, Hamed Firooz, Alon Halevy, Fabrizio Silvestri, Preslav Nakov, Tanmoy Chakraborty

One interesting finding is that many types of harmful memes are not really studied, e.g., those featuring self-harm and extremism, partly due to the lack of suitable datasets.

TeamX@DravidianLangTech-ACL2022: A Comparative Analysis for Troll-Based Meme Classification

no code implementations DravidianLangTech (ACL) 2022 Rabindra Nath Nandi, Firoj Alam, Preslav Nakov

The spread of fake news, propaganda, misinformation, disinformation, and harmful content online has raised concerns among social media platforms, government agencies, policymakers, and society as a whole.

Meme Classification Misinformation

Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation

1 code implementation 10 Mar 2022 Kung-Hsiang Huang, Kathleen McKeown, Preslav Nakov, Yejin Choi, Heng Ji

Despite recent advances in detecting fake news generated by neural models, their results are not readily applicable to effective detection of human-written disinformation.

Fake News Detection Natural Language Inference +1

QCRI's COVID-19 Disinformation Detector: A System to Fight the COVID-19 Infodemic in Social Media

no code implementations 8 Mar 2022 Preslav Nakov, Firoj Alam, Yifan Zhang, Animesh Prakash, Fahim Dalvi

Fighting the ongoing COVID-19 infodemic has been declared as one of the most important focus areas by the World Health Organization since the onset of the COVID-19 pandemic.

Leaf: Multiple-Choice Question Generation

1 code implementation 22 Jan 2022 Kristiyan Vachev, Momchil Hardalov, Georgi Karadzhov, Georgi Georgiev, Ivan Koychev, Preslav Nakov

Testing with quiz questions has proven to be an effective way to assess and improve the educational process.

Multiple-choice Question Answering +2

Batch-Softmax Contrastive Loss for Pairwise Sentence Scoring Tasks

no code implementations NAACL 2022 Anton Chernyavskiy, Dmitry Ilvovsky, Pavel Kalinin, Preslav Nakov

The use of contrastive loss for representation learning has become prominent in computer vision, and it is now getting attention in Natural Language Processing (NLP).

Sentence Embeddings

The Spread of Propaganda by Coordinated Communities on Social Media

no code implementations 27 Sep 2021 Kristina Hristakieva, Stefano Cresci, Giovanni Da San Martino, Mauro Conti, Preslav Nakov

Large-scale manipulations on social media have two important characteristics: (i) use of propaganda to influence others, and (ii) adoption of coordinated behavior to spread it and to amplify its impact.

Analyzing the Use of Character-Level Translation with Sparse and Noisy Datasets

no code implementations RANLP 2013 Jörg Tiedemann, Preslav Nakov

This paper provides an analysis of character-level machine translation models used in pivot-based translation when applied to sparse and noisy datasets, such as crowdsourced movie subtitles.

Machine Translation Translation

Feature-Rich Named Entity Recognition for Bulgarian Using Conditional Random Fields

no code implementations 26 Sep 2021 Georgi Georgiev, Preslav Nakov, Kuzman Ganchev, Petya Osenova, Kiril Ivanov Simov

The paper presents a feature-rich approach to the automatic recognition and categorization of named entities (persons, organizations, locations, and miscellaneous) in news text for Bulgarian.

Miscellaneous named-entity-recognition +2

Improved statistical machine translation using monolingual paraphrases

no code implementations 25 Sep 2021 Preslav Nakov

We propose a novel monolingual sentence paraphrasing method for augmenting the training data for statistical machine translation systems "for free" -- by creating it from data that is already available rather than having to create more aligned data.

Machine Translation Translation

RuleBert: Teaching Soft Rules to Pre-trained Language Models

1 code implementation EMNLP 2021 Mohammed Saeed, Naser Ahmadi, Preslav Nakov, Paolo Papotti

While pre-trained language models (PLMs) are the go-to solution to tackle many natural language processing problems, they are still very limited in their ability to capture and to use common-sense knowledge.

Common Sense Reasoning

Detecting Harmful Memes and Their Targets

no code implementations Findings (ACL) 2021 Shraman Pramanick, Dimitar Dimitrov, Rituparna Mukherjee, Shivam Sharma, Md. Shad Akhtar, Preslav Nakov, Tanmoy Chakraborty

In this work, we propose two novel problem formulations: detecting harmful memes and the social entities that these harmful memes target.

A Second Pandemic? Analysis of Fake News About COVID-19 Vaccines in Qatar

no code implementations RANLP 2021 Preslav Nakov, Firoj Alam, Shaden Shaar, Giovanni Da San Martino, Yifan Zhang

While COVID-19 vaccines are finally becoming widely available, a second pandemic that revolves around the circulation of anti-vaxxer fake news may hinder efforts to recover from the first one.

Assisting the Human Fact-Checkers: Detecting All Previously Fact-Checked Claims in a Document

1 code implementation 14 Sep 2021 Shaden Shaar, Nikola Georgiev, Firoj Alam, Giovanni Da San Martino, Aisha Mohamed, Preslav Nakov

The output is a re-ranked list of the document sentences, so that those that can be verified are ranked as high as possible, together with corresponding evidence.

Fact Checking Learning-To-Rank +2

Few-Shot Cross-Lingual Stance Detection with Sentiment-Based Pre-Training

1 code implementation 13 Sep 2021 Momchil Hardalov, Arnav Arora, Preslav Nakov, Isabelle Augenstein

Most research in stance detection, however, has been limited to working with a single language and on a few limited targets, with little work on cross-lingual stance detection.

Stance Detection

Predicting the Factuality of Reporting of News Media Using Observations About User Attention in Their YouTube Channels

no code implementations RANLP 2021 Krasimira Bozhanova, Yoan Dinkov, Ivan Koychev, Maria Castaldo, Tommaso Venturini, Preslav Nakov

We propose a novel framework for predicting the factuality of reporting of news media outlets by studying the user attention cycles in their YouTube channels.

Detecting Propaganda Techniques in Memes

1 code implementation ACL 2021 Dimitar Dimitrov, Bishr Bin Ali, Shaden Shaar, Firoj Alam, Fabrizio Silvestri, Hamed Firooz, Preslav Nakov, Giovanni Da San Martino

We further create and release a new corpus of 950 memes, carefully annotated with 22 propaganda techniques, which can appear in the text, in the image, or in both.

AraStance: A Multi-Country and Multi-Domain Dataset of Arabic Stance Detection for Fact Checking

1 code implementation NAACL (NLP4IF) 2021 Tariq Alhindi, Amal Alabdulkarim, Ali Alshehri, Muhammad Abdul-Mageed, Preslav Nakov

With the continuing spread of misinformation and disinformation online, it is of increasing importance to develop combating mechanisms at scale in the form of automated systems that support multiple languages.

Fact Checking Misinformation +1

SemEval-2021 Task 6: Detection of Persuasion Techniques in Texts and Images

1 code implementation SEMEVAL 2021 Dimitar Dimitrov, Bishr Bin Ali, Shaden Shaar, Firoj Alam, Fabrizio Silvestri, Hamed Firooz, Preslav Nakov, Giovanni Da San Martino

We describe SemEval-2021 task 6 on Detection of Persuasion Techniques in Texts and Images: the data, the annotation guidelines, the evaluation setup, the results, and the participating systems.

Cross-Domain Label-Adaptive Stance Detection

1 code implementation EMNLP 2021 Momchil Hardalov, Arnav Arora, Preslav Nakov, Isabelle Augenstein

In this paper, we perform an in-depth analysis of 16 stance detection datasets, and we explore the possibility for cross-domain learning from them.

Domain Adaptation Stance Detection

Transformers: "The End of History" for NLP?

no code implementations 9 Apr 2021 Anton Chernyavskiy, Dmitry Ilvovsky, Preslav Nakov

Recent advances in neural architectures, such as the Transformer, coupled with the emergence of large-scale pre-trained models such as BERT, have revolutionized the field of Natural Language Processing (NLP), pushing the state of the art for a number of NLP tasks.

A Survey on Predicting the Factuality and the Bias of News Media

no code implementations 16 Mar 2021 Preslav Nakov, Husrev Taha Sencar, Jisun An, Haewoon Kwak

The present level of proliferation of fake, biased, and propagandistic content online has made it impossible to fact-check every single suspicious claim or article, either manually or automatically.

Bias Detection Fact Checking +1

Automated Fact-Checking for Assisting Human Fact-Checkers

no code implementations 13 Mar 2021 Preslav Nakov, David Corney, Maram Hasanain, Firoj Alam, Tamer Elsayed, Alberto Barrón-Cedeño, Paolo Papotti, Shaden Shaar, Giovanni Da San Martino

The reporting and the analysis of current events around the globe have expanded from professional, editor-led journalism all the way to citizen journalism.

Fact Checking

A Survey on Stance Detection for Mis- and Disinformation Identification

no code implementations Findings (NAACL) 2022 Momchil Hardalov, Arnav Arora, Preslav Nakov, Isabelle Augenstein

Understanding attitudes expressed in texts, also known as stance detection, plays an important role in systems for detecting false information online, be it misinformation (unintentionally false) or disinformation (intentionally false information).

Fact Checking Misinformation +3

Detecting Harmful Content On Online Platforms: What Platforms Need Vs. Where Research Efforts Go

no code implementations 27 Feb 2021 Arnav Arora, Preslav Nakov, Momchil Hardalov, Sheikh Muhammad Sarwar, Vibha Nayak, Yoan Dinkov, Dimitrina Zlatkova, Kyle Dent, Ameya Bhatawdekar, Guillaume Bouchard, Isabelle Augenstein

The proliferation of harmful content on online platforms is a major societal problem, which comes in many different forms including hate speech, offensive language, bullying and harassment, misinformation, spam, violence, graphic content, sexual abuse, self-harm, and many others.

Abusive Language Misinformation

EXAMS: A Multi-Subject High School Examinations Dataset for Cross-Lingual and Multilingual Question Answering

2 code implementations EMNLP 2020 Momchil Hardalov, Todor Mihaylov, Dimitrina Zlatkova, Yoan Dinkov, Ivan Koychev, Preslav Nakov

We perform various experiments with existing top-performing multilingual pre-trained models and we show that EXAMS offers multiple challenges that require multilingual knowledge and reasoning in multiple domains.

Question Answering Transfer Learning

Fact-Checking, Fake News, Propaganda, and Media Bias: Truth Seeking in the Post-Truth Era

no code implementations EMNLP 2020 Preslav Nakov, Giovanni Da San Martino

The rise of social media has democratized content creation and has made it easy for everybody to share and spread information online.

Fact Checking Misinformation

Team Alex at CLEF CheckThat! 2020: Identifying Check-Worthy Tweets With Transformer Models

3 code implementations 7 Sep 2020 Alex Nikolov, Giovanni Da San Martino, Ivan Koychev, Preslav Nakov

While misinformation and disinformation have been thriving in social media for years, with the emergence of the COVID-19 pandemic, the political and the health misinformation merged, thus elevating the problem to a whole new level and giving rise to the first global infodemic.

Fact Checking Misinformation

FANG: Leveraging Social Context for Fake News Detection Using Graph Representation

1 code implementation 18 Aug 2020 Van-Hoang Nguyen, Kazunari Sugiyama, Preslav Nakov, Min-Yen Kan

In particular, FANG yields significant improvements for the task of fake news detection, and it is robust in the case of limited training data.

Fake News Detection Representation Learning

Can We Spot the "Fake News" Before It Was Even Written?

no code implementations 10 Aug 2020 Preslav Nakov

Given the recent proliferation of disinformation online, there has also been growing research interest in automatically debunking rumors, false claims, and "fake news."

Fact Checking

On a Novel Application of Wasserstein-Procrustes for Unsupervised Cross-Lingual Learning

1 code implementation 18 Jul 2020 Guillem Ramírez, Rumen Dangovski, Preslav Nakov, Marin Soljačić

We believe that our rethinking of the Wasserstein-Procrustes problem could enable further research, thus helping to develop better algorithms for aligning word embeddings across languages.

Word Embeddings

A Survey on Computational Propaganda Detection

no code implementations 15 Jul 2020 Giovanni Da San Martino, Stefano Cresci, Alberto Barron-Cedeno, Seunghak Yu, Roberto Di Pietro, Preslav Nakov

Propaganda campaigns aim at influencing people's mindset with the purpose of advancing a specific agenda.

Propaganda detection

Fighting the COVID-19 Infodemic in Social Media: A Holistic Perspective and a Call to Arms

1 code implementation 15 Jul 2020 Firoj Alam, Fahim Dalvi, Shaden Shaar, Nadir Durrani, Hamdy Mubarak, Alex Nikolov, Giovanni Da San Martino, Ahmed Abdelali, Hassan Sajjad, Kareem Darwish, Preslav Nakov

With the outbreak of the COVID-19 pandemic, people turned to social media to read and to share timely information including statistics, warnings, advice, and inspirational stories.


Overview of CheckThat! 2020: Automatic Identification and Verification of Claims in Social Media

3 code implementations 15 Jul 2020 Alberto Barron-Cedeno, Tamer Elsayed, Preslav Nakov, Giovanni Da San Martino, Maram Hasanain, Reem Suwaileh, Fatima Haouari, Nikolay Babulkov, Bayan Hamdan, Alex Nikolov, Shaden Shaar, Zien Sheikh Ali

The first four tasks compose the full pipeline of claim verification in social media: Task 1 on check-worthiness estimation, Task 2 on retrieving previously fact-checked claims, Task 3 on evidence retrieval, and Task 4 on claim verification.

Claim Verification Retrieval

Predicting the Topical Stance and Political Leaning of Media using Tweets

no code implementations ACL 2020 Peter Stefanov, Kareem Darwish, Atanas Atanasov, Preslav Nakov

Discovering the stances of media outlets and influential people on current, debatable topics is important for social statisticians and policy makers.

Enriched Pre-trained Transformers for Joint Slot Filling and Intent Detection

no code implementations 30 Apr 2020 Momchil Hardalov, Ivan Koychev, Preslav Nakov

Recently, the advances in pre-trained language models, namely contextualized models such as ELMo and BERT, have revolutionized the field by tapping the potential of training very large models with just a few steps of fine-tuning on a task-specific dataset.

Intent Detection Natural Language Understanding +2

SOLID: A Large-Scale Semi-Supervised Dataset for Offensive Language Identification

no code implementations Findings (ACL) 2021 Sara Rosenthal, Pepa Atanasova, Georgi Karadzhov, Marcos Zampieri, Preslav Nakov

The widespread use of offensive content in social media has led to an abundance of research in detecting language such as hate speech, cyberbullying, and cyber-aggression.

Language Identification

On the Effect of Dropping Layers of Pre-trained Transformer Models

4 code implementations 8 Apr 2020 Hassan Sajjad, Fahim Dalvi, Nadir Durrani, Preslav Nakov

Transformer-based NLP models are trained using hundreds of millions or even billions of parameters, limiting their applicability in computationally constrained environments.

Knowledge Distillation Sentence Similarity

Compressing Large-Scale Transformer-Based Models: A Case Study on BERT

no code implementations 27 Feb 2020 Prakhar Ganesh, Yao Chen, Xin Lou, Mohammad Ali Khan, Yin Yang, Hassan Sajjad, Preslav Nakov, Deming Chen, Marianne Winslett

Pre-trained Transformer-based models have achieved state-of-the-art performance for various Natural Language Processing (NLP) tasks.

Model Compression

A Context-Aware Approach for Detecting Check-Worthy Claims in Political Debates

no code implementations 14 Dec 2019 Pepa Gencheva, Ivan Koychev, Lluís Màrquez, Alberto Barrón-Cedeño, Preslav Nakov

In the context of investigative journalism, we address the problem of automatically identifying which claims in a given document are most worthy and should be prioritized for fact-checking.

Fact Checking

SemEval-2013 Task 2: Sentiment Analysis in Twitter

no code implementations SEMEVAL 2013 Preslav Nakov, Zornitsa Kozareva, Alan Ritter, Sara Rosenthal, Veselin Stoyanov, Theresa Wilson

To address this issue, we have proposed SemEval-2013 Task 2: Sentiment Analysis in Twitter, which included two subtasks: A, an expression-level subtask, and B, a message-level subtask.

Sentiment Analysis Test

Proppy: A System to Unmask Propaganda in Online News

no code implementations 14 Dec 2019 Alberto Barrón-Cedeño, Giovanni Da San Martino, Israa Jaradat, Preslav Nakov

We present proppy, the first publicly available real-world, real-time propaganda detection system for online news, which aims at raising awareness, thus potentially limiting the impact of propaganda and helping fight disinformation.

Propaganda detection

Machine Translation Evaluation Meets Community Question Answering

no code implementations ACL 2016 Francisco Guzmán, Lluís Màrquez, Preslav Nakov

We explore the applicability of machine translation evaluation (MTE) methods to a very different problem: answer ranking in community Question Answering.

Community Question Answering Machine Translation +1

Pairwise Neural Machine Translation Evaluation

no code implementations IJCNLP 2015 Francisco Guzman, Shafiq Joty, Lluis Marquez, Preslav Nakov

We present a novel framework for machine translation evaluation using neural networks in a pairwise setting, where the goal is to select the better translation from a pair of hypotheses, given the reference translation.

Machine Translation Sentence Embeddings +1

Towards Constructing a Corpus for Studying the Effects of Treatments and Substances Reported in PubMed Abstracts

no code implementations 4 Dec 2019 Evgeni Stefchov, Galia Angelova, Preslav Nakov

We present the construction of an annotated corpus of PubMed abstracts reporting about positive, negative or neutral effects of treatments or substances.

Text Classification

DiscoTK: Using Discourse Structure for Machine Translation Evaluation

no code implementations WS 2014 Shafiq Joty, Francisco Guzman, Lluis Marquez, Preslav Nakov

We present novel automatic metrics for machine translation evaluation that use discourse structure and convolution kernels to compare the discourse tree of an automatic translation with that of the human reference.

Machine Translation Translation

Language-Independent Sentiment Analysis Using Subjectivity and Positional Information

no code implementations 28 Nov 2019 Veselin Raychev, Preslav Nakov

We describe a novel language-independent approach to the task of determining the polarity, positive or negative, of the author's opinion on a specific topic in natural language text.

General Classification Sentiment Analysis

Large-Scale Noun Compound Interpretation Using Bootstrapping and the Web as a Corpus

no code implementations 27 Nov 2019 Su Nam Kim, Preslav Nakov

We employ bootstrapping and web statistics, and utilize the relationship between NCs and paraphrasing patterns to jointly extract NCs and such patterns in multiple alternating iterations.

SemEval-2015 Task 3: Answer Selection in Community Question Answering

no code implementations SEMEVAL 2015 Preslav Nakov, Lluís Màrquez, Walid Magdy, Alessandro Moschitti, James Glass, Bilal Randeree

Community Question Answering (cQA) provides new interesting research directions to the traditional Question Answering (QA) field, e.g., the exploitation of the interaction between users and the structure of related posts.

Answer Selection Community Question Answering

Using the Web as an Implicit Training Set: Application to Noun Compound Syntax and Semantics

no code implementations 23 Nov 2019 Preslav Nakov

I address noun compound semantics by automatically generating paraphrasing verbs and prepositions that make explicit the hidden semantic relations between the nouns in a noun compound.

Information Retrieval Machine Translation +4

Paraphrasing Verbs for Noun Compound Interpretation

no code implementations 20 Nov 2019 Preslav Nakov

An important challenge for the automatic analysis of English written text is the abundance of noun compounds: sequences of nouns acting as a single noun.

Natural Language Inference

Global Thread-Level Inference for Comment Classification in Community Question Answering

no code implementations EMNLP 2015 Shafiq Joty, Alberto Barrón-Cedeño, Giovanni Da San Martino, Simone Filice, Lluís Màrquez, Alessandro Moschitti, Preslav Nakov

Community question answering, a recent evolution of question answering in the Web context, allows a user to quickly consult the opinion of a number of people on a particular topic, thus taking advantage of the wisdom of the crowd.

Community Question Answering General Classification

A Hybrid Morpheme-Word Representation for Machine Translation of Morphologically Rich Languages

no code implementations 19 Nov 2019 Minh-Thang Luong, Preslav Nakov, Min-Yen Kan

We propose a language-independent approach for improving statistical machine translation for morphologically rich languages using a hybrid morpheme-word representation where the basic unit of translation is the morpheme, but word boundaries are respected at all stages of the translation process.

Machine Translation Translation

In Search of Credible News

1 code implementation 19 Nov 2019 Momchil Hardalov, Ivan Koychev, Preslav Nakov

As this is an understudied problem, especially for languages other than English, we first collect and release to the research community three new balanced credible vs. fake news datasets derived from four online sources.

Experiments in Detecting Persuasion Techniques in the News

no code implementations 15 Nov 2019 Seunghak Yu, Giovanni Da San Martino, Preslav Nakov

Many recent political events, like the 2016 US Presidential elections or the 2018 Brazilian elections, have drawn the attention of institutions and of the general public to the role of the Internet and social media in influencing the outcome of these events.

Findings of the NLP4IF-2019 Shared Task on Fine-Grained Propaganda Detection

no code implementations WS 2019 Giovanni Da San Martino, Alberto Barrón-Cedeño, Preslav Nakov

FLC is a fragment-level task that asks for the identification of propagandist text fragments in a news article and also for the prediction of the specific propaganda technique used in each such fragment (18-way classification task).

Binary Classification General Classification +1

Predicting the Leading Political Ideology of YouTube Channels Using Acoustic, Textual, and Metadata Information

1 code implementation 20 Oct 2019 Yoan Dinkov, Ahmed Ali, Ivan Koychev, Preslav Nakov

Our analysis shows that the use of acoustic signal helped to improve bias detection by more than 6% absolute over using text and metadata only.

Bias Detection Multimodal Deep Learning

Predicting the Role of Political Trolls in Social Media

1 code implementation CONLL 2019 Atanas Atanasov, Gianmarco De Francisci Morales, Preslav Nakov

In particular, we show how to classify trolls according to their political role (left, news feed, right) by using features extracted from social media, i.e., Twitter, in two scenarios: (i) in a traditional supervised learning scenario, where labels for trolls are available, and (ii) in a distant supervision scenario, where labels for trolls are not available, and we rely on more-commonly-available labels for news outlets mentioned by the trolls.

Contrastive Language Adaptation for Cross-Lingual Stance Detection

no code implementations IJCNLP 2019 Mitra Mohtarami, James Glass, Preslav Nakov

In particular, we introduce a novel contrastive language adaptation approach applied to memory networks, which ensures accurate alignment of stances in the source and target languages, and can effectively deal with the challenge of limited labeled data in the target language.

Stance Detection

Evaluating Pronominal Anaphora in Machine Translation: An Evaluation Measure and a Test Suite

2 code implementations IJCNLP 2019 Prathyusha Jwalapuram, Shafiq Joty, Irina Temnikova, Preslav Nakov

The ongoing neural revolution in machine translation has made it easier to model larger contexts beyond the sentence-level, which can potentially help resolve some discourse-level ambiguities such as pronominal anaphora, thus enabling better translations.

Machine Translation Test +1

Fact-Checking Meets Fauxtography: Verifying Claims About Images

1 code implementation IJCNLP 2019 Dimitrina Zlatkova, Preslav Nakov, Ivan Koychev

The recent explosion of false claims in social media and on the Web in general has given rise to a lot of manual fact-checking initiatives.

Fact Checking

A Morpho-Syntactically Informed LSTM-CRF Model for Named Entity Recognition

no code implementations RANLP 2019 Lilia Simeonova, Kiril Simov, Petya Osenova, Preslav Nakov

We propose a morphologically informed model for named entity recognition, which is based on LSTM-CRF architecture and combines word embeddings, Bi-LSTM character embeddings, part-of-speech (POS) tags, and morphological information.

Named Entity Recognition +3

Detecting Toxicity in News Articles: Application to Bulgarian

1 code implementation RANLP 2019 Yoan Dinkov, Ivan Koychev, Preslav Nakov

Online media aim to reach an ever bigger audience and to attract an ever longer attention span.

Automatic Fact-Checking Using Context and Discourse Information

1 code implementation 4 Aug 2019 Pepa Atanasova, Preslav Nakov, Lluís Màrquez, Alberto Barrón-Cedeño, Georgi Karadzhov, Tsvetomila Mihaylova, Mitra Mohtarami, James Glass

We study the problem of automatic fact-checking, paying special attention to the impact of contextual and discourse information.

Fact Checking

Predicting the Topical Stance of Media and Popular Twitter Users

no code implementations 2 Jul 2019 Peter Stefanov, Kareem Darwish, Atanas Atanasov, Preslav Nakov

Discovering the stances of media outlets and influential people on current, debatable topics is important for social statisticians and policy makers.

Recursive Style Breach Detection with Multifaceted Ensemble Learning

no code implementations 17 Jun 2019 Daniel Kopev, Dimitrina Zlatkova, Kristiyan Mitov, Atanas Atanasov, Momchil Hardalov, Ivan Koychev, Preslav Nakov

We present a supervised approach for style change detection, which aims at predicting whether there are changes in the style in a given text document, as well as at finding the exact positions where such changes occur.

Change Detection Ensemble Learning +1

One Size Does Not Fit All: Comparing NMT Representations of Different Granularities

no code implementations NAACL 2019 Nadir Durrani, Fahim Dalvi, Hassan Sajjad, Yonatan Belinkov, Preslav Nakov

Recent work has shown that contextualized word representations derived from neural machine translation are a viable alternative to those derived from simple word prediction tasks.

Machine Translation NMT +1

Evaluating Variable-Length Multiple-Option Lists in Chatbots and Mobile Search

no code implementations 25 May 2019 Pepa Atanasova, Georgi Karadzhov, Yasen Kiprov, Preslav Nakov, Fabrizio Sebastiani

While typically a user would expect a single response at any utterance, a system could also return multiple options for the user to select from, based on different system understandings of the user's intent.

Question Answering

Unsupervised User Stance Detection on Twitter

2 code implementations 3 Apr 2019 Kareem Darwish, Peter Stefanov, Michaël J. Aupetit, Preslav Nakov

We experiment with different combinations of user similarity features, dataset sizes, dimensionality reduction methods, and clustering algorithms to ascertain the most effective and most computationally efficient combinations across three different datasets (in English and Turkish).

Social and Information Networks 62P25, 91D30

Multi-Task Ordinal Regression for Jointly Predicting the Trustworthiness and the Leading Political Ideology of News Media

no code implementations NAACL 2019 Ramy Baly, Georgi Karadzhov, Abdelrhman Saleh, James Glass, Preslav Nakov

In the context of fake news, bias, and propaganda, we study two important but relatively under-explored problems: (i) trustworthiness estimation (on a 3-point scale) and (ii) political ideology detection (left/right bias on a 7-point scale) of entire news outlets, as opposed to evaluating individual articles.

SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval)

1 code implementation SEMEVAL 2019 Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Sara Rosenthal, Noura Farra, Ritesh Kumar

We present the results and the main findings of SemEval-2019 Task 6 on Identifying and Categorizing Offensive Language in Social Media (OffensEval).

Language Identification

Rotational Unit of Memory: A Novel Representation Unit for RNNs with Scalable Applications

no code implementations TACL 2019 Rumen Dangovski, Li Jing, Preslav Nakov, Mićo Tatalović, Marin Soljačić

Stacking long short-term memory (LSTM) cells or gated recurrent units (GRUs) as part of a recurrent neural network (RNN) has become a standard approach to solving a number of tasks ranging from language modeling to text summarization.

Language Modelling Text Summarization

Machine Reading Comprehension for Answer Re-Ranking in Customer Support Chatbots

no code implementations 12 Feb 2019 Momchil Hardalov, Ivan Koychev, Preslav Nakov

Recent advances in deep neural networks, language modeling and language generation have introduced new ideas to the field of conversational agents.

Information Retrieval Language Modelling +4

Joint Multitask Learning for Community Question Answering Using Task-Specific Embeddings

no code implementations EMNLP 2018 Shafiq Joty, Lluís Màrquez, Preslav Nakov

We address jointly two important tasks for Question Answering in community forums: given a new question, (i) find related existing questions, and (ii) find relevant answers to this new question.

Community Question Answering

Adversarial Domain Adaptation for Duplicate Question Detection

1 code implementation EMNLP 2018 Darsh J Shah, Tao Lei, Alessandro Moschitti, Salvatore Romeo, Preslav Nakov

We address the problem of detecting duplicate questions in forums, which is an important step towards automating the process of answering new questions.

Domain Adaptation Question Similarity

Towards Automated Customer Support

1 code implementation 2 Sep 2018 Momchil Hardalov, Ivan Koychev, Preslav Nakov

Recent years have seen growing interest in conversational agents, such as chatbots, which are a very good fit for automated customer support because the domain in which they need to operate is narrow.

Information Retrieval Machine Translation +3