Search Results for author: Preslav Nakov

Found 235 papers, 77 papers with code

Team Alex at CLEF CheckThat! 2020: Identifying Check-Worthy Tweets With Transformer Models

3 code implementations 7 Sep 2020 Alex Nikolov, Giovanni Da San Martino, Ivan Koychev, Preslav Nakov

While misinformation and disinformation have been thriving in social media for years, with the emergence of the COVID-19 pandemic, the political and the health misinformation merged, thus elevating the problem to a whole new level and giving rise to the first global infodemic.

Fact Checking Misinformation

Leaf: Multiple-Choice Question Generation

1 code implementation 22 Jan 2022 Kristiyan Vachev, Momchil Hardalov, Georgi Karadzhov, Georgi Georgiev, Ivan Koychev, Preslav Nakov

Testing with quiz questions has proven to be an effective way to assess and improve the educational process.

Multiple-choice Question Answering +2

On the Effect of Dropping Layers of Pre-trained Transformer Models

4 code implementations 8 Apr 2020 Hassan Sajjad, Fahim Dalvi, Nadir Durrani, Preslav Nakov

Transformer-based NLP models are trained using hundreds of millions or even billions of parameters, limiting their applicability in computationally constrained environments.

Knowledge Distillation Sentence +1

Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs

1 code implementation 25 Aug 2023 Yuxia Wang, Haonan Li, Xudong Han, Preslav Nakov, Timothy Baldwin

With the rapid evolution of large language models (LLMs), new and hard-to-predict harmful capabilities are emerging.

FANG: Leveraging Social Context for Fake News Detection Using Graph Representation

1 code implementation 18 Aug 2020 Van-Hoang Nguyen, Kazunari Sugiyama, Preslav Nakov, Min-Yen Kan

In particular, FANG yields significant improvements for the task of fake news detection, and it is robust in the case of limited training data.

Fake News Detection Representation Learning

Multimodality Representation Learning: A Survey on Evolution, Pretraining and Its Applications

1 code implementation 1 Feb 2023 Muhammad Arslan Manzoor, Sarah Albarri, Ziting Xian, Zaiqiao Meng, Preslav Nakov, Shangsong Liang

This survey presents the comprehensive literature on the evolution and enhancement of deep learning multimodal architectures to deal with textual, visual and audio features for diverse cross-modal and modern multimodal tasks.

Question Answering Representation Learning +3

Fact-Checking Complex Claims with Program-Guided Reasoning

1 code implementation 22 May 2023 Liangming Pan, Xiaobao Wu, Xinyuan Lu, Anh Tuan Luu, William Yang Wang, Min-Yen Kan, Preslav Nakov

Fact-checking real-world claims often requires collecting multiple pieces of evidence and applying complex multi-step reasoning.

Fact Checking In-Context Learning

EXAMS: A Multi-Subject High School Examinations Dataset for Cross-Lingual and Multilingual Question Answering

2 code implementations EMNLP 2020 Momchil Hardalov, Todor Mihaylov, Dimitrina Zlatkova, Yoan Dinkov, Ivan Koychev, Preslav Nakov

We perform various experiments with existing top-performing multilingual pre-trained models and we show that EXAMS offers multiple challenges that require multilingual knowledge and reasoning in multiple domains.

Question Answering Transfer Learning

Detecting and Understanding Harmful Memes: A Survey

1 code implementation 9 May 2022 Shivam Sharma, Firoj Alam, Md. Shad Akhtar, Dimitar Dimitrov, Giovanni Da San Martino, Hamed Firooz, Alon Halevy, Fabrizio Silvestri, Preslav Nakov, Tanmoy Chakraborty

One interesting finding is that many types of harmful memes are not really studied, e.g., such featuring self-harm and extremism, partly due to the lack of suitable datasets.

SemEval-2021 Task 6: Detection of Persuasion Techniques in Texts and Images

1 code implementation SEMEVAL 2021 Dimitar Dimitrov, Bishr Bin Ali, Shaden Shaar, Firoj Alam, Fabrizio Silvestri, Hamed Firooz, Preslav Nakov, Giovanni Da San Martino

We describe SemEval-2021 task 6 on Detection of Persuasion Techniques in Texts and Images: the data, the annotation guidelines, the evaluation setup, the results, and the participating systems.

Adversarial Domain Adaptation for Duplicate Question Detection

1 code implementation EMNLP 2018 Darsh J Shah, Tao Lei, Alessandro Moschitti, Salvatore Romeo, Preslav Nakov

We address the problem of detecting duplicate questions in forums, which is an important step towards automating the process of answering new questions.

Domain Adaptation Question Similarity

Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation

1 code implementation 10 Mar 2022 Kung-Hsiang Huang, Kathleen McKeown, Preslav Nakov, Yejin Choi, Heng Ji

Despite recent advances in detecting fake news generated by neural models, their results are not readily applicable to effective detection of human-written disinformation.

Fake News Detection Natural Language Inference +1

Overview of CheckThat! 2020: Automatic Identification and Verification of Claims in Social Media

3 code implementations 15 Jul 2020 Alberto Barron-Cedeno, Tamer Elsayed, Preslav Nakov, Giovanni Da San Martino, Maram Hasanain, Reem Suwaileh, Fatima Haouari, Nikolay Babulkov, Bayan Hamdan, Alex Nikolov, Shaden Shaar, Zien Sheikh Ali

The first four tasks compose the full pipeline of claim verification in social media: Task 1 on check-worthiness estimation, Task 2 on retrieving previously fact-checked claims, Task 3 on evidence retrieval, and Task 4 on claim verification.

Claim Verification Retrieval +1

RuleBert: Teaching Soft Rules to Pre-trained Language Models

1 code implementation EMNLP 2021 Mohammed Saeed, Naser Ahmadi, Preslav Nakov, Paolo Papotti

While pre-trained language models (PLMs) are the go-to solution to tackle many natural language processing problems, they are still very limited in their ability to capture and to use common-sense knowledge.

Common Sense Reasoning

PASTA: Table-Operations Aware Fact Verification via Sentence-Table Cloze Pre-training

1 code implementation 5 Nov 2022 Zihui Gu, Ju Fan, Nan Tang, Preslav Nakov, Xiaoman Zhao, Xiaoyong Du

In particular, on the complex set of TabFact, which contains multiple operations, PASTA largely outperforms the previous state of the art by 4.7 points (85.6% vs. 80.9%), and the gap between PASTA and human performance on the small TabFact test set is narrowed to just 1.5 points (90.6% vs. 92.1%).

Fact Checking Fact Verification +5

DetectLLM: Leveraging Log Rank Information for Zero-Shot Detection of Machine-Generated Text

1 code implementation 23 May 2023 Jinyan Su, Terry Yue Zhuo, Di Wang, Preslav Nakov

One is called DetectLLM-LRR, which is fast and efficient, and the other is called DetectLLM-NPR, which is more accurate, but slower due to the need for perturbations.

Misinformation

Few-Shot Cross-Lingual Stance Detection with Sentiment-Based Pre-Training

1 code implementation 13 Sep 2021 Momchil Hardalov, Arnav Arora, Preslav Nakov, Isabelle Augenstein

Most research in stance detection, however, has been limited to working with a single language and on a few limited targets, with little work on cross-lingual stance detection.

Stance Detection

Detecting Propaganda Techniques in Memes

1 code implementation ACL 2021 Dimitar Dimitrov, Bishr Bin Ali, Shaden Shaar, Firoj Alam, Fabrizio Silvestri, Hamed Firooz, Preslav Nakov, Giovanni Da San Martino

We further create and release a new corpus of 950 memes, carefully annotated with 22 propaganda techniques, which can appear in the text, in the image, or in both.

SCITAB: A Challenging Benchmark for Compositional Reasoning and Claim Verification on Scientific Tables

1 code implementation 22 May 2023 Xinyuan Lu, Liangming Pan, Qian Liu, Preslav Nakov, Min-Yen Kan

Current scientific fact-checking benchmarks exhibit several shortcomings, such as biases arising from crowd-sourced claims and an over-reliance on text-based evidence.

Claim Verification Fact Checking

A Context-Aware Approach for Detecting Worth-Checking Claims in Political Debates

1 code implementation RANLP 2017 Pepa Gencheva, Preslav Nakov, Lluís Màrquez, Alberto Barrón-Cedeño, Ivan Koychev

In the context of investigative journalism, we address the problem of automatically identifying which claims in a given document are most worthy and should be prioritized for fact-checking.

Fact Checking

Unsupervised User Stance Detection on Twitter

2 code implementations 3 Apr 2019 Kareem Darwish, Peter Stefanov, Michaël J. Aupetit, Preslav Nakov

We experiment with different combinations of user similarity features, dataset sizes, dimensionality reduction methods, and clustering algorithms to ascertain the most effective and most computationally efficient combinations across three different datasets (in English and Turkish).

Social and Information Networks 62P25, 91D30

Cross-Domain Label-Adaptive Stance Detection

1 code implementation EMNLP 2021 Momchil Hardalov, Arnav Arora, Preslav Nakov, Isabelle Augenstein

In this paper, we perform an in-depth analysis of 16 stance detection datasets, and we explore the possibility for cross-domain learning from them.

Domain Adaptation Stance Detection

Fighting the COVID-19 Infodemic in Social Media: A Holistic Perspective and a Call to Arms

1 code implementation 15 Jul 2020 Firoj Alam, Fahim Dalvi, Shaden Shaar, Nadir Durrani, Hamdy Mubarak, Alex Nikolov, Giovanni Da San Martino, Ahmed Abdelali, Hassan Sajjad, Kareem Darwish, Preslav Nakov

With the outbreak of the COVID-19 pandemic, people turned to social media to read and to share timely information including statistics, warnings, advice, and inspirational stories.

Misinformation

QACHECK: A Demonstration System for Question-Guided Multi-Hop Fact-Checking

1 code implementation 11 Oct 2023 Liangming Pan, Xinyuan Lu, Min-Yen Kan, Preslav Nakov

Fact-checking real-world claims often requires complex, multi-step reasoning due to the absence of direct evidence to support or refute them.

Decision Making Fact Checking +1

Lost in Translation, Found in Spans: Identifying Claims in Multilingual Social Media

1 code implementation 27 Oct 2023 Shubham Mittal, Megha Sundriyal, Preslav Nakov

Claim span identification (CSI) is an important step in fact-checking pipelines, aiming to identify text segments that contain a checkworthy claim or assertion in a social media post.

Cross-Lingual Transfer Fact Checking +1

The Case for Being Average: A Mediocrity Approach to Style Masking and Author Obfuscation

2 code implementations 12 Jul 2017 Georgi Karadjov, Tsvetomila Mihaylova, Yasen Kiprov, Georgi Georgiev, Ivan Koychev, Preslav Nakov

Users posting online expect to remain anonymous unless they have logged in, which is often needed for them to be able to discuss freely on various topics.

Fact Checking in Community Forums

3 code implementations 8 Mar 2018 Tsvetomila Mihaylova, Preslav Nakov, Lluis Marquez, Alberto Barron-Cedeno, Mitra Mohtarami, Georgi Karadzhov, James Glass

Community Question Answering (cQA) forums are very popular nowadays, as they represent effective means for communities around particular topics to share information.

Community Question Answering Fact Checking

Automatic Fact-Checking Using Context and Discourse Information

1 code implementation 4 Aug 2019 Pepa Atanasova, Preslav Nakov, Lluís Màrquez, Alberto Barrón-Cedeño, Georgi Karadzhov, Tsvetomila Mihaylova, Mitra Mohtarami, James Glass

We study the problem of automatic fact-checking, paying special attention to the impact of contextual and discourse information.

Fact Checking

On the Risk of Misinformation Pollution with Large Language Models

1 code implementation 23 May 2023 Yikang Pan, Liangming Pan, Wenhu Chen, Preslav Nakov, Min-Yen Kan, William Yang Wang

In this paper, we comprehensively investigate the potential misuse of modern Large Language Models (LLMs) for generating credible-sounding misinformation and its subsequent impact on information-intensive applications, particularly Open-Domain Question Answering (ODQA) systems.

Misinformation Open-Domain Question Answering

Predicting the Leading Political Ideology of YouTube Channels Using Acoustic, Textual, and Metadata Information

1 code implementation 20 Oct 2019 Yoan Dinkov, Ahmed Ali, Ivan Koychev, Preslav Nakov

Our analysis shows that the use of acoustic signal helped to improve bias detection by more than 6% absolute over using text and metadata only.

Bias Detection Multimodal Deep Learning

Fully Automated Fact Checking Using External Sources

1 code implementation RANLP 2017 Georgi Karadzhov, Preslav Nakov, Lluis Marquez, Alberto Barron-Cedeno, Ivan Koychev

Given the constantly growing proliferation of false claims online in recent years, there has been also a growing research interest in automatically distinguishing false rumors from factually true claims.

Community Question Answering Fact Checking

In Search of Credible News

1 code implementation 19 Nov 2019 Momchil Hardalov, Ivan Koychev, Preslav Nakov

As this is an understudied problem, especially for languages other than English, we first collect and release to the research community three new balanced credible vs. fake news datasets derived from four online sources.

bgGLUE: A Bulgarian General Language Understanding Evaluation Benchmark

2 code implementations 4 Jun 2023 Momchil Hardalov, Pepa Atanasova, Todor Mihaylov, Galia Angelova, Kiril Simov, Petya Osenova, Ves Stoyanov, Ivan Koychev, Preslav Nakov, Dragomir Radev

We run the first systematic evaluation of pre-trained language models for Bulgarian, comparing and contrasting results across the nine tasks in the benchmark.

Fact Checking named-entity-recognition +5

We Built a Fake News & Click-bait Filter: What Happened Next Will Blow Your Mind!

1 code implementation 10 Mar 2018 Georgi Karadzhov, Pepa Gencheva, Preslav Nakov, Ivan Koychev

So, we did this research on fake news/click-bait detection and trust us, it is totally great research, it really is!

On a Novel Application of Wasserstein-Procrustes for Unsupervised Cross-Lingual Learning

1 code implementation 18 Jul 2020 Guillem Ramírez, Rumen Dangovski, Preslav Nakov, Marin Soljačić

We believe that our rethinking of the Wasserstein-Procrustes problem could enable further research, thus helping to develop better algorithms for aligning word embeddings across languages.

Word Embeddings

SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval)

2 code implementations SEMEVAL 2019 Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Sara Rosenthal, Noura Farra, Ritesh Kumar

We present the results and the main findings of SemEval-2019 Task 6 on Identifying and Categorizing Offensive Language in Social Media (OffensEval).

Language Identification

CrowdChecked: Detecting Previously Fact-Checked Claims in Social Media

1 code implementation 10 Oct 2022 Momchil Hardalov, Anton Chernyavskiy, Ivan Koychev, Dmitry Ilvovsky, Preslav Nakov

Thus, an interesting approach has emerged: to perform automatic fact-checking by verifying whether an input claim has been previously fact-checked by professional fact-checkers and to return back an article that explains their decision.

Fact Checking

IITD at the WANLP 2022 Shared Task: Multilingual Multi-Granularity Network for Propaganda Detection

1 code implementation 31 Oct 2022 Shubham Mittal, Preslav Nakov

In addition to finding the techniques, Subtask 2 further asks to identify the textual span for each instance of each technique that is present in the tweet; the task can be modeled as a sequence tagging problem.

Multi-Label Classification Propaganda detection +1

A Template Is All You Meme

1 code implementation 11 Nov 2023 Luke Bates, Peter Ebert Christensen, Preslav Nakov, Iryna Gurevych

Here, to aid understanding of memes, we release a knowledge base of memes and information found on www.knowyourmeme.com, which we call the Know Your Meme Knowledge Base (KYMKB), composed of more than 54,000 images.

On the Impact of Seed Words on Sentiment Polarity Lexicon Induction

1 code implementation COLING 2016 Dame Jovanoski, Veno Pachovski, Preslav Nakov

Sentiment polarity lexicons are key resources for sentiment analysis, and researchers have invested a lot of efforts in their manual creation.

Sentiment Analysis Text Classification

AraStance: A Multi-Country and Multi-Domain Dataset of Arabic Stance Detection for Fact Checking

1 code implementation NAACL (NLP4IF) 2021 Tariq Alhindi, Amal Alabdulkarim, Ali Alshehri, Muhammad Abdul-Mageed, Preslav Nakov

With the continuing spread of misinformation and disinformation online, it is of increasing importance to develop combating mechanisms at scale in the form of automated systems that support multiple languages.

Fact Checking Misinformation +1

Towards Automated Customer Support

1 code implementation 2 Sep 2018 Momchil Hardalov, Ivan Koychev, Preslav Nakov

Recent years have seen growing interest in conversational agents, such as chatbots, which are a very good fit for automated customer support because the domain in which they need to operate is narrow.

Information Retrieval Machine Translation +3

Detecting Toxicity in News Articles: Application to Bulgarian

1 code implementation RANLP 2019 Yoan Dinkov, Ivan Koychev, Preslav Nakov

Online media aim for reaching ever bigger audience and for attracting ever longer attention span.

Evaluating Pronominal Anaphora in Machine Translation: An Evaluation Measure and a Test Suite

2 code implementations IJCNLP 2019 Prathyusha Jwalapuram, Shafiq Joty, Irina Temnikova, Preslav Nakov

The ongoing neural revolution in machine translation has made it easier to model larger contexts beyond the sentence-level, which can potentially help resolve some discourse-level ambiguities such as pronominal anaphora, thus enabling better translations.

Machine Translation Sentence +1

Detecting the Role of an Entity in Harmful Memes: Techniques and Their Limitations

1 code implementation CONSTRAINT (ACL) 2022 Rabindra Nath Nandi, Firoj Alam, Preslav Nakov

The content that is posted and shared online can be textual, visual, or a combination of both, e.g., in a meme.

From Chaos to Clarity: Claim Normalization to Empower Fact-Checking

1 code implementation 22 Oct 2023 Megha Sundriyal, Tanmoy Chakraborty, Preslav Nakov

To evaluate the effectiveness of our proposed model, we meticulously compile a comprehensive real-world dataset, CLAN, comprising more than 6k instances of social media posts alongside their respective normalized claims.

Fact Checking In-Context Learning

Assisting the Human Fact-Checkers: Detecting All Previously Fact-Checked Claims in a Document

1 code implementation 14 Sep 2021 Shaden Shaar, Nikola Georgiev, Firoj Alam, Giovanni Da San Martino, Aisha Mohamed, Preslav Nakov

The output is a re-ranked list of the document sentences, so that those that can be verified are ranked as high as possible, together with corresponding evidence.

Fact Checking Learning-To-Rank +2

DISARM: Detecting the Victims Targeted by Harmful Memes

1 code implementation Findings (NAACL) 2022 Shivam Sharma, Md. Shad Akhtar, Preslav Nakov, Tanmoy Chakraborty

Finally, we show that DISARM is interpretable and comparatively more generalizable and that it can reduce the relative error rate for harmful target identification by up to 9 points absolute over several strong multimodal rivals.

Named Entity Recognition Named Entity Recognition (NER) +1

Detecting Propaganda Techniques in Code-Switched Social Media Text

1 code implementation 23 May 2023 Muhammad Umar Salman, Asif Hanif, Shady Shehata, Preslav Nakov

Yet, it is common to find a mix of multiple languages in social media communication, a phenomenon known as code-switching.

Propaganda detection

Generating Zero-shot Abstractive Explanations for Rumour Verification

1 code implementation 23 Jan 2024 Iman Munire Bilal, Preslav Nakov, Rob Procter, Maria Liakata

The task of rumour verification in social media concerns assessing the veracity of a claim on the basis of conversation threads that result from it.

Few-Shot Learning Informativeness +2

Integrating Stance Detection and Fact Checking in a Unified Corpus

no code implementations NAACL 2018 Ramy Baly, Mitra Mohtarami, James Glass, Lluis Marquez, Alessandro Moschitti, Preslav Nakov

A reasonable approach for fact checking a claim involves retrieving potentially relevant documents from different sources (e.g., news websites, social media, etc.

Fact Checking Retrieval +1

Automatic Stance Detection Using End-to-End Memory Networks

no code implementations NAACL 2018 Mitra Mohtarami, Ramy Baly, James Glass, Preslav Nakov, Lluis Marquez, Alessandro Moschitti

We present a novel end-to-end memory network for stance detection, which jointly (i) predicts whether a document agrees, disagrees, discusses or is unrelated with respect to a given target claim, and also (ii) extracts snippets of evidence for that prediction.

Stance Detection

Machine Translation Evaluation with Neural Networks

no code implementations 5 Oct 2017 Francisco Guzmán, Shafiq R. Joty, Lluís Màrquez, Preslav Nakov

We present a framework for machine translation evaluation using neural networks in a pairwise setting, where the goal is to select the better translation from a pair of hypotheses, given the reference translation.

Machine Translation Sentence +1

Discourse Structure in Machine Translation Evaluation

no code implementations CL 2017 Shafiq Joty, Francisco Guzmán, Lluís Màrquez, Preslav Nakov

In this article, we explore the potential of using sentence-level discourse structure for machine translation evaluation.

Machine Translation Sentence +1

Semantic Sentiment Analysis of Twitter Data

no code implementations 4 Oct 2017 Preslav Nakov

Internet and the proliferation of smart mobile devices have changed the way information is created, shared, and spreads, e.g., microblogs such as Twitter, weblogs such as LiveJournal, social networks such as Facebook, and instant messengers such as Skype and WhatsApp are now commonly used to share thoughts and opinions about anything in the surrounding world.

Marketing Sentiment Analysis

Cross-Language Question Re-Ranking

no code implementations 4 Oct 2017 Giovanni Da San Martino, Salvatore Romeo, Alberto Barron-Cedeno, Shafiq Joty, Lluis Marquez, Alessandro Moschitti, Preslav Nakov

We compare a kernel-based system with a feed-forward neural network in a scenario where a large parallel corpus is available for training a machine translation system, bilingual dictionaries, and cross-language word embeddings.

Machine Translation Re-Ranking +1

Robust Tuning Datasets for Statistical Machine Translation

no code implementations RANLP 2017 Preslav Nakov, Stephan Vogel

We explore the idea of automatically crafting a tuning dataset for Statistical Machine Translation (SMT) that makes the hyper-parameters of the SMT system more robust with respect to some specific deficiencies of the parameter tuning algorithms.

Machine Translation Sentence +1

Large-Scale Goodness Polarity Lexicons for Community Question Answering

no code implementations 20 Jul 2017 Todor Mihaylov, Daniel Belchev, Yasen Kiprov, Ivan Koychev, Preslav Nakov

This leads us to the idea to build a good/bad polarity lexicon as an analogy to the positive/negative sentiment polarity lexicons, commonly used in sentiment analysis.

Community Question Answering Sentiment Analysis

Cross-language Learning with Adversarial Neural Networks: Application to Community Question Answering

no code implementations 21 Jun 2017 Shafiq Joty, Preslav Nakov, Lluís Màrquez, Israa Jaradat

We address the problem of cross-language adaptation for question-question similarity reranking in community question answering, with the objective to port a system trained on one input language to another input language given labeled training data for the first language and only unlabeled data for the second language.

Community Question Answering Question Similarity

Bi-Text Alignment of Movie Subtitles for Spoken English-Arabic Statistical Machine Translation

no code implementations 5 Sep 2016 Fahad Al-Obaidli, Stephen Cox, Preslav Nakov

In particular, we look at movie subtitles as a unique, rich resource, as subtitles in one language often get translated into other languages.

Machine Translation Translation

Joint Multitask Learning for Community Question Answering Using Task-Specific Embeddings

no code implementations EMNLP 2018 Shafiq Joty, Lluis Marquez, Preslav Nakov

We address jointly two important tasks for Question Answering in community forums: given a new question, (i) find related existing questions, and (ii) find relevant answers to this new question.

Community Question Answering

Cross-language Learning with Adversarial Neural Networks

no code implementations CONLL 2017 Shafiq Joty, Preslav Nakov, Lluís Màrquez, Israa Jaradat

We address the problem of cross-language adaptation for question-question similarity reranking in community question answering, with the objective to port a system trained on one input language to another input language given labeled training data for the first language and only unlabeled data for the second language.

Community Question Answering Domain Adaptation +3

Findings of the VarDial Evaluation Campaign 2017

no code implementations WS 2017 Marcos Zampieri, Shervin Malmasi, Nikola Ljubešić, Preslav Nakov, Ahmed Ali, Jörg Tiedemann, Yves Scherrer, Noëmi Aepli

We present the results of the VarDial Evaluation Campaign on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects, which we organized as part of the fourth edition of the VarDial workshop at EACL'2017.

Dependency Parsing Dialect Identification

Negation and Modality in Machine Translation

no code implementations WS 2016 Preslav Nakov

In particular, I will demonstrate how contemporary MT systems fail on them, and I will discuss some possible solutions.

Machine Translation Negation +3

Do Not Trust the Trolls: Predicting Credibility in Community Question Answering Forums

1 code implementation RANLP 2017 Preslav Nakov, Tsvetomila Mihaylova, Lluís Màrquez, Yashkumar Shiroya, Ivan Koychev

We address information credibility in community forums, in a setting in which the credibility of an answer posted in a question thread by a particular user has to be predicted.

Community Question Answering Information Retrieval

Machine Reading Comprehension for Answer Re-Ranking in Customer Support Chatbots

no code implementations 12 Feb 2019 Momchil Hardalov, Ivan Koychev, Preslav Nakov

Recent advances in deep neural networks, language modeling and language generation have introduced new ideas to the field of conversational agents.

Information Retrieval Language Modelling +4

Multi-Task Ordinal Regression for Jointly Predicting the Trustworthiness and the Leading Political Ideology of News Media

no code implementations NAACL 2019 Ramy Baly, Georgi Karadzhov, Abdelrhman Saleh, James Glass, Preslav Nakov

In the context of fake news, bias, and propaganda, we study two important but relatively under-explored problems: (i) trustworthiness estimation (on a 3-point scale) and (ii) political ideology detection (left/right bias on a 7-point scale) of entire news outlets, as opposed to evaluating individual articles.

Evaluating Variable-Length Multiple-Option Lists in Chatbots and Mobile Search

no code implementations 25 May 2019 Pepa Atanasova, Georgi Karadzhov, Yasen Kiprov, Preslav Nakov, Fabrizio Sebastiani

While typically a user would expect a single response at any utterance, a system could also return multiple options for the user to select from, based on different system understandings of the user's intent.

Question Answering

One Size Does Not Fit All: Comparing NMT Representations of Different Granularities

no code implementations NAACL 2019 Nadir Durrani, Fahim Dalvi, Hassan Sajjad, Yonatan Belinkov, Preslav Nakov

Recent work has shown that contextualized word representations derived from neural machine translation are a viable alternative to such from simple word predictions tasks.

Machine Translation NMT +1

Rotational Unit of Memory: A Novel Representation Unit for RNNs with Scalable Applications

no code implementations TACL 2019 Rumen Dangovski, Li Jing, Preslav Nakov, Mićo Tatalović, Marin Soljačić

Stacking long short-term memory (LSTM) cells or gated recurrent units (GRUs) as part of a recurrent neural network (RNN) has become a standard approach to solving a number of tasks ranging from language modeling to text summarization.

Language Modelling Text Summarization

Recursive Style Breach Detection with Multifaceted Ensemble Learning

no code implementations 17 Jun 2019 Daniel Kopev, Dimitrina Zlatkova, Kristiyan Mitov, Atanas Atanasov, Momchil Hardalov, Ivan Koychev, Preslav Nakov

We present a supervised approach for style change detection, which aims at predicting whether there are changes in the style in a given text document, as well as at finding the exact positions where such changes occur.

Change Detection Ensemble Learning +1

Predicting the Topical Stance of Media and Popular Twitter Users

no code implementations 2 Jul 2019 Peter Stefanov, Kareem Darwish, Atanas Atanasov, Preslav Nakov

Discovering the stances of media outlets and influential people on current, debatable topics is important for social statisticians and policy makers.

A Morpho-Syntactically Informed LSTM-CRF Model for Named Entity Recognition

no code implementations RANLP 2019 Lilia Simeonova, Kiril Simov, Petya Osenova, Preslav Nakov

We propose a morphologically informed model for named entity recognition, which is based on LSTM-CRF architecture and combines word embeddings, Bi-LSTM character embeddings, part-of-speech (POS) tags, and morphological information.

named-entity-recognition Named Entity Recognition +3

Fact-Checking Meets Fauxtography: Verifying Claims About Images

1 code implementation IJCNLP 2019 Dimitrina Zlatkova, Preslav Nakov, Ivan Koychev

The recent explosion of false claims in social media and on the Web in general has given rise to a lot of manual fact-checking initiatives.

Fact Checking

Predicting the Role of Political Trolls in Social Media

1 code implementation CONLL 2019 Atanas Atanasov, Gianmarco De Francisci Morales, Preslav Nakov

In particular, we show how to classify trolls according to their political role ---left, news feed, right--- by using features extracted from social media, i.e., Twitter, in two scenarios: (i) in a traditional supervised learning scenario, where labels for trolls are available, and (ii) in a distant supervision scenario, where labels for trolls are not available, and we rely on more-commonly-available labels for news outlets mentioned by the trolls.

Contrastive Language Adaptation for Cross-Lingual Stance Detection

no code implementations IJCNLP 2019 Mitra Mohtarami, James Glass, Preslav Nakov

In particular, we introduce a novel contrastive language adaptation approach applied to memory networks, which ensures accurate alignment of stances in the source and target languages, and can effectively deal with the challenge of limited labeled data in the target language.

Stance Detection

Findings of the NLP4IF-2019 Shared Task on Fine-Grained Propaganda Detection

no code implementations WS 2019 Giovanni Da San Martino, Alberto Barrón-Cedeño, Preslav Nakov

FLC is a fragment-level task that asks for the identification of propagandist text fragments in a news article and also for the prediction of the specific propaganda technique used in each such fragment (18-way classification task).

Binary Classification General Classification +2

Experiments in Detecting Persuasion Techniques in the News

no code implementations 15 Nov 2019 Seunghak Yu, Giovanni Da San Martino, Preslav Nakov

Many recent political events, like the 2016 US Presidential elections or the 2018 Brazilian elections have raised the attention of institutions and of the general public on the role of Internet and social media in influencing the outcome of these events.

A Hybrid Morpheme-Word Representation for Machine Translation of Morphologically Rich Languages

no code implementations 19 Nov 2019 Minh-Thang Luong, Preslav Nakov, Min-Yen Kan

We propose a language-independent approach for improving statistical machine translation for morphologically rich languages using a hybrid morpheme-word representation where the basic unit of translation is the morpheme, but word boundaries are respected at all stages of the translation process.

Machine Translation Sentence +1

Paraphrasing Verbs for Noun Compound Interpretation

no code implementations 20 Nov 2019 Preslav Nakov

An important challenge for the automatic analysis of English written text is the abundance of noun compounds: sequences of nouns acting as a single noun.

Natural Language Inference Sentence

Global Thread-Level Inference for Comment Classification in Community Question Answering

no code implementations EMNLP 2015 Shafiq Joty, Alberto Barrón-Cedeño, Giovanni Da San Martino, Simone Filice, Lluís Màrquez, Alessandro Moschitti, Preslav Nakov

Community question answering, a recent evolution of question answering in the Web context, allows a user to quickly consult the opinion of a number of people on a particular topic, thus taking advantage of the wisdom of the crowd.

Community Question Answering General Classification

SemEval-2015 Task 3: Answer Selection in Community Question Answering

no code implementations SEMEVAL 2015 Preslav Nakov, Lluís Màrquez, Walid Magdy, Alessandro Moschitti, James Glass, Bilal Randeree

Community Question Answering (cQA) provides new interesting research directions to the traditional Question Answering (QA) field, e.g., the exploitation of the interaction between users and the structure of related posts.

Answer Selection Community Question Answering

Large-Scale Noun Compound Interpretation Using Bootstrapping and the Web as a Corpus

no code implementations 27 Nov 2019 Su Nam Kim, Preslav Nakov

We employ bootstrapping and web statistics, and utilize the relationship between NCs and paraphrasing patterns to jointly extract NCs and such patterns in multiple alternating iterations.

DiscoTK: Using Discourse Structure for Machine Translation Evaluation

no code implementations WS 2014 Shafiq Joty, Francisco Guzmán, Lluís Màrquez, Preslav Nakov

We present novel automatic metrics for machine translation evaluation that use discourse structure and convolution kernels to compare the discourse tree of an automatic translation with that of the human reference.

Machine Translation Translation

Language-Independent Sentiment Analysis Using Subjectivity and Positional Information

no code implementations 28 Nov 2019 Veselin Raychev, Preslav Nakov

We describe a novel language-independent approach to the task of determining the polarity, positive or negative, of the author's opinion on a specific topic in natural language text.

Attribute General Classification +3

Using the Web as an Implicit Training Set: Application to Noun Compound Syntax and Semantics

no code implementations 23 Nov 2019 Preslav Nakov

I address noun compound semantics by automatically generating paraphrasing verbs and prepositions that make explicit the hidden semantic relations between the nouns in a noun compound.

Information Retrieval Machine Translation +4

Towards Constructing a Corpus for Studying the Effects of Treatments and Substances Reported in PubMed Abstracts

no code implementations 4 Dec 2019 Evgeni Stefchov, Galia Angelova, Preslav Nakov

We present the construction of an annotated corpus of PubMed abstracts reporting about positive, negative or neutral effects of treatments or substances.

Sentence text-classification +1

Machine Translation Evaluation Meets Community Question Answering

no code implementations ACL 2016 Francisco Guzmán, Lluís Màrquez, Preslav Nakov

We explore the applicability of machine translation evaluation (MTE) methods to a very different problem: answer ranking in community Question Answering.

Community Question Answering Machine Translation +1

Pairwise Neural Machine Translation Evaluation

no code implementations IJCNLP 2015 Francisco Guzmán, Shafiq Joty, Lluís Màrquez, Preslav Nakov

We present a novel framework for machine translation evaluation using neural networks in a pairwise setting, where the goal is to select the better translation from a pair of hypotheses, given the reference translation.

Machine Translation Sentence +2

Proppy: A System to Unmask Propaganda in Online News

no code implementations 14 Dec 2019 Alberto Barrón-Cedeño, Giovanni Da San Martino, Israa Jaradat, Preslav Nakov

We present proppy, the first publicly available real-world, real-time propaganda detection system for online news, which aims at raising awareness, thus potentially limiting the impact of propaganda and helping fight disinformation.

Propaganda detection

SemEval-2013 Task 2: Sentiment Analysis in Twitter

no code implementations SEMEVAL 2013 Preslav Nakov, Zornitsa Kozareva, Alan Ritter, Sara Rosenthal, Veselin Stoyanov, Theresa Wilson

To address this issue, we have proposed SemEval-2013 Task 2: Sentiment Analysis in Twitter, which included two subtasks: A, an expression-level subtask, and B, a message-level subtask.

Sentiment Analysis Task 2

A Context-Aware Approach for Detecting Check-Worthy Claims in Political Debates

no code implementations 14 Dec 2019 Pepa Gencheva, Ivan Koychev, Lluís Màrquez, Alberto Barrón-Cedeño, Preslav Nakov

In the context of investigative journalism, we address the problem of automatically identifying which claims in a given document are most worthy and should be prioritized for fact-checking.

Fact Checking

Compressing Large-Scale Transformer-Based Models: A Case Study on BERT

no code implementations 27 Feb 2020 Prakhar Ganesh, Yao Chen, Xin Lou, Mohammad Ali Khan, Yin Yang, Hassan Sajjad, Preslav Nakov, Deming Chen, Marianne Winslett

Pre-trained Transformer-based models have achieved state-of-the-art performance for various Natural Language Processing (NLP) tasks.

Model Compression

Enriched Pre-trained Transformers for Joint Slot Filling and Intent Detection

no code implementations 30 Apr 2020 Momchil Hardalov, Ivan Koychev, Preslav Nakov

Recently, the advances in pre-trained language models, namely contextualized models such as ELMo and BERT, have revolutionized the field by tapping the potential of training very large models with just a few steps of fine-tuning on a task-specific dataset.

Intent Detection Natural Language Understanding +2

SOLID: A Large-Scale Semi-Supervised Dataset for Offensive Language Identification

no code implementations Findings (ACL) 2021 Sara Rosenthal, Pepa Atanasova, Georgi Karadzhov, Marcos Zampieri, Preslav Nakov

The widespread use of offensive content in social media has led to an abundance of research in detecting language such as hate speech, cyberbullying, and cyber-aggression.

Language Identification

Predicting the Topical Stance and Political Leaning of Media using Tweets

no code implementations ACL 2020 Peter Stefanov, Kareem Darwish, Atanas Atanasov, Preslav Nakov

Discovering the stances of media outlets and influential people on current, debatable topics is important for social statisticians and policy makers.

A Survey on Computational Propaganda Detection

no code implementations 15 Jul 2020 Giovanni Da San Martino, Stefano Cresci, Alberto Barrón-Cedeño, Seunghak Yu, Roberto Di Pietro, Preslav Nakov

Propaganda campaigns aim at influencing people's mindset with the purpose of advancing a specific agenda.

Propaganda detection

Can We Spot the "Fake News" Before It Was Even Written?

no code implementations 10 Aug 2020 Preslav Nakov

Given the recent proliferation of disinformation online, there has also been growing research interest in automatically debunking rumors, false claims, and "fake news."

Fact Checking

Detecting Harmful Content On Online Platforms: What Platforms Need Vs. Where Research Efforts Go

no code implementations 27 Feb 2021 Arnav Arora, Preslav Nakov, Momchil Hardalov, Sheikh Muhammad Sarwar, Vibha Nayak, Yoan Dinkov, Dimitrina Zlatkova, Kyle Dent, Ameya Bhatawdekar, Guillaume Bouchard, Isabelle Augenstein

The proliferation of harmful content on online platforms is a major societal problem, which comes in many different forms, including hate speech, offensive language, bullying and harassment, misinformation, spam, violence, graphic content, sexual abuse, self-harm, and many others.

Abusive Language Misinformation

A Survey on Stance Detection for Mis- and Disinformation Identification

no code implementations Findings (NAACL) 2022 Momchil Hardalov, Arnav Arora, Preslav Nakov, Isabelle Augenstein

Understanding attitudes expressed in texts, also known as stance detection, plays an important role in systems for detecting false information online, be it misinformation (unintentionally false) or disinformation (intentionally false information).

Fact Checking Misinformation +3

Automated Fact-Checking for Assisting Human Fact-Checkers

no code implementations 13 Mar 2021 Preslav Nakov, David Corney, Maram Hasanain, Firoj Alam, Tamer Elsayed, Alberto Barrón-Cedeño, Paolo Papotti, Shaden Shaar, Giovanni Da San Martino

The reporting and the analysis of current events around the globe have expanded from professional, editor-led journalism all the way to citizen journalism.

Fact Checking

A Survey on Predicting the Factuality and the Bias of News Media

no code implementations 16 Mar 2021 Preslav Nakov, Husrev Taha Sencar, Jisun An, Haewoon Kwak

The present level of proliferation of fake, biased, and propagandistic content online has made it impossible to fact-check every single suspicious claim or article, either manually or automatically.

Bias Detection Fact Checking +1

Transformers: "The End of History" for NLP?

no code implementations 9 Apr 2021 Anton Chernyavskiy, Dmitry Ilvovsky, Preslav Nakov

Recent advances in neural architectures, such as the Transformer, coupled with the emergence of large-scale pre-trained models such as BERT, have revolutionized the field of Natural Language Processing (NLP), pushing the state of the art for a number of NLP tasks.

Fact-Checking, Fake News, Propaganda, and Media Bias: Truth Seeking in the Post-Truth Era

no code implementations EMNLP 2020 Preslav Nakov, Giovanni Da San Martino

The rise of social media has democratized content creation and has made it easy for everybody to share and spread information online.

Fact Checking Misinformation

Predicting the Factuality of Reporting of News Media Using Observations About User Attention in Their YouTube Channels

no code implementations RANLP 2021 Krasimira Bozhanova, Yoan Dinkov, Ivan Koychev, Maria Castaldo, Tommaso Venturini, Preslav Nakov

We propose a novel framework for predicting the factuality of reporting of news media outlets by studying the user attention cycles in their YouTube channels.

A Second Pandemic? Analysis of Fake News About COVID-19 Vaccines in Qatar

no code implementations RANLP 2021 Preslav Nakov, Firoj Alam, Shaden Shaar, Giovanni Da San Martino, Yifan Zhang

While COVID-19 vaccines are finally becoming widely available, a second pandemic that revolves around the circulation of anti-vaxxer fake news may hinder efforts to recover from the first one.

The Spread of Propaganda by Coordinated Communities on Social Media

no code implementations 27 Sep 2021 Kristina Hristakieva, Stefano Cresci, Giovanni Da San Martino, Mauro Conti, Preslav Nakov

Large-scale manipulations on social media have two important characteristics: (i) use of propaganda to influence others, and (ii) adoption of coordinated behavior to spread it and to amplify its impact.

Analyzing the Use of Character-Level Translation with Sparse and Noisy Datasets

no code implementations RANLP 2013 Jörg Tiedemann, Preslav Nakov

This paper provides an analysis of character-level machine translation models used in pivot-based translation when applied to sparse and noisy datasets, such as crowdsourced movie subtitles.

Machine Translation Translation

Feature-Rich Named Entity Recognition for Bulgarian Using Conditional Random Fields

no code implementations 26 Sep 2021 Georgi Georgiev, Preslav Nakov, Kuzman Ganchev, Petya Osenova, Kiril Ivanov Simov

The paper presents a feature-rich approach to the automatic recognition and categorization of named entities (persons, organizations, locations, and miscellaneous) in news text for Bulgarian.

Miscellaneous named-entity-recognition +2
