Search Results for author: Punyajoy Saha

Found 22 papers, 15 papers with code

On Zero-Shot Counterspeech Generation by LLMs

1 code implementation 22 Mar 2024 Punyajoy Saha, Aalok Agrawal, Abhik Jana, Chris Biemann, Animesh Mukherjee

In terms of prompting, we find that our proposed strategies help in improving counter speech generation across all the models.

InfFeed: Influence Functions as a Feedback to Improve the Performance of Subjective Tasks

no code implementations 22 Feb 2024 Somnath Banerjee, Maulindu Sarkar, Punyajoy Saha, Binny Mathew, Animesh Mukherjee

Second, in a dataset extension exercise, using influence functions to automatically identify data points that have been initially 'silver' annotated by some existing method and need to be cross-checked (and corrected) by annotators to improve the model performance.

Sarcasm Detection, Stance Classification

Zero shot VLMs for hate meme detection: Are we there yet?

no code implementations 19 Feb 2024 Naquee Rizwan, Paramananda Bhaskar, Mithun Das, Swadhin Satyaprakash Majhi, Punyajoy Saha, Animesh Mukherjee

In this study, we aim to investigate the efficacy of these visual language models in handling intricate tasks such as hate meme detection.

Zero-Shot Learning

Low-Resource Counterspeech Generation for Indic Languages: The Case of Bengali and Hindi

no code implementations 11 Feb 2024 Mithun Das, Saurabh Kumar Pandey, Shivansh Sethi, Punyajoy Saha, Animesh Mukherjee

With the rise of online abuse, the NLP community has begun investigating the use of neural architectures to generate counterspeech that can "counter" the vicious tone of such abusive speech and dilute/ameliorate their rippling effect over the social network.

Probing LLMs for hate speech detection: strengths and vulnerabilities

no code implementations 19 Oct 2023 Sarthak Roy, Ashish Harshavardhan, Animesh Mukherjee, Punyajoy Saha

Recently, efforts have been made by social media platforms as well as researchers to detect hateful or toxic language using large language models.

Hate Speech Detection

HateMM: A Multi-Modal Dataset for Hate Video Classification

1 code implementation 6 May 2023 Mithun Das, Rohit Raj, Punyajoy Saha, Binny Mathew, Manish Gupta, Animesh Mukherjee

Hate speech has become one of the most significant issues in modern society, having implications in both the online and the offline world.

Classification, Hate Speech Detection, +1

On the rise of fear speech in online social media

1 code implementation 18 Mar 2023 Punyajoy Saha, Kiran Garimella, Narla Komal Kalyan, Saurabh Kumar Pandey, Pauras Mangesh Meher, Binny Mathew, Animesh Mukherjee

Recently, social media platforms are heavily moderated to prevent the spread of online hate speech, which is usually fertile in toxic words and is directed toward an individual or a community.

HateProof: Are Hateful Meme Detection Systems really Robust?

no code implementations 11 Feb 2023 Piush Aggarwal, Pranit Chawla, Mithun Das, Punyajoy Saha, Binny Mathew, Torsten Zesch, Animesh Mukherjee

Empirically, we find a noticeable performance drop of as high as 10% in the macro-F1 score for certain attacks.

Contrastive Learning

Rationale-Guided Few-Shot Classification to Detect Abusive Language

1 code implementation 30 Nov 2022 Punyajoy Saha, Divyanshu Sheth, Kushal Kedia, Binny Mathew, Animesh Mukherjee

We introduce two rationale-integrated BERT-based architectures (the RGFS models) and evaluate our systems over five different abusive language datasets, finding that in the few-shot classification setting, RGFS-based models outperform baseline models by about 7% in macro F1 scores and perform competitively to models finetuned on other source domains.

Abusive Language, Classification, +1

Hate Speech and Offensive Language Detection in Bengali

1 code implementation 7 Oct 2022 Mithun Das, Somnath Banerjee, Punyajoy Saha, Animesh Mukherjee

To overcome the existing research's limitations, in this study, we develop an annotated dataset of 10K Bengali posts consisting of 5K actual and 5K Romanized Bengali tweets.

Hate Speech Detection

CounterGeDi: A controllable approach to generate polite, detoxified and emotional counterspeech

1 code implementation 9 May 2022 Punyajoy Saha, Kanishk Singh, Adarsh Kumar, Binny Mathew, Animesh Mukherjee

We generate counterspeech using three datasets and observe significant improvement across different attribute scores.

Attribute

HateCheckHIn: Evaluating Hindi Hate Speech Detection Models

1 code implementation LREC 2022 Mithun Das, Punyajoy Saha, Binny Mathew, Animesh Mukherjee

To enable more targeted diagnostic insights of such multilingual hate speech models, we introduce a set of functionalities for the purpose of evaluation.

Hate Speech Detection

Abusive and Threatening Language Detection in Urdu using Boosting based and BERT based models: A Comparative Approach

1 code implementation 27 Nov 2021 Mithun Das, Somnath Banerjee, Punyajoy Saha

In this FIRE 2021 shared task, "HASOC - Abusive and Threatening language detection in Urdu", the organizers propose an abusive language detection dataset in Urdu along with threatening language detection.

Abusive Language

"Short is the Road that Leads from Fear to Hate": Fear Speech in Indian WhatsApp Groups

2 code implementations 7 Feb 2021 Punyajoy Saha, Binny Mathew, Kiran Garimella, Animesh Mukherjee

We observe that users writing fear speech messages use various events and symbols to create the illusion of fear among the reader about a target community.

HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection

6 code implementations 18 Dec 2020 Binny Mathew, Punyajoy Saha, Seid Muhie Yimam, Chris Biemann, Pawan Goyal, Animesh Mukherjee

We also observe that models, which utilize the human rationales for training, perform better in reducing unintended bias towards target communities.

Hate Speech Detection, Text Classification

HateMonitors: Language Agnostic Abuse Detection in Social Media

1 code implementation 27 Sep 2019 Punyajoy Saha, Binny Mathew, Pawan Goyal, Animesh Mukherjee

In this paper, we present our machine learning model, HateMonitor, developed for Hate Speech and Offensive Content Identification in Indo-European Languages (HASOC), a shared task at FIRE 2019.

Abuse Detection, Abusive Language, +1
