DH-FBK at SemEval-2022 Task 4: Leveraging Annotators’ Disagreement and Multiple Data Views for Patronizing Language Detection

The subtle and typically unconscious use of patronizing and condescending language (PCL) in large-audience media outlets undesirably feeds stereotypes and strengthens power-knowledge relationships, perpetuating discrimination towards vulnerable communities. Due to its subjective and subtle nature, PCL detection is an open and challenging problem, both for computational methods and human annotators. In this paper we describe the systems submitted by the DH-FBK team to SemEval-2022 Task 4, aiming at detecting PCL towards vulnerable communities in English media texts. Motivated by the subjectivity of human interpretation, we propose to leverage annotators’ uncertainty and disagreement to better capture the shades of PCL in a multi-task, multi-view learning framework. Our approach achieves competitive results, largely outperforming baselines and ranking on the top-left side of the leaderboard on both PCL identification and classification. Noticeably, our approach does not rely on any external data or model ensemble, making it a viable and attractive solution for real-world use.

PDF Abstract

Datasets


Results from the Paper


Task Dataset Model Metric Name Metric Value Global Rank Benchmark
Binary Condescension Detection DPM MTMW (UNC+SPAN) F1-score 57.89 # 3
Multi-label Condescension Detection DPM MTMW (AGR+COU+SPAN) Macro-F1 37.35 # 3

Methods


No methods listed for this paper. Add relevant methods here