UCCNLP@SMM4H’22:Label distribution aware long-tailed learning with post-hoc posterior calibration applied to text classification

The paper describes our submissions for the Social Media Mining for Health (SMM4H) workshop 2022 shared tasks. We participated in 2 tasks: (1) classification of adverse drug events (ADE) mentions in english tweets (Task-1a) and (2) classification of self-reported intimate partner violence (IPV) on twitter (Task 7). We proposed an approach that uses RoBERTa (A Robustly Optimized BERT Pretraining Approach) fine-tuned with a label distribution-aware margin loss function and post-hoc posterior calibration for robust inference against class imbalance. We achieved a 4% and 1 % increase in performance on IPV and ADE respectively when compared with the traditional fine-tuning strategy with unweighted cross-entropy loss.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here