FiNLP at FinCausal 2020 Task 1: Mixture of BERTs for Causal Sentence Identification in Financial Texts

FNP (COLING) 2020 · Sarthak Gupta

This paper describes our system for Sub-task 1 of the FinCausal shared task at the FNP-FNS workshop, held in conjunction with COLING-2020. The system classifies whether a financial news text segment contains causality. To address this task, we fine-tune and ensemble generic BERT language models and domain-specific BERT language models pre-trained on financial text corpora. The task data is highly imbalanced, with non-causal segments forming the majority class; we therefore train the models using strategies such as under-sampling, cost-sensitive learning, and data augmentation. Our best system achieves a weighted F1-score of 96.98, securing 4th position on the evaluation leaderboard. The code is available at https://github.com/sarthakTUM/fincausal
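
As a rough illustration of the imbalance-handling and ensembling ideas named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes binary labels (1 = causal, 0 = non-causal), inverse-frequency class weights as one possible form of cost-sensitive learning, random under-sampling down to the minority-class size, and soft voting (averaging each model's predicted causal probability) as one possible ensembling rule. The function names `class_weights`, `undersample`, and `soft_vote` are hypothetical and do not come from the paper or its repository.

```python
import random
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumption: this is one common cost-sensitive weighting, not
    necessarily the one used in the paper.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * cnt) for c, cnt in counts.items()}


def undersample(examples, labels, seed=0):
    """Randomly keep as many examples per class as the rarest class has."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(examples, labels):
        by_class.setdefault(y, []).append(x)
    target = min(len(xs) for xs in by_class.values())
    pairs = [(x, y) for y, xs in by_class.items() for x in rng.sample(xs, target)]
    rng.shuffle(pairs)
    return [x for x, _ in pairs], [y for _, y in pairs]


def soft_vote(prob_lists, threshold=0.5):
    """Average per-model causal-class probabilities, then threshold.

    prob_lists holds one list of probabilities per model, aligned by example.
    """
    n_models = len(prob_lists)
    averaged = [sum(ps) / n_models for ps in zip(*prob_lists)]
    return [int(p >= threshold) for p in averaged]
```

In practice, weights like these would typically be passed to the loss function used when fine-tuning each BERT model, and the probabilities given to `soft_vote` would come from the fine-tuned generic and financial-domain models.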
