no code implementations • EMNLP 2021 • Yang Liu, Hua Cheng, Russell Klopfer, Matthew R. Gormley, Thomas Schaaf
Multi-label document classification (MLDC) problems can be challenging, especially for long documents with a large label set and a long-tail distribution over labels.
Ranked #2 on Medical Code Prediction on MIMIC-III
no code implementations • 30 Nov 2022 • John Glover, Federico Fancellu, Vasudevan Jagannathan, Matthew R. Gormley, Thomas Schaaf
In this paper we systematically compare different granularities of decomposition -- from document to sub-sentence level -- and we show that the answer is no.
1 code implementation • 21 Nov 2022 • Arindam Ghosh, Thomas Schaaf, Matthew R. Gormley
In this paper, we propose AdaFocal, a calibration-aware adaptive focal loss that exploits the calibration properties of focal (and inverse-focal) loss. AdaFocal adaptively adjusts $\gamma_t$ for different groups of samples, based on $\gamma_{t-1}$ from the previous step and on the model's under- or over-confidence measured on the validation set.
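The idea described above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the function names and the multiplicative exponential update rule are assumptions made here for clarity, and the real AdaFocal groups samples by validation-set bins with additional safeguards.

```python
import numpy as np

def focal_loss(p, gamma):
    # Focal loss on the true-class probability p: -(1 - p)^gamma * log(p).
    # gamma = 0 recovers standard cross-entropy.
    return -((1.0 - p) ** gamma) * np.log(p)

def update_gamma(gamma_prev, avg_confidence, accuracy, rate=1.0, gamma_max=20.0):
    # Illustrative adaptive update (assumed form, not the paper's exact rule):
    # if the model is over-confident on this group of validation samples
    # (confidence > accuracy), raise gamma to down-weight easy, confident
    # examples more aggressively; if under-confident, lower gamma.
    gap = avg_confidence - accuracy
    gamma_next = gamma_prev * np.exp(rate * gap)
    return float(np.clip(gamma_next, 1e-3, gamma_max))
```

For example, a group whose average confidence exceeds its accuracy gets a larger gamma at the next step, while an under-confident group gets a smaller one, which is the qualitative behavior the abstract describes.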
1 code implementation • Findings (EMNLP) 2021 • Longxiang Zhang, Renato Negrinho, Arindam Ghosh, Vasudevan Jagannathan, Hamid Reza Hassanzadeh, Thomas Schaaf, Matthew R. Gormley
We show that fluent and adequate summaries can be generated with limited training data by fine-tuning BART on a specially constructed dataset.
1 code implementation • ACL 2020 • Taehee Jung, Dongyeop Kang, Hua Cheng, Lucas Mentch, Thomas Schaaf
Here we propose an end-to-end training procedure called posterior calibrated (PosCal) training that directly optimizes the task objective while minimizing the difference between the predicted and empirical posterior probabilities. We show that PosCal not only reduces calibration error but also improves task performance, since it penalizes degradation on either objective.
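The joint objective described above can be sketched as a task loss plus a weighted calibration penalty. This is a hedged illustration, not the paper's implementation: the function names are hypothetical, and the binned squared gap between mean predicted probability and empirical accuracy is an assumed stand-in for the predicted-vs-empirical posterior difference that PosCal minimizes.

```python
import numpy as np

def poscal_penalty(probs, labels, n_bins=10):
    # Assumed binned estimate of the gap between predicted and empirical
    # posteriors: within each confidence bin, compare the mean predicted
    # probability to the empirical positive rate, and sum the squared gaps.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    penalty = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (probs > lo) & (probs <= hi)
        if mask.any():
            penalty += (probs[mask].mean() - labels[mask].mean()) ** 2
    return penalty

def poscal_loss(task_loss, probs, labels, lam=1.0):
    # Joint objective: task loss plus a weighted calibration penalty,
    # so training trades off accuracy against calibration directly.
    return task_loss + lam * poscal_penalty(probs, labels)
```

A well-calibrated batch (predicted probabilities near the empirical positive rate) contributes almost no penalty, while a systematically over-confident batch is penalized, steering training toward calibrated posteriors.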