17 papers with code • 3 benchmarks • 3 datasets
While pre-trained language models (PTLMs) have achieved noticeable success on many NLP tasks, they still struggle with tasks that require event temporal reasoning, which is essential for event-centric applications.
We introduce EfficientCL, a memory-efficient continual pretraining method that applies contrastive learning with novel data augmentation and curriculum learning.
We study the robustness of machine reading comprehension (MRC) models to entity renaming: do models make more incorrect predictions when the same questions are asked about an entity whose name has been changed?
In recent years, pretrained language models have revolutionized the NLP field, achieving state-of-the-art performance on various downstream tasks.
In this study, we propose a hierarchical label-wise attention Transformer model (HiLAT) for the explainable prediction of ICD codes from clinical documents.
We formalize and investigate the characteristics of the continual pre-training scenario in both language and vision environments, where a model is continually pre-trained on a stream of incoming data and only later fine-tuned to different downstream tasks.
We conducted experiments with our method on target datasets that have a large vocabulary gap from the source domain.
Continual pretraining is a popular way of building a domain-specific pretrained language model from a general-domain language model.
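To make the idea concrete, below is a minimal sketch of continual pretraining with the HuggingFace Transformers library: a general-domain masked language model is further trained with the same MLM objective on in-domain text before any downstream fine-tuning. The model name, corpus file, and hyperparameters are illustrative placeholders, not the setup of any specific paper listed above.

```python
# Minimal continual-pretraining sketch (placeholders: model name, corpus path, hyperparameters).
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

# General-domain starting checkpoint (placeholder).
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Hypothetical in-domain corpus: one raw-text passage per line.
raw = load_dataset("text", data_files={"train": "domain_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])

# The collator applies random masking, so training continues with the
# same masked-language-modeling objective used during original pretraining.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="continually-pretrained",
    per_device_train_batch_size=16,
    num_train_epochs=1,
    learning_rate=5e-5,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()
# The resulting domain-adapted checkpoint is then fine-tuned on downstream tasks as usual.
```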