Continual Pretraining
7 papers with code • 1 benchmark • 1 dataset
Most implemented papers
Continual Training of Language Models for Few-Shot Learning
Recent work applying large language models (LMs) has achieved impressive performance in many NLP applications.
ECONET: Effective Continual Pretraining of Language Models for Event Temporal Reasoning
While pre-trained language models (PTLMs) have achieved noticeable success on many NLP tasks, they still struggle with tasks that require event temporal reasoning, which is essential for event-centric applications.
Efficient Contrastive Learning via Novel Data Augmentation and Curriculum Learning
We introduce EfficientCL, a memory-efficient continual pretraining method that applies contrastive learning with novel data augmentation and curriculum learning.
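As a rough illustration of the recipe named above (not the authors' released code), the sketch below pairs a SimCSE-style contrastive objective with a span-cutoff augmentation whose strength follows a simple linear curriculum; the toy encoder, batch sizes, and cutoff schedule are all assumptions.

```python
# Hypothetical sketch of contrastive continual pretraining with a
# cutoff augmentation whose strength is ramped up by a curriculum.
# The encoder, schedule, and hyperparameters are illustrative only.
import torch
import torch.nn.functional as F

def cutoff(token_embeds: torch.Tensor, ratio: float) -> torch.Tensor:
    """Zero out a random contiguous span of tokens (span cutoff)."""
    b, t, _ = token_embeds.shape
    out = token_embeds.clone()
    span = max(1, int(t * ratio))
    for i in range(b):
        start = torch.randint(0, t - span + 1, (1,)).item()
        out[i, start:start + span] = 0.0
    return out

def info_nce(z1: torch.Tensor, z2: torch.Tensor, temp: float = 0.05) -> torch.Tensor:
    """Contrastive loss: matching views are positives, rest of the batch negatives."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temp
    labels = torch.arange(z1.size(0))
    return F.cross_entropy(logits, labels)

# Toy encoder standing in for a pretrained LM: project then mean-pool tokens.
encoder = torch.nn.Linear(64, 64)
opt = torch.optim.AdamW(encoder.parameters(), lr=1e-3)

total_steps = 100
for step in range(total_steps):
    tokens = torch.randn(8, 32, 64)           # placeholder token embeddings
    ratio = 0.05 + 0.25 * step / total_steps  # curriculum: easy -> hard cutoff
    v1 = encoder(cutoff(tokens, ratio)).mean(dim=1)
    v2 = encoder(cutoff(tokens, ratio)).mean(dim=1)
    loss = info_nce(v1, v2)
    opt.zero_grad(); loss.backward(); opt.step()
```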
On the Robustness of Reading Comprehension Models to Entity Renaming
We study the robustness of machine reading comprehension (MRC) models to entity renaming: do models make more incorrect predictions when the same questions are asked about an entity whose name has been changed?
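A hypothetical illustration of the perturbation studied here: rename an entity consistently across the context, question, and gold answer, then check whether the model's prediction survives. The example data and the rename_entity helper below are made up for illustration, not taken from the paper's code.

```python
# Hypothetical sketch of the entity-renaming perturbation: replace an
# entity's name consistently in the context, question, and gold answer,
# then compare model predictions on the original vs. renamed example.
def rename_entity(example: dict, old: str, new: str) -> dict:
    return {
        "context": example["context"].replace(old, new),
        "question": example["question"].replace(old, new),
        "answer": example["answer"].replace(old, new),
    }

example = {
    "context": "Marie Curie won the Nobel Prize in Physics in 1903.",
    "question": "When did Marie Curie win the Nobel Prize in Physics?",
    "answer": "1903",
}
renamed = rename_entity(example, "Marie Curie", "Ada Lovelace")
print(renamed["question"])
# -> "When did Ada Lovelace win the Nobel Prize in Physics?"
# A robust MRC model should still answer "1903" on the renamed example.
```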
Fortunately, Discourse Markers Can Enhance Language Models for Sentiment Analysis
In recent years, pretrained language models have revolutionized NLP, achieving state-of-the-art performance on various downstream tasks.
Hierarchical Label-wise Attention Transformer Model for Explainable ICD Coding
In this study, we propose a hierarchical label-wise attention Transformer model (HiLAT) for the explainable prediction of ICD codes from clinical documents.
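The core of label-wise attention, as used in models of this kind, is that each ICD code gets its own attention distribution over token representations, yielding a label-specific document vector that is scored independently. The sketch below is a generic toy version under assumed dimensions, not the HiLAT architecture itself.

```python
# Minimal sketch of a label-wise attention head: each label attends over
# the token representations and gets its own document vector and score.
# Dimensions and the encoder are placeholders, not the HiLAT implementation.
import torch
import torch.nn as nn

class LabelWiseAttention(nn.Module):
    def __init__(self, hidden: int, num_labels: int):
        super().__init__()
        self.label_queries = nn.Parameter(torch.randn(num_labels, hidden))
        self.classifier = nn.Linear(hidden, 1)

    def forward(self, token_states: torch.Tensor) -> torch.Tensor:
        # token_states: (batch, seq_len, hidden) from any document encoder
        scores = torch.einsum("lh,bth->blt", self.label_queries, token_states)
        attn = scores.softmax(dim=-1)                   # per-label attention
        label_docs = torch.einsum("blt,bth->blh", attn, token_states)
        return self.classifier(label_docs).squeeze(-1)  # (batch, num_labels)

logits = LabelWiseAttention(hidden=128, num_labels=50)(torch.randn(2, 256, 128))
probs = logits.sigmoid()  # independent per-code probabilities
```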
Continual Pre-Training Mitigates Forgetting in Language and Vision
We formalize and investigate the characteristics of the continual pre-training scenario in both language and vision environments, where a model is continually pre-trained on a stream of incoming data and only later fine-tuned to different downstream tasks.
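The scenario can be summarized with a toy loop like the one below: a single backbone is pretrained sequentially on a stream of incoming data chunks and only afterwards fine-tuned with a task head. Everything here (the tiny model, synthetic data, and denoising objective) is a stand-in for an actual LM or vision pretraining pipeline.

```python
# Toy sketch of the continual pre-training scenario: pretrain one backbone
# sequentially on a stream of incoming data, then fine-tune it on a
# downstream task. The model and synthetic data are placeholders.
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 32))

# --- Continual pre-training on a stream of data chunks (self-supervised) ---
opt = torch.optim.AdamW(backbone.parameters(), lr=1e-3)
for chunk_id in range(5):                       # stream of incoming corpora
    data = torch.randn(256, 32)                 # placeholder data for this chunk
    for _ in range(20):
        noisy = data + 0.1 * torch.randn_like(data)
        loss = nn.functional.mse_loss(backbone(noisy), data)  # denoising proxy
        opt.zero_grad(); loss.backward(); opt.step()

# --- Only later: fine-tune on a downstream task with a fresh head ---
head = nn.Linear(32, 3)
model = nn.Sequential(backbone, head)
ft_opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
x, y = torch.randn(128, 32), torch.randint(0, 3, (128,))
for _ in range(50):
    loss = nn.functional.cross_entropy(model(x), y)
    ft_opt.zero_grad(); loss.backward(); ft_opt.step()
```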