Continual Pretraining

22 papers with code • 3 benchmarks • 3 datasets

Most implemented papers

Retrieval Head Mechanistically Explains Long-Context Factuality

nightdessert/retrieval_head • 24 Apr 2024

Despite the recent progress in long-context language models, it remains elusive how transformer-based models exhibit the capability to retrieve relevant information from arbitrary locations within the long context.

MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

kongds/mora • 20 May 2024

Low-rank adaptation is a popular parameter-efficient fine-tuning method for large language models.