Continual Pretraining

22 papers with code • 3 benchmarks • 3 datasets


Most implemented papers

Retrieval Head Mechanistically Explains Long-Context Factuality

nightdessert/retrieval_head 24 Apr 2024

Despite recent progress in long-context language models, it remains unclear how transformer-based models manage to retrieve relevant information from arbitrary locations within a long context.

MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning

kongds/mora 20 May 2024

Low-rank adaptation is a popular parameter-efficient fine-tuning method for large language models.
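To make the contrast concrete, here is a minimal sketch of the low-rank adaptation idea that MoRA builds on: a frozen weight matrix W is augmented with a trainable low-rank update B @ A. All names, dimensions, and initializations below are illustrative assumptions, not code from the MoRA paper.

```python
import numpy as np

# Illustrative dimensions (assumptions): frozen weight is d x k, adapter rank r << min(d, k)
d, k, r = 8, 8, 2

rng = np.random.default_rng(0)
W = rng.standard_normal((d, k))          # frozen pretrained weight (not updated)
A = rng.standard_normal((r, k)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                     # trainable up-projection, zero-initialized

def adapted_forward(x):
    # Effective weight is W + B @ A; only A and B receive gradient updates.
    return x @ (W + B @ A).T

x = rng.standard_normal((1, k))
# Because B starts at zero, the adapted model initially matches the frozen model.
assert np.allclose(adapted_forward(x), x @ W.T)
print("trainable params:", A.size + B.size, "vs full fine-tuning:", W.size)
```

The parameter saving comes from the rank bottleneck: the update has r*(d+k) trainable entries instead of d*k, which is the low-rank constraint that MoRA's high-rank updating scheme is designed to relax.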