Continual Pretraining
22 papers with code • 3 benchmarks • 3 datasets
Continual pretraining adapts an already-pretrained language model by continuing its original pretraining objective on new corpora (for example, a specialized domain or more recent data), rather than training from scratch or only fine-tuning on labeled tasks.
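As a quick illustration, a minimal continual-pretraining run simply resumes the causal-LM objective on new text. The sketch below uses Hugging Face `transformers`; the `gpt2` checkpoint, the WikiText-2 slice, and the hyperparameters are all placeholders, and real runs use far larger domain corpora, often mixed with replay data from the original pretraining set to limit forgetting.

```python
# Minimal continual-pretraining sketch. Assumptions: GPT-2 as the base
# checkpoint and a 1% slice of WikiText-2 as a stand-in domain corpus.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "gpt2"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Tokenize a small slice of the new corpus; drop empty lines.
raw = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])
tokenized = tokenized.filter(lambda ex: len(ex["input_ids"]) > 0)

# Continue the original causal-LM objective on the new data.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="cpt-out",
                           per_device_train_batch_size=2,
                           num_train_epochs=1,
                           learning_rate=1e-5),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

A low learning rate (relative to pretraining) is the usual choice here, since the goal is to absorb the new distribution without overwriting the base model's general capabilities.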
Most implemented papers
Retrieval Head Mechanistically Explains Long-Context Factuality
Despite recent progress in long-context language models, it remains unclear how transformer-based models retrieve relevant information from arbitrary locations within a long context.
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning
Low-rank adaptation is a popular parameter-efficient fine-tuning method for large language models.
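MoRA's specific high-rank update is not reproduced here, but the low-rank adaptation it builds on can be sketched in a few lines of PyTorch: a frozen pretrained weight plus a trainable rank-r product B·A. The class name `LoRALinear` and the `r`/`alpha` values below are purely illustrative.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Illustrative LoRA layer: frozen base weight plus trainable low-rank delta B @ A."""
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        for p in self.base.parameters():       # pretrained weights stay frozen
            p.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, r))  # zero init => no-op at start
        self.scaling = alpha / r

    def forward(self, x):
        delta = (x @ self.A.T) @ self.B.T      # rank-r update, applied factor by factor
        return self.base(x) + self.scaling * delta

layer = LoRALinear(768, 768)
x = torch.randn(4, 768)
print(layer(x).shape)  # torch.Size([4, 768])
```

Zero-initializing B means the adapted layer starts out identical to the frozen base layer, so training perturbs the model gradually; only the r·(in+out) adapter parameters receive gradients, which is what makes the method parameter-efficient.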