Linear Warmup With Cosine Annealing is a learning rate schedule that increases the learning rate linearly for the first $n$ updates and then anneals it according to a cosine schedule.
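The schedule can be sketched in a few lines of plain Python. This is a minimal illustration, not a library implementation; the function name and parameters (`warmup_steps`, `total_steps`, `base_lr`, `min_lr`) are chosen here for clarity.

```python
import math

def lr_at_step(step, warmup_steps, total_steps, base_lr, min_lr=0.0):
    """Linear warmup for `warmup_steps` updates, then cosine annealing.

    Illustrative sketch; names and signature are hypothetical, not from
    any particular framework.
    """
    if step < warmup_steps:
        # Linear warmup: ramp from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Cosine annealing: decay from base_lr down to min_lr over the
    # remaining updates.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

For example, with `warmup_steps=10`, `total_steps=100`, and `base_lr=1.0`, the rate rises linearly to 1.0 by update 10 and then decays smoothly toward 0 by update 100.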
Task | Papers | Share |
---|---|---|
Language Modelling | 70 | 9.15% |
Large Language Model | 50 | 6.54% |
Question Answering | 33 | 4.31% |
Retrieval | 27 | 3.53% |
Text Generation | 27 | 3.53% |
RAG | 26 | 3.40% |
Prompt Engineering | 20 | 2.61% |
Code Generation | 17 | 2.22% |
In-Context Learning | 15 | 1.96% |