Linear Warmup With Cosine Annealing is a learning rate schedule where we increase the learning rate linearly for $n$ updates and then anneal according to a cosine schedule afterwards.
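The schedule can be sketched as a small function of the update step: a linear ramp to the base learning rate over the warmup phase, followed by cosine decay. A minimal sketch, assuming a warmup of $n$ updates, a fixed total number of updates, and a floor of `min_lr` (all names here are illustrative, not from a specific library):

```python
import math

def warmup_cosine_lr(step, warmup_steps, total_steps, base_lr, min_lr=0.0):
    """Linear warmup for `warmup_steps` updates, then cosine annealing
    from `base_lr` down to `min_lr` over the remaining updates."""
    if step < warmup_steps:
        # Linear ramp: learning rate grows from base_lr/warmup_steps to base_lr
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the annealing phase completed, in [0, 1]
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Cosine curve: starts at base_lr (cos 0 = 1), ends at min_lr (cos pi = -1)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

For example, with `warmup_steps=10`, `total_steps=100`, and `base_lr=1.0`, the learning rate reaches its peak of 1.0 at step 10 and decays smoothly to `min_lr` by step 100. In practice, frameworks such as PyTorch expose similar behavior via `torch.optim.lr_scheduler.LambdaLR` or by chaining a warmup scheduler with `CosineAnnealingLR`.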
| Task | Papers | Share |
|---|---|---|
| Language Modelling | 67 | 9.25% |
| Large Language Model | 42 | 5.80% |
| Question Answering | 37 | 5.11% |
| Retrieval | 29 | 4.01% |
| In-Context Learning | 25 | 3.45% |
| Text Generation | 23 | 3.18% |
| Sentence | 20 | 2.76% |
| Prompt Engineering | 19 | 2.62% |
| Code Generation | 18 | 2.49% |