Linear Warmup With Cosine Annealing is a learning rate schedule that increases the learning rate linearly for the first $n$ updates and then anneals it according to a cosine schedule.
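A minimal sketch of such a schedule in plain Python may make the two phases concrete. The function name `warmup_cosine_lr` and the parameters `base_lr` (peak learning rate), `total_steps` (length of the whole run), and `min_lr` (floor reached at the end of annealing) are assumptions made for illustration; the description above only fixes the warmup length $n$, called `warmup_steps` here.

```python
import math


def warmup_cosine_lr(step, base_lr, warmup_steps, total_steps, min_lr=0.0):
    """Return the learning rate for a 0-indexed update `step`.

    Hypothetical helper: the parameter names and the `min_lr` floor are
    illustrative assumptions, not part of any specific library.
    """
    if step < warmup_steps:
        # Linear warmup: ramp from 0 at step 0 towards base_lr at step n.
        return base_lr * step / warmup_steps
    # Cosine annealing: progress goes from 0 to 1 over the remaining steps.
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Print the schedule at a few steps for n = 10 warmup updates out of 110.
    for s in (0, 5, 10, 60, 110):
        print(s, warmup_cosine_lr(s, base_lr=1.0, warmup_steps=10, total_steps=110))
```

In this sketch the rate peaks at `base_lr` exactly when warmup ends, is halfway between `base_lr` and `min_lr` at the midpoint of the annealing phase, and stays at `min_lr` for any step at or beyond `total_steps`.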
Task | Papers | Share |
---|---|---|
Language Modelling | 105 | 14.62% |
Large Language Model | 52 | 7.24% |
Question Answering | 40 | 5.57% |
Retrieval | 31 | 4.32% |
Prompt Engineering | 29 | 4.04% |
Text Generation | 22 | 3.06% |
Decision Making | 17 | 2.37% |
Benchmarking | 16 | 2.23% |
Code Generation | 16 | 2.23% |