Search Results for author: Chunshan Li

Found 1 papers, 0 papers with code

Checkpoint Merging via Bayesian Optimization in LLM Pretraining

no code implementations28 Mar 2024 Deyuan Liu, Zecheng Wang, Bingning Wang, WeiPeng Chen, Chunshan Li, Zhiying Tu, Dianhui Chu, Bo Li, Dianbo Sui

The rapid proliferation of large language models (LLMs) such as GPT-4 and Gemini underscores the intense demand for resources during their training processes, posing significant challenges due to substantial computational and environmental costs.

Bayesian Optimization

Cannot find the paper you are looking for? You can Submit a new open access paper.