Search Results for author: Chaoyu Gong

Found 2 papers, 1 paper with code

Sparse MeZO: Less Parameters for Better Performance in Zeroth-Order LLM Fine-Tuning

no code implementations • 24 Feb 2024 • Yong Liu, Zirui Zhu, Chaoyu Gong, Minhao Cheng, Cho-Jui Hsieh, Yang You

While fine-tuning large language models (LLMs) for specific tasks often yields impressive results, it comes at the cost of memory inefficiency due to back-propagation in gradient-based training.

Task: RTE
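For context, the memory saving described above comes from estimating gradients with forward passes only, so no back-propagation graph or activation cache is kept. Below is a minimal sketch of a MeZO-style zeroth-order (SPSA) update under that idea; the function and parameter names are illustrative, not the authors' API, and Sparse MeZO additionally restricts the perturbation to a subset of parameters, which is not shown here.

```python
import numpy as np

def zo_sgd_step(params, loss_fn, lr=1e-3, eps=1e-3, seed=0):
    """One zeroth-order SGD step (MeZO-style SPSA estimator, sketch only).

    The gradient is estimated from two forward passes along a shared
    random direction z, so no back-propagation is required.
    """
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(params.shape)            # random direction
    loss_plus = loss_fn(params + eps * z)            # forward pass 1
    loss_minus = loss_fn(params - eps * z)           # forward pass 2
    grad_proj = (loss_plus - loss_minus) / (2 * eps) # projected gradient
    return params - lr * grad_proj * z               # update along z

# toy usage: minimize ||theta - 1||^2 with forward passes only
theta = np.zeros(4)
loss = lambda p: float(np.sum((p - 1.0) ** 2))
for step in range(1000):
    theta = zo_sgd_step(theta, loss, lr=1e-2, eps=1e-3, seed=step)
print(theta)  # approaches [1, 1, 1, 1]
```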

An Efficient 2D Method for Training Super-Large Deep Learning Models

1 code implementation • 12 Apr 2021 • Qifan Xu, Shenggui Li, Chaoyu Gong, Yang You

Due to memory constraints, model parallelism must be used to host large models that would otherwise not fit into the memory of a single device.
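To make the 2D idea concrete: the method in this paper is, to our understanding, a SUMMA-style scheme that shards each weight matrix over a p x p grid of devices so every device stores only one block. The sketch below simulates the blocked multiplication in a single process with plain loops standing in for the row/column broadcasts; it is an illustration of the communication pattern, not the paper's implementation.

```python
import numpy as np

def summa_matmul(A_blocks, B_blocks, p):
    """Simulate a SUMMA-style 2D-parallel matmul on a p x p device grid.

    Device (i, j) holds A_blocks[i][j] and B_blocks[k][j] and accumulates
    C_blocks[i][j]; a real implementation replaces the inner loops with
    row broadcasts of A and column broadcasts of B at each step k.
    """
    n = A_blocks[0][0].shape[0]
    C_blocks = [[np.zeros((n, n)) for _ in range(p)] for _ in range(p)]
    for k in range(p):                      # k-th broadcast step
        for i in range(p):
            for j in range(p):
                # device (i, j) receives A[i][k] and B[k][j], multiplies
                # them locally, and accumulates into its block of C
                C_blocks[i][j] += A_blocks[i][k] @ B_blocks[k][j]
    return C_blocks

# toy check against a dense matmul on a 2 x 2 grid of 3 x 3 blocks
p, nb = 2, 3
A = np.random.randn(p * nb, p * nb)
B = np.random.randn(p * nb, p * nb)
split = lambda M: [[M[i*nb:(i+1)*nb, j*nb:(j+1)*nb] for j in range(p)]
                   for i in range(p)]
C = summa_matmul(split(A), split(B), p)
assert np.allclose(np.block(C), A @ B)
```

Because every device touches only O(1/p^2) of each matrix, peak per-device memory falls quadratically with the grid dimension, which is what lets such schemes host models that exceed a single device's memory.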
