no code implementations • 3 Apr 2024 • Longfei Yun, Yonghao Zhuang, Yao Fu, Eric P Xing, Hao Zhang
As with dense models, training MoEs requires answering the same question: given a training budget, what is the optimal allocation between model size and number of tokens?
no code implementations • 11 Oct 2021 • Shentong Mo, Xi Fu, Chenyang Hong, Yizhen Chen, Yuxuan Zheng, Xiangru Tang, Zhiqiang Shen, Eric P Xing, Yanyan Lan
The core problem is to model how regulatory elements interact with each other and how those interactions vary across different cell types.