Search Results for author: Mengdi Zhao

Found 2 papers, 1 paper with code

ReTok: Replacing Tokenizer to Enhance Representation Efficiency in Large Language Model

no code implementations · 6 Oct 2024 · Shuhao Gu, Mengdi Zhao, BoWen Zhang, Liangdong Wang, Jijie Li, Guang Liu

In this work, we propose a method to improve model representation and processing efficiency by replacing the tokenizers of LLMs (see the illustrative sketch below).

Language Modelling · Large Language Model
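
As a rough illustration of the general idea only (not the ReTok method itself), the sketch below swaps in a new tokenizer for a causal LLM using the Hugging Face `transformers` API and resizes the embedding table to the new vocabulary; the model and tokenizer names are placeholders.

```python
# Hypothetical sketch: replace an LLM's tokenizer and resize its embeddings.
# This is NOT the ReTok procedure; names below are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_name = "your-base-llm"          # placeholder model identifier
new_tokenizer_name = "your-new-tokenizer"  # placeholder tokenizer identifier

model = AutoModelForCausalLM.from_pretrained(base_model_name)
new_tokenizer = AutoTokenizer.from_pretrained(new_tokenizer_name)

# Grow or shrink the input (and tied output) embedding matrix to match the
# new tokenizer's vocabulary size.
model.resize_token_embeddings(len(new_tokenizer))

# Continued pretraining / fine-tuning with the new tokenizer would follow,
# so that the re-initialized embedding rows learn useful representations.
```
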

AquilaMoE: Efficient Training for MoE Models with Scale-Up and Scale-Out Strategies

1 code implementation · 13 Aug 2024 · Bo-Wen Zhang, Liangdong Wang, Ye Yuan, Jijie Li, Shuhao Gu, Mengdi Zhao, Xinya Wu, Guang Liu, ChengWei Wu, Hanyu Zhao, Li Du, Yiming Ju, Quanyue Ma, Yulong Ao, Yingli Zhao, Songhe Zhu, Zhou Cao, Dong Liang, Yonghua Lin, Ming Zhang, Shunfei Wang, Yanxin Zhou, Min Ye, Xuekai Chen, Xinyang Yu, Xiangjun Huang, Jian Yang

In this paper, we present AquilaMoE, a cutting-edge bilingual 8*16B Mixture of Experts (MoE) language model that has 8 experts with 16 billion parameters each and is developed using an innovative training methodology called EfficientScale (see the illustrative sketch below).

Language Modelling · Transfer Learning
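
For context, the sketch below is a generic top-1 Mixture-of-Experts feed-forward layer in PyTorch. It only illustrates the MoE routing idea; the layer sizes, routing scheme, and training procedure are assumptions and do not correspond to AquilaMoE's architecture or to EfficientScale.

```python
# Minimal, generic top-1 MoE feed-forward layer (illustrative only).
import torch
import torch.nn as nn

class SimpleMoE(nn.Module):
    def __init__(self, d_model: int, d_ff: int, num_experts: int = 8):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)  # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Each token is routed to its single best expert.
        gate_probs = self.router(x).softmax(dim=-1)        # (tokens, num_experts)
        weights, expert_idx = gate_probs.max(dim=-1)       # (tokens,), (tokens,)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = expert_idx == e
            if mask.any():
                out[mask] = weights[mask].unsqueeze(-1) * expert(x[mask])
        return out

# Toy usage: 10 token vectors of width 64 routed across 8 small experts.
moe = SimpleMoE(d_model=64, d_ff=256, num_experts=8)
print(moe(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```
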
