no code implementations • 11 Sep 2024 • Hanyu Zhao, Li Du, Yiming Ju, ChengWei Wu, Tengfei Pan
With the availability of various instruction datasets, a pivotal challenge is how to effectively select and integrate these instructions to fine-tune large language models (LLMs).
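As a hedged illustration of the selection problem this abstract describes, the sketch below greedily filters an instruction pool for diversity before fine-tuning; the hashed bag-of-words embedding, similarity threshold, and all names are illustrative assumptions, not the paper's method.

```python
# Minimal sketch of one common instruction-selection strategy: keep an
# instruction only if it is not too similar (cosine) to anything already
# selected. The embedding and threshold are toy stand-ins, not the paper's.
import hashlib
import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    """Cheap stand-in embedding: hash each token into a fixed-size vector."""
    v = np.zeros(dim)
    for tok in text.lower().split():
        idx = int(hashlib.md5(tok.encode()).hexdigest(), 16) % dim
        v[idx] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

def select_instructions(pool: list[str], k: int, max_sim: float = 0.8) -> list[str]:
    """Greedy diversity filter over a candidate instruction pool."""
    chosen, vecs = [], []
    for text in pool:
        v = embed(text)
        if all(float(v @ u) < max_sim for u in vecs):
            chosen.append(text)
            vecs.append(v)
        if len(chosen) == k:
            break
    return chosen

pool = ["Summarize this article.", "Summarise the article.", "Translate to French."]
print(select_instructions(pool, k=2))
```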
1 code implementation • 13 Aug 2024 • Bo-Wen Zhang, Liangdong Wang, Ye Yuan, Jijie Li, Shuhao Gu, Mengdi Zhao, Xinya Wu, Guang Liu, ChengWei Wu, Hanyu Zhao, Li Du, Yiming Ju, Quanyue Ma, Yulong Ao, Yingli Zhao, Songhe Zhu, Zhou Cao, Dong Liang, Yonghua Lin, Ming Zhang, Shunfei Wang, Yanxin Zhou, Min Ye, Xuekai Chen, Xinyang Yu, Xiangjun Huang, Jian Yang
In this paper, we present AquilaMoE, a cutting-edge bilingual 8×16B Mixture of Experts (MoE) language model with 8 experts of 16 billion parameters each, developed using an innovative training methodology called EfficientScale.
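For orientation, here is a minimal sketch of the token-wise top-k expert routing at the heart of any MoE layer; the toy sizes, module names, and two-expert routing are assumptions for illustration and do not reproduce AquilaMoE or EfficientScale.

```python
# Toy MoE layer: a gating network routes each token to its top-k experts
# and combines expert outputs weighted by renormalized gate scores.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoE(nn.Module):
    def __init__(self, d_model: int, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts)  # router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model)
        scores = F.softmax(self.gate(x), dim=-1)        # (T, E)
        weights, idx = scores.topk(self.top_k, dim=-1)  # (T, k)
        weights = weights / weights.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in range(len(self.experts)):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k, None] * self.experts[e](x[mask])
        return out

moe = ToyMoE(d_model=32)
print(moe(torch.randn(5, 32)).shape)  # torch.Size([5, 32])
```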
1 code implementation • 8 Mar 2024 • Hongda Sun, Yuxuan Liu, ChengWei Wu, Haiyu Yan, Cheng Tai, Xin Gao, Shuo Shang, Rui Yan
Open-domain question answering (ODQA) has emerged as a pivotal research focus in information systems.
no code implementations • 16 Apr 2023 • Wendong Zhang, Qingjie Chai, Quanqi Zhang, ChengWei Wu
Therefore, this paper proposes an Obstacle-Transformer to predict trajectories in constant inference time.
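The key property claimed is that prediction cost does not grow with the horizon. Below is a hedged sketch of that idea, a single non-autoregressive Transformer pass that emits the whole future trajectory at once; the architecture and dimensions are assumptions, not the paper's Obstacle-Transformer.

```python
# One forward pass maps a fixed-length observed trajectory to all future
# positions, so inference cost is constant in the prediction horizon.
import torch
import torch.nn as nn

class TrajectoryTransformer(nn.Module):
    def __init__(self, obs_len=8, pred_len=12, d_model=64):
        super().__init__()
        self.embed = nn.Linear(2, d_model)  # (x, y) -> d_model
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(obs_len * d_model, pred_len * 2)
        self.pred_len = pred_len

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # obs: (batch, obs_len, 2) observed positions
        h = self.encoder(self.embed(obs))   # (batch, obs_len, d_model)
        flat = h.flatten(start_dim=1)
        return self.head(flat).view(-1, self.pred_len, 2)

model = TrajectoryTransformer()
print(model(torch.randn(4, 8, 2)).shape)  # torch.Size([4, 12, 2])
```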
no code implementations • 26 Oct 2020 • Liang Hu, ChengWei Wu, Wei Pan
An actor-critic reinforcement learning algorithm is proposed to learn a state estimator, which is approximated by a deep neural network.
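A minimal sketch of the general pattern follows: a deterministic actor produces a state estimate from a noisy measurement, and a critic scores (measurement, estimate) pairs, with reward defined as negative estimation error. The dynamics, reward, and network sizes here are illustrative assumptions, not the paper's algorithm.

```python
# Generic actor-critic loop where the "action" is a state estimate.
import torch
import torch.nn as nn

actor = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))   # estimator
critic = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))  # value of (y, est)
opt_a = torch.optim.Adam(actor.parameters(), lr=1e-3)
opt_c = torch.optim.Adam(critic.parameters(), lr=1e-3)

for step in range(1000):
    x = torch.randn(64, 1)             # true (hidden) state
    y = x + 0.1 * torch.randn(64, 1)   # noisy measurement
    est = actor(y)                     # state estimate = "action"
    reward = -(est - x).pow(2)         # negative squared estimation error

    # Critic regression: predict the reward of (measurement, estimate) pairs.
    value = critic(torch.cat([y, est.detach()], dim=1))
    loss_c = (value - reward.detach()).pow(2).mean()
    opt_c.zero_grad()
    loss_c.backward()
    opt_c.step()

    # Actor ascent: prefer estimates the critic scores highly.
    loss_a = -critic(torch.cat([y, actor(y)], dim=1)).mean()
    opt_a.zero_grad()
    loss_a.backward()
    opt_a.step()
```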