no code implementations • 16 Apr 2024 • Haozheng Fan, Hao Zhou, Guangtai Huang, Parameswaran Raman, Xinwei Fu, Gaurav Gupta, Dhananjay Ram, Yida Wang, Jun Huan
In this paper, we showcase HLAT: a 7 billion parameter decoder-only LLM pre-trained using trn1 instances over 1.8 trillion tokens.
no code implementations • 30 Nov 2023 • Dan Song, Xinwei Fu, Weizhi Nie, Wenhui Li, Lanjun Wang, You Yang, AnAn Liu
Consequently, this paper aims to improve confidence through view selection and hierarchical prompts.