1 code implementation • 20 Sep 2024 • Chen Zhang, Dading Chong, Feng Jiang, Chengguang Tang, Anningzhe Gao, Guohua Tang, Haizhou Li
Building upon the FLR mechanism, we propose to automatically mine preference data from the online generations of a base policy model.
no code implementations • 9 Jul 2024 • Nan He, Weichen Xiong, Hanwen Liu, Yi Liao, Lei Ding, Kai Zhang, Guohua Tang, Xiao Han, Wei Yang
The effectiveness of large language models (LLMs) is often hindered by duplicated data in their extensive pre-training datasets.
no code implementations • 30 May 2024 • Chen Zhang, Chengguang Tang, Dading Chong, Ke Shi, Guohua Tang, Feng Jiang, Haizhou Li
This automatic mining process is efficiently accomplished through the collaboration between a large-scale teacher model and a small-scale student model.
1 code implementation • 13 Oct 2023 • Chen Zhang, Luis Fernando D'Haro, Chengguang Tang, Ke Shi, Guohua Tang, Haizhou Li
The English dialogue data are extended to nine other languages using commercial machine translation systems.