Search Results for author: Guohua Tang

Found 4 papers, 2 papers with code

Aligning Language Models Using Follow-up Likelihood as Reward Signal

1 code implementation • 20 Sep 2024 • Chen Zhang, Dading Chong, Feng Jiang, Chengguang Tang, Anningzhe Gao, Guohua Tang, Haizhou Li

Building upon the FLR mechanism, we propose to automatically mine preference data from the online generations of a base policy model.

Language Modelling

TS-Align: A Teacher-Student Collaborative Framework for Scalable Iterative Finetuning of Large Language Models

no code implementations • 30 May 2024 • Chen Zhang, Chengguang Tang, Dading Chong, Ke Shi, Guohua Tang, Feng Jiang, Haizhou Li

This automatic mining process is efficiently accomplished through the collaboration between a large-scale teacher model and a small-scale student model.

Instruction Following
