Search Results for author: Su Fong

Found 1 paper, 1 paper with code

SuperHF: Supervised Iterative Learning from Human Feedback

1 code implementation • 25 Oct 2023 • Gabriel Mukobi, Peter Chatain, Su Fong, Robert Windesheim, Gitta Kutyniok, Kush Bhatia, Silas Alberti

Here, we focus on two prevalent methods used to align these models, Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF).

Language Modelling

