Search Results for author: Ian Maksimov

Found 1 paper, 0 papers with code

Learn Your Reference Model for Real Good Alignment

no code implementations • 15 Apr 2024 • Alexey Gorbatovski, Boris Shaposhnikov, Alexey Malakhov, Nikita Surnachev, Yaroslav Aksenov, Ian Maksimov, Nikita Balagansky, Daniil Gavrilov

In the fundamental Reinforcement Learning From Human Feedback (RLHF) technique of language model alignment, the Kullback-Leibler divergence between the trainable policy and the SFT policy is minimized in addition to maximizing the reward.
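
The objective described in this excerpt can be illustrated with a minimal sketch. It assumes a toy setting in which a policy is a discrete distribution over a handful of candidate responses. The weight `beta`, the direction KL(policy || SFT policy), and the function names are assumptions made for illustration and are not taken from the paper.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as probability lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def regularized_objective(policy, sft_policy, rewards, beta):
    """Expected reward under `policy` minus beta * KL(policy || sft_policy).

    `beta` is a hypothetical trade-off coefficient: larger values keep the
    trainable policy closer to the SFT policy.
    """
    expected_reward = sum(pi * r for pi, r in zip(policy, rewards))
    return expected_reward - beta * kl_divergence(policy, sft_policy)


# Toy example: two candidate responses, the first one is rewarded.
sft = [0.5, 0.5]
rewards = [1.0, 0.0]
unchanged = regularized_objective(sft, sft, rewards, beta=0.1)
shifted = regularized_objective([0.9, 0.1], sft, rewards, beta=0.1)
```

When the policy equals the SFT policy the penalty term is zero, so the objective reduces to the expected reward. Moving probability toward the rewarded response raises the expected reward but also incurs a KL penalty scaled by `beta`.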

Language Modelling
