Search Results for author: Yaroslav Aksenov

Found 2 papers, 1 paper with code

Learn Your Reference Model for Real Good Alignment

no code implementations · 15 Apr 2024 · Alexey Gorbatovski, Boris Shaposhnikov, Alexey Malakhov, Nikita Surnachev, Yaroslav Aksenov, Ian Maksimov, Nikita Balagansky, Daniil Gavrilov

For instance, in Reinforcement Learning From Human Feedback (RLHF), a fundamental Language Model alignment technique, the Kullback-Leibler divergence between the trainable policy and the SFT policy is minimized alongside reward maximization.

Language Modelling
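The excerpt above refers to the standard KL-regularized RLHF objective. As a point of reference, a common way to write it (notation assumed here, not taken from the paper) is:

```latex
\max_{\pi_\theta} \;
\mathbb{E}_{x \sim \mathcal{D},\; y \sim \pi_\theta(\cdot \mid x)}
\left[ r(x, y) \right]
\;-\; \beta \,
\mathbb{D}_{\mathrm{KL}}\!\left[ \pi_\theta(\cdot \mid x) \,\Vert\, \pi_{\mathrm{SFT}}(\cdot \mid x) \right]
```

where $\pi_\theta$ is the trainable policy, $\pi_{\mathrm{SFT}}$ is the reference (SFT) policy, $r$ is the reward model, and $\beta$ controls the strength of the KL penalty.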

Linear Transformers with Learnable Kernel Functions are Better In-Context Models

2 code implementations · 16 Feb 2024 · Yaroslav Aksenov, Nikita Balagansky, Sofia Maria Lo Cicero Vaina, Boris Shaposhnikov, Alexey Gorbatovski, Daniil Gavrilov

Advancing the frontier of subquadratic architectures for Language Models (LMs) is crucial in the rapidly evolving field of natural language processing.

In-Context Learning · Language Modelling
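For context on what "subquadratic" means here, below is a minimal sketch of linear attention with a learnable kernel feature map, the general mechanism this line of work builds on. The module and function names are hypothetical, the feature map shown (ELU + 1) is a common placeholder rather than the paper's actual parameterization, and causal masking is omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnableKernel(nn.Module):
    """Hypothetical learnable feature map phi(x); the paper's exact
    parameterization differs -- this is only a placeholder."""
    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # ELU(Wx) + 1 keeps features positive so the normalizer stays nonzero.
        return F.elu(self.proj(x)) + 1.0

def linear_attention(q, k, v, phi):
    """Non-causal linear attention: applying phi to q and k and reordering
    the matmuls avoids forming the n x n attention matrix, giving
    O(n * d^2) cost instead of O(n^2 * d)."""
    q, k = phi(q), phi(k)                       # (batch, n, d)
    kv = torch.einsum("bnd,bne->bde", k, v)     # aggregate over sequence first
    z = k.sum(dim=1)                            # (batch, d) normalizer terms
    out = torch.einsum("bnd,bde->bne", q, kv)   # (batch, n, d)
    denom = torch.einsum("bnd,bd->bn", q, z).clamp(min=1e-6)
    return out / denom.unsqueeze(-1)

# Toy usage with assumed shapes.
phi = LearnableKernel(dim=64)
q, k, v = (torch.randn(2, 128, 64) for _ in range(3))
y = linear_attention(q, k, v, phi)              # (2, 128, 64)
```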
