Search Results for author: Daniil Gavrilov

Found 12 papers, 4 papers with code

PALBERT: Teaching ALBERT to Ponder

no code implementations • RepL4NLP (ACL) 2022 • Daniil Gavrilov, Nikita Balagansky

Currently, pre-trained models can be considered the default choice for a wide range of NLP tasks.

Learn Your Reference Model for Real Good Alignment

no code implementations • 15 Apr 2024 • Alexey Gorbatovski, Boris Shaposhnikov, Alexey Malakhov, Nikita Surnachev, Yaroslav Aksenov, Ian Maksimov, Nikita Balagansky, Daniil Gavrilov

For instance, in the fundamental Reinforcement Learning From Human Feedback (RLHF) technique of Language Model alignment, in addition to reward maximization, the Kullback-Leibler divergence between the trainable policy and the SFT policy is minimized.

Language Modelling
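
For orientation, the objective described in this snippet is the standard KL-regularized RLHF formulation (written here from the general literature, not quoted from the paper):

```latex
\max_{\theta}\;
\mathbb{E}_{x \sim \mathcal{D},\; y \sim \pi_\theta(\cdot \mid x)}
\bigl[ r(x, y) \bigr]
\;-\;
\beta\, \mathrm{D}_{\mathrm{KL}}\!\bigl( \pi_\theta(\cdot \mid x) \,\big\|\, \pi_{\mathrm{SFT}}(\cdot \mid x) \bigr)
```

The coefficient β controls how far the trainable policy may drift from the SFT reference; the paper studies how the choice of this reference model affects alignment quality.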

Linear Transformers with Learnable Kernel Functions are Better In-Context Models

2 code implementations • 16 Feb 2024 • Yaroslav Aksenov, Nikita Balagansky, Sofia Maria Lo Cicero Vaina, Boris Shaposhnikov, Alexey Gorbatovski, Daniil Gavrilov

Advancing the frontier of subquadratic architectures for Language Models (LMs) is crucial in the rapidly evolving field of natural language processing.

In-Context Learning • Language Modelling
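
As context for the entry above, here is a minimal sketch of causal linear attention with a pluggable feature map. The quadratic feature map shown is an illustrative stand-in for a learnable kernel, not the paper's exact parameterization:

```python
import torch

def phi(x, gamma=1.0, beta=0.0):
    # Illustrative learnable kernel: an affine transform followed by squaring.
    # The paper's actual kernel parameterization may differ.
    return (gamma * x + beta) ** 2

def causal_linear_attention(q, k, v, eps=1e-6):
    # q, k: (batch, seq, d); v: (batch, seq, d_v). Runs in O(seq * d * d_v)
    # rather than the O(seq^2) cost of softmax attention.
    qf, kf = phi(q), phi(k)
    kv = torch.einsum('bsd,bse->bsde', kf, v).cumsum(dim=1)  # running sum of k ⊗ v
    z = kf.cumsum(dim=1)                                     # running normalizer
    num = torch.einsum('bsd,bsde->bse', qf, kv)
    den = torch.einsum('bsd,bsd->bs', qf, z).unsqueeze(-1)
    return num / (den + eps)
```

Replacing the fixed feature map with a learnable one is what makes the kernel itself part of the model, which is the design question this paper investigates.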

Diffusion Language Models Generation Can Be Halted Early

no code implementations • 18 May 2023 • Sofia Maria Lo Cicero Vaina, Nikita Balagansky, Daniil Gavrilov

We evaluate our methods on Plaid, SSD, and CDCD DLMs and create a cohesive perspective on their generation workflows.

Language Modelling • Text Generation
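
A schematic of the early-halting idea, with a hypothetical model API (denoise_step and decode are placeholder names, not the actual interfaces of Plaid, SSD, or CDCD):

```python
def generate_with_early_halt(model, x, num_steps, patience=3):
    # Stop the reverse-diffusion loop once the decoded text stops changing
    # for `patience` consecutive steps, saving the remaining denoising steps.
    prev, stable = None, 0
    for t in reversed(range(num_steps)):
        x = model.denoise_step(x, t)   # one reverse-diffusion step
        decoded = model.decode(x)      # current best-guess text
        stable = stable + 1 if decoded == prev else 0
        if stable >= patience:
            break
        prev = decoded
    return decoded
```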

Ahead-of-Time P-Tuning

no code implementations • 18 May 2023 • Daniil Gavrilov, Nikita Balagansky

In this paper, we propose Ahead-of-Time (AoT) P-Tuning, a novel parameter-efficient fine-tuning method for pre-trained Language Models (LMs) that adds input-dependent bias before each Transformer layer.

Benchmarking
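
A minimal sketch of the input-dependent bias described above, assuming a single per-layer embedding table indexed by input ids (the paper also discusses cheaper factorized parameterizations):

```python
import torch
import torch.nn as nn

class AoTBias(nn.Module):
    """One learned bias vector per vocabulary token for one Transformer layer.
    Because the bias depends only on the input ids, it can be looked up ahead
    of time and adds no sequential overhead at inference."""
    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.bias = nn.Embedding(vocab_size, hidden_size)
        nn.init.zeros_(self.bias.weight)  # start as a no-op

    def forward(self, hidden_states, input_ids):
        # hidden_states: (batch, seq, hidden); input_ids: (batch, seq)
        return hidden_states + self.bias(input_ids)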

Linear Interpolation In Parameter Space is Good Enough for Fine-Tuned Language Models

no code implementations • 22 Nov 2022 • Mark Rofin, Nikita Balagansky, Daniil Gavrilov

The simplest way to obtain continuous interpolation between two points in high dimensional space is to draw a line between them.

Attribute • Text Generation
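
The construction named in this entry's title is easy to state concretely: given two fine-tuned checkpoints, interpolate every parameter tensor along the straight line between them. A minimal sketch:

```python
def interpolate_state_dicts(sd_a, sd_b, alpha):
    # theta(alpha) = (1 - alpha) * theta_a + alpha * theta_b for each tensor.
    # alpha = 0 recovers the first model, alpha = 1 the second.
    return {name: (1 - alpha) * sd_a[name] + alpha * sd_b[name] for name in sd_a}
```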

Classifiers are Better Experts for Controllable Text Generation

no code implementations • 15 May 2022 • Askhat Sitdikov, Nikita Balagansky, Daniil Gavrilov, Alexander Markov

This paper proposes a simple method for controllable text generation based on weighting logits with a free-form classifier, namely CAIF sampling.

Text Generation
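
A sketch of logit reweighting with an attribute classifier in the spirit of CAIF; the additive weighting scheme and the candidate-selection note are illustrative assumptions, not the paper's exact specification:

```python
import torch

def classifier_guided_logits(lm_logits, attr_logprobs, alpha=1.0):
    # lm_logits: (vocab,) next-token logits from the language model.
    # attr_logprobs: (vocab,) log p(attribute | prefix + candidate token),
    # obtained by scoring each candidate continuation with a free-form
    # classifier (in practice one would rescore only the top-k candidates).
    return lm_logits + alpha * attr_logprobs

# Sampling then proceeds from softmax(classifier_guided_logits(...)).
```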

PALBERT: Teaching ALBERT to Ponder

1 code implementation • 7 Apr 2022 • Nikita Balagansky, Daniil Gavrilov

The recently proposed PonderNet may be a promising solution for performing an early exit by treating the exit layer's index as a latent variable.
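
For background on the entry above: PonderNet defines per-layer halting probabilities $\lambda_i$ and treats the exit layer index as a latent variable with the distribution below (this is the standard PonderNet formulation, not an excerpt from PALBERT):

```latex
p(\text{exit} = i) \;=\; \lambda_i \prod_{j=1}^{i-1} \bigl(1 - \lambda_j\bigr)
```

Training then minimizes the expected task loss under this distribution, plus a regularizer that keeps the halting probabilities close to a prior.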

FastRPB: a Scalable Relative Positional Encoding for Long Sequence Tasks

1 code implementation • 23 Feb 2022 • Maksim Zubkov, Daniil Gavrilov

In this paper, we explore Linear Transformer models, rethinking their two core components.

Implicit Unlikelihood Training: Improving Neural Text Generation with Reinforcement Learning

1 code implementation • EACL 2021 • Evgeny Lagutin, Daniil Gavrilov, Pavel Kalaidin

Likelihood training and maximization-based decoding result in dull and repetitive generated texts even when using powerful language models (Holtzman et al., 2019).

Language Modelling • reinforcement-learning +2
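
For background, the token-level unlikelihood objective (Welleck et al., 2019) that this line of work builds on penalizes negative candidates such as tokens repeated from the context; a minimal sketch follows (the paper itself optimizes related sequence-level objectives with reinforcement learning):

```python
import torch
import torch.nn.functional as F

def unlikelihood_loss(logits, target, neg_candidates, alpha=1.0):
    # logits: (vocab,) for one position; target: scalar token index;
    # neg_candidates: (num_neg,) indices to push down (e.g., repeated tokens).
    logp = F.log_softmax(logits, dim=-1)
    mle = -logp[target]                                  # standard MLE term
    p_neg = logp[neg_candidates].exp()
    ul = -torch.log1p(-p_neg.clamp(max=1 - 1e-6)).sum()  # sum of log(1 - p(neg))
    return mle + alpha * ul
```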

Weight Squeezing: Reparameterization for Knowledge Transfer and Model Compression

no code implementations • 14 Oct 2020 • Artem Chumachenko, Daniil Gavrilov, Nikita Balagansky, Pavel Kalaidin

We also proposed a variant of Weight Squeezing called Gated Weight Squeezing, which combines fine-tuning a BERT-Medium model with learning a mapping from BERT-Base weights.

General Classification • Model Compression +3
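
A rough sketch of the gated idea described in the entry above: blend a directly trained student weight with a learned linear mapping of the frozen teacher weight through an elementwise gate. The shapes and the form of the mapping here are assumptions for illustration, not the paper's exact construction:

```python
import torch
import torch.nn as nn

class GatedSqueezedLinear(nn.Module):
    def __init__(self, teacher_weight, out_dim, in_dim):
        super().__init__()
        t_out, t_in = teacher_weight.shape
        self.register_buffer('teacher', teacher_weight)  # frozen teacher weight
        self.proj_out = nn.Parameter(torch.randn(out_dim, t_out) / t_out ** 0.5)
        self.proj_in = nn.Parameter(torch.randn(t_in, in_dim) / t_in ** 0.5)
        self.own = nn.Parameter(torch.zeros(out_dim, in_dim))  # fine-tuned part
        self.gate = nn.Parameter(torch.zeros(out_dim, in_dim))

    def weight(self):
        mapped = self.proj_out @ self.teacher @ self.proj_in  # teacher -> student shape
        g = torch.sigmoid(self.gate)                          # elementwise blend
        return g * self.own + (1 - g) * mapped

    def forward(self, x):
        return x @ self.weight().t()
```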
