no code implementations • RepL4NLP (ACL) 2022 • Daniil Gavrilov, Nikita Balagansky
Currently, pre-trained models can be considered the default choice for a wide range of NLP tasks.
2 code implementations • 16 Feb 2024 • Yaroslav Aksenov, Nikita Balagansky, Sofia Maria Lo Cicero Vaina, Boris Shaposhnikov, Alexey Gorbatovski, Daniil Gavrilov
Advancing the frontier of subquadratic architectures for Language Models (LMs) is crucial in the rapidly evolving field of natural language processing.
no code implementations • 18 May 2023 • Daniil Gavrilov, Nikita Balagansky
In this paper, we propose Ahead-of-Time (AoT) P-Tuning, a novel parameter-efficient fine-tuning method for pre-trained Language Models (LMs) that adds an input-dependent bias before each Transformer layer.
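To make the one-line description concrete, here is a minimal sketch of the core idea: a trainable, input-dependent bias (looked up from the token ids) is added to the hidden states before a frozen Transformer layer. Class and parameter names are illustrative, not the paper's.

```python
import torch.nn as nn


class AoTBiasLayer(nn.Module):
    """Wraps a frozen Transformer layer and adds a trainable,
    input-dependent bias to its input (hypothetical sketch)."""

    def __init__(self, layer: nn.Module, vocab_size: int, hidden_size: int):
        super().__init__()
        self.layer = layer                                  # frozen pre-trained layer
        self.bias = nn.Embedding(vocab_size, hidden_size)   # trainable per-token bias
        nn.init.zeros_(self.bias.weight)                    # start as a no-op

    def forward(self, hidden_states, input_ids):
        # Add the per-token bias before running the original layer.
        return self.layer(hidden_states + self.bias(input_ids))
```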
no code implementations • 18 May 2023 • Sofia Maria Lo Cicero Vaina, Nikita Balagansky, Daniil Gavrilov
We evaluate our methods on Plaid, SSD, and CDCD DLMs and create a cohesive perspective on their generation workflows.
no code implementations • 22 Nov 2022 • Mark Rofin, Nikita Balagansky, Daniil Gavrilov
The simplest way to obtain a continuous interpolation between two points in a high-dimensional space is to draw a line between them.
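As a concrete illustration (a sketch, not the paper's code): applied to two fine-tuned checkpoints, drawing that line amounts to taking a convex combination of their parameters.

```python
import torch


def interpolate_state_dicts(theta_a, theta_b, alpha):
    """Return (1 - alpha) * theta_a + alpha * theta_b for every shared parameter."""
    return {name: (1.0 - alpha) * theta_a[name] + alpha * theta_b[name]
            for name in theta_a}


# Toy example: two "checkpoints" with a single weight tensor each.
a = {"w": torch.zeros(2, 2)}
b = {"w": torch.ones(2, 2)}
print(interpolate_state_dicts(a, b, alpha=0.25)["w"])  # all entries equal 0.25
```

Evaluating the model at several values of alpha traces the straight line between the two points in parameter space.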
no code implementations • 15 May 2022 • Askhat Sitdikov, Nikita Balagansky, Daniil Gavrilov, Alexander Markov
This paper proposes CAIF sampling, a simple method for controllable text generation based on weighting logits with a free-form classifier.
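Schematically, the idea of weighting logits with a classifier can be read as follows (a hedged sketch based only on the sentence above; the helper names and the exact combination rule are assumptions, not the paper's implementation): the language model's next-token logits are shifted by how strongly a classifier rates each candidate continuation for the target attribute.

```python
import torch


def classifier_weighted_step(lm_logits, classifier_log_probs, alpha=1.0):
    """Combine LM logits with per-candidate classifier scores (illustrative).

    lm_logits            : [vocab] next-token logits from the language model
    classifier_log_probs : [vocab] log p(attribute | prefix + candidate token)
    alpha                : strength of the attribute control
    """
    weighted = lm_logits + alpha * classifier_log_probs
    return torch.softmax(weighted, dim=-1)
```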
1 code implementation • 7 Apr 2022 • Nikita Balagansky, Daniil Gavrilov
The recently proposed PonderNet may be a promising solution for performing an early exit, since it treats the exit layer's index as a latent variable.
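For intuition, here is an illustrative sketch (not the paper's implementation) of how PonderNet-style halting turns per-layer halting probabilities into a distribution over the exit layer: the probability of exiting at layer i is the probability of halting there times the probability of not having halted at any earlier layer.

```python
import torch


def exit_distribution(halt_probs):
    """halt_probs: [num_layers] per-layer halting probabilities lambda_i.

    Returns p_i = lambda_i * prod_{j < i} (1 - lambda_j), i.e. the probability
    that the latent exit-layer index equals i.
    """
    keep_going = torch.cumprod(1.0 - halt_probs, dim=0)
    not_halted_before = torch.cat([torch.ones(1), keep_going[:-1]])
    return halt_probs * not_halted_before


print(exit_distribution(torch.tensor([0.1, 0.3, 0.5, 1.0])))
# tensor([0.1000, 0.2700, 0.3150, 0.3150]) -- sums to 1 when the last lambda is 1
```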
no code implementations • 14 Oct 2020 • Artem Chumachenko, Daniil Gavrilov, Nikita Balagansky, Pavel Kalaidin
We also proposed a variant of Weight Squeezing called Gated Weight Squeezing, in which we combined fine-tuning a BERT-Medium model with learning a mapping from BERT-Base weights.
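A rough sketch of the gating idea as described in the sentence above (the shapes, names, and gate form are my assumptions, not the paper's code): each student weight is a learned, gated mix of a weight mapped down from the frozen teacher and the student's own fine-tuned weight.

```python
import torch
import torch.nn as nn


class GatedSqueezedLinear(nn.Module):
    """Student weight = gate * (mapping applied to teacher weight)
                      + (1 - gate) * student's own fine-tuned weight (sketch)."""

    def __init__(self, teacher_weight, student_out, student_in):
        super().__init__()
        t_out, t_in = teacher_weight.shape
        self.register_buffer("teacher_weight", teacher_weight.detach())  # frozen BERT-Base weight
        self.map_out = nn.Parameter(torch.randn(student_out, t_out) * 0.02)
        self.map_in = nn.Parameter(torch.randn(t_in, student_in) * 0.02)
        self.own_weight = nn.Parameter(torch.zeros(student_out, student_in))
        self.gate = nn.Parameter(torch.zeros(student_out, student_in))

    def forward(self, x):
        mapped = self.map_out @ self.teacher_weight @ self.map_in
        g = torch.sigmoid(self.gate)
        weight = g * mapped + (1.0 - g) * self.own_weight
        return x @ weight.t()
```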