Search Results for author: Mikhail S. Burtsev

Found 4 papers, 3 papers with code

Memory Transformer

1 code implementation • 20 Jun 2020 • Mikhail S. Burtsev, Yuri Kuratov, Anton Peganov, Grigory V. Sapunov

Adding trainable memory to selectively store local as well as global representations of a sequence is a promising direction to improve the Transformer model.

Tasks: Language Modelling, Machine Translation, +4
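The abstract above describes adding trainable memory to store representations of a sequence. A minimal NumPy sketch of one way to read that idea — prepending a fixed set of trainable memory vectors to the token embeddings before they enter a standard Transformer; all names and sizes here are illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

d_model = 16   # embedding size (illustrative)
num_mem = 4    # number of trainable memory slots (illustrative)
seq_len = 10   # input sequence length

# Trainable parameters: the memory token embeddings.
mem_tokens = rng.normal(size=(num_mem, d_model))

def augment(token_embeddings: np.ndarray) -> np.ndarray:
    """Prepend the memory slots to a (seq_len, d_model) embedding matrix,
    so self-attention can read from and write to them like ordinary tokens."""
    return np.concatenate([mem_tokens, token_embeddings], axis=0)

x = rng.normal(size=(seq_len, d_model))
y = augment(x)
print(y.shape)  # (14, 16): memory slots + original tokens
```

Because the memory slots participate in attention like any other position, the rest of the Transformer needs no architectural change for this part.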

Recurrent Memory Transformer

3 code implementations • 14 Jul 2022 • Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev

We implement a memory mechanism with no changes to the Transformer model by adding special memory tokens to the input or output sequence.

Tasks: Language Modelling
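The recurrence suggested by the title — memory tokens prepended to each segment, with their outputs carried forward to the next segment — can be sketched as follows. The stand-in `transformer_block` and all sizes are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
d_model, num_mem, seg_len = 8, 2, 4

def transformer_block(x: np.ndarray) -> np.ndarray:
    # Stand-in for an unmodified Transformer: any shape-preserving
    # sequence-to-sequence map serves for this sketch.
    return np.tanh(x)

def process(segments, mem):
    # Segment-level recurrence: memory tokens occupy the first
    # positions of each segment's input; their output states become
    # the memory passed to the next segment.
    for seg in segments:
        out = transformer_block(np.concatenate([mem, seg], axis=0))
        mem = out[:num_mem]  # updated memory carried forward
    return mem

segments = [rng.normal(size=(seg_len, d_model)) for _ in range(3)]
mem = np.zeros((num_mem, d_model))  # initial memory state
mem = process(segments, mem)
print(mem.shape)  # (2, 8)
```

The key property is that the recurrence lives entirely in the input/output token positions, so the Transformer itself is unchanged.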

Scaling Transformer to 1M tokens and beyond with RMT

3 code implementations • 19 Apr 2023 • Aydar Bulatov, Yuri Kuratov, Yermek Kapushev, Mikhail S. Burtsev

A major limitation for the broader scope of problems solvable by transformers is the quadratic scaling of computational complexity with input size.

Tasks: Language Modelling, Natural Language Understanding, +1
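To see why the quadratic scaling mentioned above motivates segment-level recurrence, here is a back-of-the-envelope comparison, assuming full self-attention cost grows as sequence_length² while segmented processing costs num_segments × segment_length² (constants and the memory tokens themselves ignored; the segment length is an illustrative choice, not from the paper):

```python
# Illustrative operation counts for processing one long input.
n = 1_048_576   # total tokens (~1M)
seg = 2_048     # segment length (assumed for illustration)

full_attention = n ** 2              # quadratic in total length
segmented = (n // seg) * seg ** 2    # = n * seg: linear in total length

print(full_attention // segmented)   # prints 512, i.e. the n / seg factor
```

The ratio grows linearly with input length, which is what makes million-token inputs tractable under a fixed segment size.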
