Search Results for author: Mikhail S. Burtsev

Found 4 papers, 3 papers with code

Memory Transformer

1 code implementation • 20 Jun 2020 • Mikhail S. Burtsev, Yuri Kuratov, Anton Peganov, Grigory V. Sapunov

Adding trainable memory to selectively store local as well as global representations of a sequence is a promising direction to improve the Transformer model.

Tasks: Language Modelling, Machine Translation, +4
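The abstract above describes adding trainable memory to store representations of a sequence. A minimal NumPy sketch of one way to read that idea — prepending a fixed set of trainable memory vectors to the token embeddings before they enter a standard Transformer; all names and sizes here are illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

d_model = 16   # embedding size (illustrative)
num_mem = 4    # number of trainable memory slots (illustrative)
seq_len = 10   # input sequence length

# Trainable parameters: the memory token embeddings.
mem_tokens = rng.normal(size=(num_mem, d_model))

def augment(token_embeddings: np.ndarray) -> np.ndarray:
    """Prepend the memory slots to a (seq_len, d_model) embedding matrix,
    so self-attention can read from and write to them like ordinary tokens."""
    return np.concatenate([mem_tokens, token_embeddings], axis=0)

x = rng.normal(size=(seq_len, d_model))
y = augment(x)
print(y.shape)  # (14, 16): memory slots + original tokens
```

Because the memory slots participate in attention like any other position, the rest of the Transformer needs no architectural change for this part.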

Recurrent Memory Transformer

3 code implementations • 14 Jul 2022 • Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev

We implement a memory mechanism with no changes to the Transformer model by adding special memory tokens to the input or output sequence.

Tasks: Language Modelling
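The recurrence suggested by the title — memory tokens prepended to each segment, with their outputs carried forward to the next segment — can be sketched as follows. The stand-in `transformer_block` and all sizes are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
d_model, num_mem, seg_len = 8, 2, 4

def transformer_block(x: np.ndarray) -> np.ndarray:
    # Stand-in for an unmodified Transformer: any shape-preserving
    # sequence-to-sequence map serves for this sketch.
    return np.tanh(x)

def process(segments, mem):
    # Segment-level recurrence: memory tokens occupy the first
    # positions of each segment's input; their output states become
    # the memory passed to the next segment.
    for seg in segments:
        out = transformer_block(np.concatenate([mem, seg], axis=0))
        mem = out[:num_mem]  # updated memory carried forward
    return mem

segments = [rng.normal(size=(seg_len, d_model)) for _ in range(3)]
mem = np.zeros((num_mem, d_model))  # initial memory state
mem = process(segments, mem)
print(mem.shape)  # (2, 8)
```

The key property is that the recurrence lives entirely in the input/output token positions, so the Transformer itself is unchanged.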

Scaling Transformer to 1M tokens and beyond with RMT

3 code implementations • 19 Apr 2023 • Aydar Bulatov, Yuri Kuratov, Yermek Kapushev, Mikhail S. Burtsev

A major limitation for the broader scope of problems solvable by transformers is the quadratic scaling of computational complexity with input size.

Tasks: Language Modelling, Natural Language Understanding, +1
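To see why the quadratic scaling mentioned above motivates segment-level recurrence, here is a back-of-the-envelope comparison, assuming full self-attention cost grows as sequence_length² while segmented processing costs num_segments × segment_length² (constants and the memory tokens themselves ignored; the segment length is an illustrative choice, not from the paper):

```python
# Illustrative operation counts for processing one long input.
n = 1_048_576   # total tokens (~1M)
seg = 2_048     # segment length (assumed for illustration)

full_attention = n ** 2              # quadratic in total length
segmented = (n // seg) * seg ** 2    # = n * seg: linear in total length

print(full_attention // segmented)   # prints 512, i.e. the n / seg factor
```

The ratio grows linearly with input length, which is what makes million-token inputs tractable under a fixed segment size.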
