1 code implementation • ACL 2022 • Artem Vazhentsev, Gleb Kuzmin, Artem Shelmanov, Akim Tsvigun, Evgenii Tsymbalov, Kirill Fedyanin, Maxim Panov, Alexander Panchenko, Gleb Gusev, Mikhail Burtsev, Manvel Avetisian, Leonid Zhukov
Uncertainty estimation (UE) of model predictions is a crucial step for a variety of tasks such as active learning, misclassification detection, adversarial attack detection, out-of-distribution detection, etc.
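One common UE baseline (not necessarily the method of this paper) estimates uncertainty from several stochastic forward passes, e.g. Monte Carlo dropout, and scores a prediction by its predictive entropy. A minimal sketch, assuming the sampled softmax outputs are already available:

```python
import numpy as np

def predictive_entropy(prob_samples):
    """Uncertainty of one prediction from T stochastic forward passes.

    prob_samples: array of shape (T, C) -- T sampled softmax vectors
    over C classes (e.g. obtained via Monte Carlo dropout).
    """
    mean_probs = prob_samples.mean(axis=0)  # average the T samples
    eps = 1e-12                             # avoid log(0)
    return float(-(mean_probs * np.log(mean_probs + eps)).sum())

# A stable, confident prediction yields low entropy; a conflicted one, high.
confident = np.array([[0.98, 0.01, 0.01]] * 10)
conflicted = np.array([[0.9, 0.05, 0.05]] * 5 + [[0.05, 0.9, 0.05]] * 5)
assert predictive_entropy(conflicted) > predictive_entropy(confident)
```

High-entropy predictions are the natural candidates for misclassification or out-of-distribution flags in the tasks listed above.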
no code implementations • LREC 2022 • Anastasia Chizhikova, Sanzhar Murzakhmetov, Oleg Serikov, Tatiana Shavrina, Mikhail Burtsev
Today, natural language processing heavily relies on pre-trained large language models.
no code implementations • CODI 2021 • Denis Kuznetsov, Dmitry Evseev, Lidia Ostyakova, Oleg Serikov, Daniel Kornev, Mikhail Burtsev
Development environments for spoken dialogue systems are popular today because they enable rapid creation of dialogue systems at a time when the use of voice AI assistants is growing steadily.
1 code implementation • 22 Jan 2025 • Alsu Sagirova, Yuri Kuratov, Mikhail Burtsev
Multi-agent reinforcement learning (MARL) demonstrates significant progress in solving cooperative and competitive multi-agent problems in various environments.
1 code implementation • 2 Dec 2024 • Mikhail Burtsev
Large Language Models demonstrate remarkable mathematical capabilities but at the same time struggle with abstract reasoning and planning.
1 code implementation • 5 Jul 2024 • Ivan Rodkin, Yuri Kuratov, Aydar Bulatov, Mikhail Burtsev
This paper addresses the challenge of creating a neural architecture for very long sequences that requires constant time for processing new information at each time step.
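The constant-time property means each new token is folded into a fixed-size recurrent state rather than attended against the whole history. A toy sketch of that idea (dimensions and the update rule are illustrative, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
D, M = 8, 4                                    # token dim, fixed memory slots
W = rng.normal(size=(M * D + D, M * D)) * 0.1  # shared recurrent update weights

def step(memory, token):
    """Fold one new token into a fixed-size memory: O(1) work per step."""
    x = np.concatenate([memory.ravel(), token])
    return np.tanh(x @ W).reshape(M, D)

memory = np.zeros((M, D))
for t in range(1000):       # per-step cost does not grow with t
    memory = step(memory, rng.normal(size=D))
assert memory.shape == (M, D)
```

Because the state never grows, processing a million-element sequence costs the same per step as processing the first element.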
1 code implementation • 5 Jul 2024 • Petr Anokhin, Nikita Semenov, Artyom Sorokin, Dmitry Evseev, Mikhail Burtsev, Evgeny Burnaev
Advancements in the capabilities of Large Language Models (LLMs) have created a promising foundation for developing autonomous agents.
no code implementations • 20 Jun 2024 • Alsu Sagirova, Mikhail Burtsev
Even though Transformers are extensively used for Natural Language Processing tasks, especially for machine translation, they lack an explicit memory to store key concepts of processed texts.
4 code implementations • 14 Jun 2024 • Yuri Kuratov, Aydar Bulatov, Petr Anokhin, Ivan Rodkin, Dmitry Sorokin, Artyom Sorokin, Mikhail Burtsev
The BABILong benchmark is extendable to any length to support the evaluation of new upcoming models with increased capabilities, and we provide splits up to 10 million token lengths.
2 code implementations • 16 Feb 2024 • Yuri Kuratov, Aydar Bulatov, Petr Anokhin, Dmitry Sorokin, Artyom Sorokin, Mikhail Burtsev
This paper addresses the challenge of processing long documents using generative transformer models.
1 code implementation • 29 Nov 2023 • Alsu Sagirova, Mikhail Burtsev
Conversely, the second group relies on the attention mechanism of the long input encoding model to facilitate multi-hop reasoning.
1 code implementation • 2 Nov 2023 • Alla Chepurova, Aydar Bulatov, Yuri Kuratov, Mikhail Burtsev

In this study, we propose to include node neighborhoods as additional information to improve KGC methods based on language models.
no code implementations • 13 Jun 2023 • Dmitry Karpov, Mikhail Burtsev
This article investigates knowledge transfer from the RuQTopics dataset.
1 code implementation • 9 Jan 2023 • Akim Tsvigun, Ivan Lysenko, Danila Sedashov, Ivan Lazichny, Eldar Damirov, Vladimir Karlov, Artemy Belousov, Leonid Sanochkin, Maxim Panov, Alexander Panchenko, Mikhail Burtsev, Artem Shelmanov
Active Learning (AL) is a technique developed to reduce the amount of annotation required to achieve a certain level of machine learning model performance.
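The core AL loop repeatedly asks the model which unlabeled examples it is least sure about and sends only those to annotators. A minimal sketch of one standard acquisition function, least-confidence sampling (a baseline, not necessarily the strategy proposed in this paper):

```python
import numpy as np

def uncertainty_sampling(probs, k):
    """Pick the k pool examples the model is least confident about.

    probs: (N, C) predicted class probabilities on the unlabeled pool.
    Returns indices of the k examples with the lowest top-class
    probability (least-confidence acquisition).
    """
    confidence = probs.max(axis=1)
    return np.argsort(confidence)[:k]

pool = np.array([[0.9, 0.1],    # confident -> leave unlabeled for now
                 [0.55, 0.45],  # uncertain -> worth annotating
                 [0.6, 0.4]])
picked = uncertainty_sampling(pool, k=1)
assert picked.tolist() == [1]
```

Annotating only the selected examples, retraining, and repeating is what lets AL reach a target accuracy with far fewer labels than random sampling.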
2 code implementations • 12 Nov 2022 • Shrestha Mohanty, Negar Arabzadeh, Milagro Teruel, Yuxuan Sun, Artem Zholus, Alexey Skrynnik, Mikhail Burtsev, Kavya Srinet, Aleksandr Panov, Arthur Szlam, Marc-Alexandre Côté, Julia Kiseleva
Human intelligence adapts remarkably quickly to new tasks and environments.
1 code implementation • 1 Nov 2022 • Alexey Skrynnik, Zoya Volovikova, Marc-Alexandre Côté, Anton Voronov, Artem Zholus, Negar Arabzadeh, Shrestha Mohanty, Milagro Teruel, Ahmed Awadallah, Aleksandr Panov, Mikhail Burtsev, Julia Kiseleva
The adoption of pre-trained language models to generate action plans for embodied agents is a promising research strategy.
1 code implementation • 27 Jul 2022 • Artyom Sorokin, Nazar Buzun, Leonid Pugachev, Mikhail Burtsev
This requires storing prohibitively large amounts of intermediate data if a sequence consists of thousands or even millions of elements and, as a result, makes learning very long-term dependencies infeasible.
1 code implementation • 27 May 2022 • Julia Kiseleva, Alexey Skrynnik, Artem Zholus, Shrestha Mohanty, Negar Arabzadeh, Marc-Alexandre Côté, Mohammad Aliannejadi, Milagro Teruel, Ziming Li, Mikhail Burtsev, Maartje ter Hoeve, Zoya Volovikova, Aleksandr Panov, Yuxuan Sun, Kavya Srinet, Arthur Szlam, Ahmed Awadallah
Starting from a very young age, humans acquire new skills and learn how to solve new tasks either by imitating the behavior of others or by following provided natural language instructions.
no code implementations • 5 May 2022 • Julia Kiseleva, Ziming Li, Mohammad Aliannejadi, Shrestha Mohanty, Maartje ter Hoeve, Mikhail Burtsev, Alexey Skrynnik, Artem Zholus, Aleksandr Panov, Kavya Srinet, Arthur Szlam, Yuxuan Sun, Marc-Alexandre Côté, Katja Hofmann, Ahmed Awadallah, Linar Abdrazakov, Igor Churin, Putra Manggala, Kata Naszadi, Michiel van der Meer, Taewoon Kim
The primary goal of the competition is to approach the problem of building interactive agents that learn to solve a task while being provided with grounded natural language instructions in a collaborative environment.
1 code implementation • 4 May 2022 • Alina Kolesnikova, Yuri Kuratov, Vasily Konovalov, Mikhail Burtsev
We propose two simple yet effective alignment techniques to enable knowledge distillation to student models with a reduced vocabulary.
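When the student's vocabulary is smaller than the teacher's, the teacher's output distribution must be projected onto the student's token set before a distillation loss can be computed. One simple alignment scheme (an illustration, not necessarily one of the paper's two techniques; the vocabularies below are hypothetical):

```python
import numpy as np

# Hypothetical vocabularies; the student's is a subset of the teacher's.
teacher_vocab = ["the", "cat", "sat", "mat", "##s"]
student_vocab = ["the", "cat", "sat", "[UNK]"]

def align_teacher_probs(teacher_probs):
    """Project teacher token probabilities onto the student vocabulary.

    Shared tokens keep their probability mass; mass for tokens the
    student lacks is funneled into [UNK], so the result still sums to 1.
    """
    idx = {tok: i for i, tok in enumerate(student_vocab)}
    out = np.zeros(len(student_vocab))
    for tok, p in zip(teacher_vocab, teacher_probs):
        out[idx.get(tok, idx["[UNK]"])] += p
    return out

probs = align_teacher_probs(np.array([0.4, 0.3, 0.2, 0.05, 0.05]))
assert abs(probs.sum() - 1.0) < 1e-9
assert abs(probs[-1] - 0.1) < 1e-9   # "mat" + "##s" mass -> [UNK]
```

With the distributions aligned, a standard KL-divergence distillation loss between teacher and student logits becomes well-defined.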
no code implementations • 13 Oct 2021 • Julia Kiseleva, Ziming Li, Mohammad Aliannejadi, Shrestha Mohanty, Maartje ter Hoeve, Mikhail Burtsev, Alexey Skrynnik, Artem Zholus, Aleksandr Panov, Kavya Srinet, Arthur Szlam, Yuxuan Sun, Katja Hofmann, Michel Galley, Ahmed Awadallah
Starting from a very young age, humans acquire new skills and learn how to solve new tasks either by imitating the behavior of others or by following provided natural language instructions.
1 code implementation • EMNLP 2021 • Mohammad Aliannejadi, Julia Kiseleva, Aleksandr Chuklin, Jeffrey Dalton, Mikhail Burtsev
Enabling open-domain dialogue systems to ask clarifying questions when appropriate is an important direction for improving the quality of the system response.
1 code implementation • 21 Jul 2021 • Mikhail Burtsev, Anna Rumshisky
Transformer-based encoder-decoder models produce a fused token-wise representation after every encoder layer.
no code implementations • 31 Jan 2021 • Leonid Pugachev, Mikhail Burtsev
Recent techniques for the task of short text clustering often rely on word embeddings as a transfer learning component.
Ranked #2 on Short Text Clustering on Stackoverflow
no code implementations • 1 Jan 2021 • Mikhail Burtsev, Yurii Kuratov, Anton Peganov, Grigory V. Sapunov
Adding trainable memory to selectively store local as well as global representations of a sequence is a promising direction to improve the Transformer model.
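The simplest form of this idea is to prepend a small set of trainable memory tokens to the input so that self-attention can read and write global state through them. A toy sketch of the input assembly (dimensions are illustrative, and the memory here is random rather than trained):

```python
import numpy as np

rng = np.random.default_rng(2)
n_mem, seq_len, d = 4, 16, 32
memory = rng.normal(size=(n_mem, d))   # stand-in for trainable memory tokens
tokens = rng.normal(size=(seq_len, d)) # embedded input sequence

# Prepend memory tokens; every self-attention layer then lets ordinary
# tokens attend to (read) and update (write) this shared global state.
x = np.concatenate([memory, tokens], axis=0)
assert x.shape == (n_mem + seq_len, d)
```

Because the memory length is fixed, the extra attention cost is a small constant regardless of sequence length.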
3 code implementations • 23 Sep 2020 • Mohammad Aliannejadi, Julia Kiseleva, Aleksandr Chuklin, Jeff Dalton, Mikhail Burtsev
The main aim of conversational systems is to return an appropriate answer in response to user requests.
no code implementations • 5 Feb 2020 • Pavel Gulyaev, Eugenia Elistratova, Vasily Konovalov, Yuri Kuratov, Leonid Pugachev, Mikhail Burtsev
The organizers introduced the Schema-Guided Dialogue (SGD) dataset with multi-domain conversations and released a zero-shot dialogue state tracking model.
1 code implementation • 9 Oct 2019 • Ivan Skorokhodov, Mikhail Burtsev
We present multi-point optimization: an optimization technique that allows training several models simultaneously without needing to store the parameters of each one individually.
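One way such parameter sharing can look, assuming the models are constrained to a low-dimensional affine subspace of parameter space (a hedged sketch, not necessarily this paper's exact parameterization): store one origin and a few shared directions, and represent each model by a tiny coordinate vector.

```python
import numpy as np

rng = np.random.default_rng(1)
D = 100                                   # flattened parameter dimension
origin = rng.normal(size=D)               # shared origin (stored once)
U, V = rng.normal(size=D), rng.normal(size=D)  # two shared directions

# Each "model" is just a 2-D coordinate in the plane spanned by U and V,
# so k models cost 2k extra floats instead of k * D parameters.
coords = rng.normal(size=(5, 2))

def model_params(i):
    """Reconstruct the full parameter vector of model i on demand."""
    a, b = coords[i]
    return origin + a * U + b * V

assert model_params(0).shape == (D,)
```

Only `origin`, `U`, `V`, and `coords` are optimized, so the per-model storage overhead is constant in `D`.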
2 code implementations • 31 Jan 2019 • Emily Dinan, Varvara Logacheva, Valentin Malykh, Alexander Miller, Kurt Shuster, Jack Urbanek, Douwe Kiela, Arthur Szlam, Iulian Serban, Ryan Lowe, Shrimai Prabhumoye, Alan W. Black, Alexander Rudnicky, Jason Williams, Joelle Pineau, Mikhail Burtsev, Jason Weston
We describe the setting and results of the ConvAI2 NeurIPS competition that aims to further the state-of-the-art in open-domain chatbots.
no code implementations • ACL 2018 • Mikhail Burtsev, Alexander Seliverstov, Rafael Airapetyan, Mikhail Arkhipov, Dilyara Baymurzina, Nickolay Bushkov, Olga Gureenkova, Taras Khakhulin, Yuri Kuratov, Denis Kuznetsov, Alexey Litinsky, Varvara Logacheva, Alexey Lymar, Valentin Malykh, Maxim Petrov, Vadim Polulyakh, Leonid Pugachev, Alexey Sorokin, Maria Vikhreva, Marat Zaynutdinov
It supports modular as well as end-to-end approaches to implementation of conversational agents.