Search Results for author: Alexander Nikulin

Found 18 papers, 13 papers with code

Zero-Shot Adaptation of Behavioral Foundation Models to Unseen Dynamics

1 code implementation • 19 May 2025 • Maksim Bobrin, Ilya Zisman, Alexander Nikulin, Vladislav Kurenkov, Dmitry Dylov

In this work, we demonstrate that the Forward-Backward (FB) representation, one of the methods in the BFM family, cannot distinguish between distinct dynamics, leading to interference among the latent directions that parametrize different policies.
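
For readers unfamiliar with the FB framework, a brief recap of the standard Forward-Backward factorization from the prior zero-shot RL literature may help (background notation only, not text or results from the paper above):

```latex
% Background only: the standard FB factorization from the zero-shot RL literature.
% F : S x A x Z -> R^d is the forward map, B : S -> R^d the backward map,
% rho a reference state distribution, and pi_z the policy indexed by z in Z.
M^{\pi_z}(s_0, a_0, X) \;\approx\; \int_X F(s_0, a_0, z)^\top B(s')\, \rho(\mathrm{d}s'),
\qquad
\pi_z(s) \in \arg\max_a F(s, a, z)^\top z .
% At test time a reward r is handled zero-shot via z = \mathbb{E}_{s \sim \rho}[\, r(s)\, B(s)\,].
```

Because every policy in the family is read out from this single shared latent space, the abstract's observation is that distinct dynamics are not kept separate and end up interfering along the same latent directions.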

Yes, Q-learning Helps Offline In-Context RL

no code implementations • 24 Feb 2025 • Denis Tarasov, Alexander Nikulin, Ilya Zisman, Albina Klepach, Andrei Polubarov, Nikita Lyubaykin, Alexander Derevyagin, Igor Kiselev, Vladislav Kurenkov

Existing offline in-context reinforcement learning (ICRL) methods have predominantly relied on supervised training objectives, which are known to have limitations in offline RL settings.

In-Context Reinforcement Learning • MuJoCo • +3

Latent Action Learning Requires Supervision in the Presence of Distractors

no code implementations • 1 Feb 2025 • Alexander Nikulin, Ilya Zisman, Denis Tarasov, Nikita Lyubaykin, Andrei Polubarov, Igor Kiselev, Vladislav Kurenkov

Recently, latent action learning, pioneered by Latent Action Policies (LAPO), has shown remarkable pre-training efficiency on observation-only data, offering the potential to leverage the vast amounts of video available on the web for embodied AI.

Vintix: Action Model via In-Context Reinforcement Learning

1 code implementation • 31 Jan 2025 • Andrey Polubarov, Nikita Lyubaykin, Alexander Derevyagin, Ilya Zisman, Denis Tarasov, Alexander Nikulin, Vladislav Kurenkov

In-Context Reinforcement Learning (ICRL) represents a promising paradigm for developing generalist agents that learn at inference time through trial-and-error interactions, analogous to how large language models adapt contextually, but with a focus on reward maximization.

Decision Making • In-Context Reinforcement Learning • +2

N-Gram Induction Heads for In-Context RL: Improving Stability and Reducing Data Needs

no code implementations • 4 Nov 2024 • Ilya Zisman, Alexander Nikulin, Andrei Polubarov, Nikita Lyubaykin, Vladislav Kurenkov

In-context learning allows models like transformers to adapt to new tasks from a few examples without updating their weights, a desirable trait for reinforcement learning (RL).

In-Context Learning • Reinforcement Learning (RL)

XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning

no code implementations • 13 Jun 2024 • Alexander Nikulin, Ilya Zisman, Alexey Zemtsov, Viacheslav Sinii, Vladislav Kurenkov, Sergey Kolesnikov

With this substantial effort, we aim to democratize research in the rapidly growing field of in-context reinforcement learning and provide a solid foundation for further scaling.

In-Context Learning • In-Context Reinforcement Learning • +2

In-Context Reinforcement Learning for Variable Action Spaces

1 code implementation • 20 Dec 2023 • Viacheslav Sinii, Alexander Nikulin, Vladislav Kurenkov, Ilya Zisman, Sergey Kolesnikov

Recently, it has been shown that transformers pre-trained on diverse datasets with multi-episode contexts can generalize to new reinforcement learning tasks in-context.

In-Context Reinforcement Learning • Multi-Armed Bandits • +2

Emergence of In-Context Reinforcement Learning from Noise Distillation

1 code implementation • 19 Dec 2023 • Ilya Zisman, Vladislav Kurenkov, Alexander Nikulin, Viacheslav Sinii, Sergey Kolesnikov

Recently, extensive studies in Reinforcement Learning have been carried out on the ability of transformers to adapt in-context to various environments and tasks.

In-Context Reinforcement Learning • reinforcement-learning • +1

XLand-MiniGrid: Scalable Meta-Reinforcement Learning Environments in JAX

2 code implementations • 19 Dec 2023 • Alexander Nikulin, Vladislav Kurenkov, Ilya Zisman, Artem Agarkov, Viacheslav Sinii, Sergey Kolesnikov

Inspired by the diversity and depth of XLand and the simplicity and minimalism of MiniGrid, we present XLand-MiniGrid, a suite of tools and grid-world environments for meta-reinforcement learning research.

Diversity • Meta-Learning • +3
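
The entry above does not show why JAX is the enabling ingredient, so here is a toy sketch of the underlying design principle (hypothetical code, not the xland-minigrid API; all names and sizes are assumptions): when reset and step are pure functions over arrays, they can be jit-compiled and vmapped to run thousands of environment instances in parallel on an accelerator.

```python
# Toy illustration of the JAX environment design principle behind suites like
# XLand-MiniGrid: step/reset are pure functions of array state, so they can be
# jit-compiled and vmapped over thousands of environment instances at once.
# NOTE: hypothetical code, not the xland-minigrid API.
import jax
import jax.numpy as jnp

GRID = 9  # hypothetical grid side length

def reset(key):
    # State = (agent position, goal position), both sampled uniformly.
    k1, k2 = jax.random.split(key)
    agent = jax.random.randint(k1, (2,), 0, GRID)
    goal = jax.random.randint(k2, (2,), 0, GRID)
    return agent, goal

def step(state, action):
    agent, goal = state
    # Actions 0..3 move the agent; clip to stay inside the grid.
    moves = jnp.array([[0, 1], [0, -1], [1, 0], [-1, 0]])
    agent = jnp.clip(agent + moves[action], 0, GRID - 1)
    reward = jnp.all(agent == goal).astype(jnp.float32)
    return (agent, goal), reward

# Batch 8192 independent environments with vmap and compile the whole step.
keys = jax.random.split(jax.random.PRNGKey(0), 8192)
states = jax.vmap(reset)(keys)
actions = jnp.zeros(8192, dtype=jnp.int32)
batched_step = jax.jit(jax.vmap(step))
states, rewards = batched_step(states, actions)
```

The same pattern is what lets a meta-RL training loop keep both the environments and the learner on the same GPU/TPU, which is the kind of scaling the suite above is built around.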

Katakomba: Tools and Benchmarks for Data-Driven NetHack

1 code implementation • NeurIPS 2023 • Vladislav Kurenkov, Alexander Nikulin, Denis Tarasov, Sergey Kolesnikov

NetHack is known as the frontier of reinforcement learning research where learning-based methods still need to catch up to rule-based solutions.

D4RL • NetHack • +3

Revisiting the Minimalist Approach to Offline Reinforcement Learning

3 code implementations • NeurIPS 2023 • Denis Tarasov, Vladislav Kurenkov, Alexander Nikulin, Sergey Kolesnikov

Recent years have witnessed significant advancements in offline reinforcement learning (RL), resulting in the development of numerous algorithms with varying degrees of complexity.

D4RL • Offline RL • +3

Anti-Exploration by Random Network Distillation

3 code implementations • 31 Jan 2023 • Alexander Nikulin, Vladislav Kurenkov, Denis Tarasov, Sergey Kolesnikov

Despite the success of Random Network Distillation (RND) in various domains, it has been shown to be insufficiently discriminative to serve as an uncertainty estimator for penalizing out-of-distribution actions in offline reinforcement learning.

D4RL
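
For context on the mechanism mentioned above, here is a minimal sketch of how an RND-style anti-exploration penalty is typically wired into an offline actor-critic update (hypothetical networks and shapes; this is not the paper's implementation):

```python
# Minimal sketch of an RND-based anti-exploration penalty for offline RL.
# A frozen random "target" network and a trained "predictor" network; the
# prediction error on (s, a) acts as a novelty / OOD score that is subtracted
# from the critic target, discouraging out-of-distribution actions.
# Hypothetical dimensions and networks; not the paper's code.
import torch
import torch.nn as nn

def mlp(inp, out):
    return nn.Sequential(nn.Linear(inp, 256), nn.ReLU(), nn.Linear(256, out))

state_dim, action_dim, emb_dim = 17, 6, 32
target = mlp(state_dim + action_dim, emb_dim)      # frozen random network
predictor = mlp(state_dim + action_dim, emb_dim)   # trained to match the target
for p in target.parameters():
    p.requires_grad_(False)

def anti_exploration_bonus(s, a):
    x = torch.cat([s, a], dim=-1)
    return ((predictor(x) - target(x)) ** 2).mean(dim=-1)  # high for OOD (s, a)

# Inside a SAC/TD3-style offline update, the bonus penalizes the critic target:
#   y = r + gamma * (min_Q(s', a') - alpha * bonus(s', a'))
# while the predictor itself is fit only on dataset (s, a) pairs:
opt = torch.optim.Adam(predictor.parameters(), lr=3e-4)
s = torch.randn(256, state_dim)             # placeholder batch from the dataset
a = torch.randn(256, action_dim)
loss = anti_exploration_bonus(s, a).mean()  # low error on in-distribution pairs
opt.zero_grad(); loss.backward(); opt.step()
```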

Q-Ensemble for Offline RL: Don't Scale the Ensemble, Scale the Batch Size

2 code implementations • 20 Nov 2022 • Alexander Nikulin, Vladislav Kurenkov, Denis Tarasov, Dmitry Akimov, Sergey Kolesnikov

Training large neural networks is known to be time-consuming, with the learning duration taking days or even weeks.

Offline RL

Let Offline RL Flow: Training Conservative Agents in the Latent Space of Normalizing Flows

2 code implementations • 20 Nov 2022 • Dmitriy Akimov, Vladislav Kurenkov, Alexander Nikulin, Denis Tarasov, Sergey Kolesnikov

This Normalizing Flows action encoder is pre-trained in a supervised manner on the offline dataset, and then an additional policy model, a controller acting in the latent space, is trained via reinforcement learning.

Offline RL • reinforcement-learning • +2
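
To make the two-stage pipeline described above concrete, here is a hypothetical stand-in sketch: a single state-conditioned affine layer plays the role of the full Normalizing Flow stack, and all names, dimensions, and the tanh bound on the latent are assumptions rather than the paper's code.

```python
# Conceptual sketch of the two-stage pipeline: (1) pre-train an invertible,
# state-conditioned action decoder on the offline dataset, (2) train a policy
# ("controller") that acts in the decoder's latent space, with environment
# actions recovered by decoding. Hypothetical stand-in, not the paper's code.
import torch
import torch.nn as nn

state_dim, action_dim, latent_dim = 17, 6, 6  # invertible => matching dims

class AffineFlowDecoder(nn.Module):
    """One state-conditioned affine layer standing in for a full flow:
    a = z * exp(scale(s)) + shift(s), invertible in closed form."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 128), nn.ReLU(),
                                 nn.Linear(128, 2 * action_dim))
    def forward(self, s, z):                  # latent -> action
        scale, shift = self.net(s).chunk(2, dim=-1)
        return z * torch.exp(scale) + shift
    def inverse(self, s, a):                  # action -> latent (pre-training)
        scale, shift = self.net(s).chunk(2, dim=-1)
        return (a - shift) * torch.exp(-scale)

decoder = AffineFlowDecoder()
controller = nn.Sequential(nn.Linear(state_dim, 128), nn.ReLU(),
                           nn.Linear(128, latent_dim), nn.Tanh())

# Stage 1 (supervised): fit the decoder so that dataset actions map back to a
# simple latent distribution, e.g. by maximum likelihood under a unit Gaussian.
# Stage 2 (RL): the controller outputs z; the critic and environment only ever
# see decoded actions, which stay close to the dataset's action manifold.
s = torch.randn(256, state_dim)               # placeholder batch of states
z = controller(s)
a = decoder(s, z)                              # action used for the RL update
```

Bounding the latent (here with a tanh) is one simple way to keep decoded actions near the data the flow was trained on; the paper's actual conservatism mechanism may differ.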

CORL: Research-oriented Deep Offline Reinforcement Learning Library

5 code implementations • NeurIPS 2023 • Denis Tarasov, Alexander Nikulin, Dmitry Akimov, Vladislav Kurenkov, Sergey Kolesnikov

CORL is an open-source library that provides thoroughly benchmarked single-file implementations of both deep offline and offline-to-online reinforcement learning algorithms.

Benchmarking • D4RL • +2

MineRL Diamond 2021 Competition: Overview, Results, and Lessons Learned

no code implementations • 17 Feb 2022 • Anssi Kanervisto, Stephanie Milani, Karolis Ramanauskas, Nicholay Topin, Zichuan Lin, Junyou Li, Jianing Shi, Deheng Ye, Qiang Fu, Wei Yang, Weijun Hong, Zhongyue Huang, Haicheng Chen, Guangjun Zeng, Yue Lin, Vincent Micheli, Eloi Alonso, François Fleuret, Alexander Nikulin, Yury Belousov, Oleg Svidchenko, Aleksei Shpilman

With this in mind, we hosted the third edition of the MineRL ObtainDiamond competition, MineRL Diamond 2021, with a separate track in which any kind of solution was permitted, in order to promote the participation of newcomers.
