Search Results for author: Jakub Grudzien Kuba

Found 11 papers, 8 papers with code

Functional Graphical Models: Structure Enables Offline Data-Driven Optimization

no code implementations • 8 Jan 2024 • Jakub Grudzien Kuba, Masatoshi Uehara, Pieter Abbeel, Sergey Levine

Optimizing new designs from a static dataset of prior designs and their scores, i.e., offline data-driven optimization (DDO), presents a range of challenges beyond those in standard prediction problems, since we need models that successfully predict the performance of new designs that are better than the best designs seen in the training set.
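To make the failure mode concrete, here is a minimal, hypothetical sketch of naive offline DDO (not the paper's method; the toy 1-D objective and every name below are illustrative): fit a surrogate model on logged (design, score) pairs, then optimize a new design against the surrogate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Offline dataset: 1-D designs x in [-1, 1] with noisy scores of f(x) = -(x - 2)^2.
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = -(X[:, 0] - 2.0) ** 2 + 0.1 * rng.normal(size=200)

# Quadratic surrogate fit by least squares: y ~ a*x^2 + b*x + c.
A = np.column_stack([X[:, 0] ** 2, X[:, 0], np.ones(len(X))])
a, b, c = np.linalg.lstsq(A, y, rcond=None)[0]

# Gradient ascent on the surrogate, starting from the best observed design.
x = float(X[np.argmax(y), 0])
for _ in range(200):
    x += 0.05 * (2.0 * a * x + b)  # d/dx of (a*x^2 + b*x + c)

# The optimized design drifts outside the data support [-1, 1]: the
# distribution-shift hazard that offline DDO has to handle.
print(x)
```

The surrogate happens to extrapolate well in this toy 1-D case; in realistic high-dimensional design spaces it typically does not, which is why the optimizer exploiting model errors on out-of-distribution designs is the central difficulty.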

IDQL: Implicit Q-Learning as an Actor-Critic Method with Diffusion Policies

1 code implementation • 20 Apr 2023 • Philippe Hansen-Estruch, Ilya Kostrikov, Michael Janner, Jakub Grudzien Kuba, Sergey Levine

In this paper, we reinterpret implicit Q-learning (IQL) as an actor-critic method by generalizing the critic objective and connecting it to a behavior-regularized implicit actor.
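As a hedged illustration of one ingredient behind IQL (a sketch, not the authors' code): IQL fits value estimates with an expectile regression loss, and for an expectile τ close to 1 the resulting estimate approaches a maximum over targets seen in the data.

```python
import numpy as np

def expectile_loss(u, tau):
    """Asymmetric squared loss |tau - 1(u < 0)| * u^2 on residuals u."""
    return np.abs(tau - (u < 0.0)) * u ** 2

def fit_expectile(targets, tau, lr=0.1, steps=2000):
    """Scalar v minimizing the expectile loss over residuals targets - v."""
    v = 0.0
    for _ in range(steps):
        u = targets - v
        grad = -2.0 * np.abs(tau - (u < 0.0)) * u  # dL/dv per sample
        v -= lr * grad.mean()
    return v

q = np.array([0.0, 1.0, 2.0, 10.0])
print(fit_expectile(q, 0.5))   # tau = 0.5 recovers the mean
print(fit_expectile(q, 0.99))  # tau near 1 skews toward max(q)
```

The skew toward large targets is what lets the critic approximate an in-distribution maximum over actions without ever querying out-of-distribution actions.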

Offline RL • Q-Learning

Heterogeneous-Agent Reinforcement Learning

1 code implementation • 19 Apr 2023 • Yifan Zhong, Jakub Grudzien Kuba, Xidong Feng, Siyi Hu, Jiaming Ji, Yaodong Yang

The necessity for cooperation among intelligent machines has popularised cooperative multi-agent reinforcement learning (MARL) in AI research.

LEMMA • Multi-agent Reinforcement Learning +1

Heterogeneous-Agent Mirror Learning: A Continuum of Solutions to Cooperative MARL

no code implementations • 2 Aug 2022 • Jakub Grudzien Kuba, Xidong Feng, Shiyao Ding, Hao Dong, Jun Wang, Yaodong Yang

The necessity for cooperation among intelligent machines has popularised cooperative multi-agent reinforcement learning (MARL) in the artificial intelligence (AI) research community.

Multi-agent Reinforcement Learning

Multi-Agent Reinforcement Learning is a Sequence Modeling Problem

1 code implementation • 30 May 2022 • Muning Wen, Jakub Grudzien Kuba, Runji Lin, Weinan Zhang, Ying Wen, Jun Wang, Yaodong Yang

In this paper, we introduce a novel architecture named Multi-Agent Transformer (MAT) that effectively casts cooperative multi-agent reinforcement learning (MARL) into sequence modeling (SM) problems, wherein the task is to map the agents' observation sequence to the agents' optimal action sequence.
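The sequence-modeling view can be sketched in a few lines (illustrative linear policies with random placeholder weights, not the MAT transformer): the joint action is decoded agent by agent, each agent conditioning on the observations plus the actions already chosen, exactly like tokens in a sequence.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, obs_dim, n_actions = 3, 4, 2

# Placeholder parameters standing in for a learned model.
W_obs = rng.normal(size=(n_agents, obs_dim, n_actions))
W_act = rng.normal(size=(n_agents, n_agents, n_actions, n_actions))

def decode_actions(obs):
    """Decode a joint action autoregressively, one agent at a time."""
    actions = []
    for i in range(n_agents):
        logits = obs[i] @ W_obs[i]
        for j, a_j in enumerate(actions):   # condition on earlier agents' actions
            logits = logits + W_act[i, j, a_j]
        actions.append(int(np.argmax(logits)))
    return actions

obs = rng.normal(size=(n_agents, obs_dim))
print(decode_actions(obs))
```

The point of the autoregressive factorization is that each agent's choice can account for its teammates' already-committed actions, rather than all agents acting simultaneously and independently.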

Decision Making • Multi-agent Reinforcement Learning +2

Understanding Value Decomposition Algorithms in Deep Cooperative Multi-Agent Reinforcement Learning

no code implementations • 10 Feb 2022 • Zehao Dou, Jakub Grudzien Kuba, Yaodong Yang

Value function decomposition is becoming a popular rule of thumb for scaling up multi-agent reinforcement learning (MARL) in cooperative games.
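A hedged sketch of the simplest member of this family (VDN-style additive factoring, used here only as an example of the decompositions the paper analyzes): the team value is a sum of per-agent utilities, so decentralised greedy actions recover the joint maximiser.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n_agents, n_actions = 3, 4

# Per-agent utilities; Q_tot(a) = sum_i Q[i, a_i] (additive decomposition).
Q = rng.normal(size=(n_agents, n_actions))

def q_tot(joint_action):
    return sum(Q[i, a] for i, a in enumerate(joint_action))

# Decentralised greedy choice: each agent maximises its own utility...
greedy = tuple(int(np.argmax(Q[i])) for i in range(n_agents))

# ...which matches exhaustive maximisation over all 4^3 joint actions.
best = max(itertools.product(range(n_actions), repeat=n_agents), key=q_tot)
print(greedy, best)
```

This consistency between local and joint maximisation is what makes the factored value usable with per-agent greedy execution; richer decompositions trade expressiveness against preserving it.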

Multi-agent Reinforcement Learning • reinforcement-learning +1

Mirror Learning: A Unifying Framework of Policy Optimisation

1 code implementation • 7 Jan 2022 • Jakub Grudzien Kuba, Christian Schroeder de Witt, Jakob Foerster

In this paper we introduce a novel theoretical framework, named Mirror Learning, which provides theoretical guarantees for a large class of algorithms, including TRPO and PPO.
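Schematically (notation simplified and hedged; see the paper for the exact operator and conditions), a mirror-learning update maximises the expected advantage penalised by a nonnegative drift functional $\mathfrak{D}$ that vanishes at the current policy:

```latex
\pi_{k+1} \in \operatorname*{arg\,max}_{\pi \in \mathcal{N}(\pi_k)}
  \mathbb{E}_{s \sim \beta_{\pi_k}}\!\left[
    \mathbb{E}_{a \sim \pi}\!\big[ A_{\pi_k}(s, a) \big]
    \;-\; \mathfrak{D}_{\pi_k}\big(\pi \mid s\big)
  \right]
```

TRPO's KL trust region and PPO's clipped objective can then be read as particular choices of the neighbourhood $\mathcal{N}$ and the drift $\mathfrak{D}$, which is how one framework covers both algorithms.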

Reinforcement Learning (RL)

Multi-Agent Constrained Policy Optimisation

3 code implementations • 6 Oct 2021 • Shangding Gu, Jakub Grudzien Kuba, Munning Wen, Ruiqing Chen, Ziyan Wang, Zheng Tian, Jun Wang, Alois Knoll, Yaodong Yang

In this work, we formulate the safe MARL problem as a constrained Markov game and solve it with policy optimisation methods.
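In schematic form (notation simplified and hedged from the paper's formulation), a constrained Markov game asks the agents to maximise the joint return subject to per-agent cost constraints:

```latex
\max_{\pi} \; J(\pi) = \mathbb{E}\Big[\textstyle\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, \mathbf{a}_t)\Big]
\quad \text{s.t.} \quad
J_{c_j}^{i}(\pi) = \mathbb{E}\Big[\textstyle\sum_{t=0}^{\infty} \gamma^{t}\, c_{j}^{i}(s_t, a_t^{i})\Big] \le d_{j}^{i}
\quad \forall\, i, j,
```

where $c_{j}^{i}$ is the $j$-th cost function of agent $i$ and $d_{j}^{i}$ the corresponding safety budget; policy optimisation then has to improve $J$ while keeping every constrained return below its budget.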

Multi-agent Reinforcement Learning • reinforcement-learning +1

Settling the Variance of Multi-Agent Policy Gradients

1 code implementation • NeurIPS 2021 • Jakub Grudzien Kuba, Muning Wen, Yaodong Yang, Linghui Meng, Shangding Gu, Haifeng Zhang, David Henry Mguni, Jun Wang

In multi-agent RL (MARL), although the PG theorem can be naturally extended, the effectiveness of multi-agent PG (MAPG) methods degrades as the variance of gradient estimates increases rapidly with the number of agents.
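A toy numerical illustration of that claim (not the paper's analysis; the i.i.d. Gaussian model is an assumption made here for simplicity): with a shared team reward, a naive per-agent gradient estimate multiplies its own score function by a reward that carries noise from every other agent, so its variance grows with the number of agents.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_variance(n_agents, n_samples=20_000):
    # Team reward = sum of i.i.d. unit-variance per-agent terms; the score
    # function of one agent is independent unit-variance noise.  The naive
    # gradient sample is score * reward, whose variance scales with n_agents.
    reward = rng.normal(size=(n_samples, n_agents)).sum(axis=1)
    score = rng.normal(size=n_samples)
    return np.var(score * reward)

print(grad_variance(2), grad_variance(20))  # grows roughly linearly in n
```

In this toy model the variance is approximately the number of agents, which mirrors the qualitative degradation of MAPG methods that the paper quantifies and then mitigates.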

Reinforcement Learning (RL) • Starcraft
