Search Results for author: Jakub Grudzien Kuba

Found 11 papers, 8 papers with code

Functional Graphical Models: Structure Enables Offline Data-Driven Optimization

no code implementations • 8 Jan 2024 • Jakub Grudzien Kuba, Masatoshi Uehara, Pieter Abbeel, Sergey Levine

Optimizing new designs from a static dataset of prior designs and their scores, i.e., offline data-driven optimization (DDO), presents a range of challenges beyond those in standard prediction problems, since we need models that successfully predict the performance of new designs that are better than the best designs seen in the training set.
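To make the failure mode concrete, here is a minimal, hypothetical sketch of naive offline DDO (not the paper's method; the toy 1-D objective and every name below are illustrative): fit a surrogate model on logged (design, score) pairs, then optimize a new design against the surrogate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Offline dataset: 1-D designs x in [-1, 1] with noisy scores of f(x) = -(x - 2)^2.
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = -(X[:, 0] - 2.0) ** 2 + 0.1 * rng.normal(size=200)

# Quadratic surrogate fit by least squares: y ~ a*x^2 + b*x + c.
A = np.column_stack([X[:, 0] ** 2, X[:, 0], np.ones(len(X))])
a, b, c = np.linalg.lstsq(A, y, rcond=None)[0]

# Gradient ascent on the surrogate, starting from the best observed design.
x = float(X[np.argmax(y), 0])
for _ in range(200):
    x += 0.05 * (2.0 * a * x + b)  # d/dx of (a*x^2 + b*x + c)

# The optimized design drifts outside the data support [-1, 1]: the
# distribution-shift hazard that offline DDO has to handle.
print(x)
```

The surrogate happens to extrapolate well in this toy 1-D case; in realistic high-dimensional design spaces it typically does not, which is why the optimizer exploiting model errors on out-of-distribution designs is the central difficulty.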

IDQL: Implicit Q-Learning as an Actor-Critic Method with Diffusion Policies

1 code implementation • 20 Apr 2023 • Philippe Hansen-Estruch, Ilya Kostrikov, Michael Janner, Jakub Grudzien Kuba, Sergey Levine

In this paper, we reinterpret implicit Q-learning (IQL) as an actor-critic method by generalizing the critic objective and connecting it to a behavior-regularized implicit actor.
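As a hedged illustration of one ingredient behind IQL (a sketch, not the authors' code): IQL fits value estimates with an expectile regression loss, and for an expectile τ close to 1 the resulting estimate approaches a maximum over targets seen in the data.

```python
import numpy as np

def expectile_loss(u, tau):
    """Asymmetric squared loss |tau - 1(u < 0)| * u^2 on residuals u."""
    return np.abs(tau - (u < 0.0)) * u ** 2

def fit_expectile(targets, tau, lr=0.1, steps=2000):
    """Scalar v minimizing the expectile loss over residuals targets - v."""
    v = 0.0
    for _ in range(steps):
        u = targets - v
        grad = -2.0 * np.abs(tau - (u < 0.0)) * u  # dL/dv per sample
        v -= lr * grad.mean()
    return v

q = np.array([0.0, 1.0, 2.0, 10.0])
print(fit_expectile(q, 0.5))   # tau = 0.5 recovers the mean
print(fit_expectile(q, 0.99))  # tau near 1 skews toward max(q)
```

The skew toward large targets is what lets the critic approximate an in-distribution maximum over actions without ever querying out-of-distribution actions.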

Offline RL • Q-Learning

Heterogeneous-Agent Reinforcement Learning

1 code implementation • 19 Apr 2023 • Yifan Zhong, Jakub Grudzien Kuba, Xidong Feng, Siyi Hu, Jiaming Ji, Yaodong Yang

The necessity for cooperation among intelligent machines has popularised cooperative multi-agent reinforcement learning (MARL) in AI research.

LEMMA • Multi-agent Reinforcement Learning +1

Heterogeneous-Agent Mirror Learning: A Continuum of Solutions to Cooperative MARL

no code implementations • 2 Aug 2022 • Jakub Grudzien Kuba, Xidong Feng, Shiyao Ding, Hao Dong, Jun Wang, Yaodong Yang

The necessity for cooperation among intelligent machines has popularised cooperative multi-agent reinforcement learning (MARL) in the artificial intelligence (AI) research community.

Multi-agent Reinforcement Learning

Multi-Agent Reinforcement Learning is a Sequence Modeling Problem

1 code implementation • 30 May 2022 • Muning Wen, Jakub Grudzien Kuba, Runji Lin, Weinan Zhang, Ying Wen, Jun Wang, Yaodong Yang

In this paper, we introduce a novel architecture named Multi-Agent Transformer (MAT) that effectively casts cooperative multi-agent reinforcement learning (MARL) into sequence modeling (SM) problems, wherein the task is to map the agents' observation sequence to the agents' optimal action sequence.
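The sequence-modeling view can be sketched in a few lines (illustrative linear policies with random placeholder weights, not the MAT transformer): the joint action is decoded agent by agent, each agent conditioning on the observations plus the actions already chosen, exactly like tokens in a sequence.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, obs_dim, n_actions = 3, 4, 2

# Placeholder parameters standing in for a learned model.
W_obs = rng.normal(size=(n_agents, obs_dim, n_actions))
W_act = rng.normal(size=(n_agents, n_agents, n_actions, n_actions))

def decode_actions(obs):
    """Decode a joint action autoregressively, one agent at a time."""
    actions = []
    for i in range(n_agents):
        logits = obs[i] @ W_obs[i]
        for j, a_j in enumerate(actions):   # condition on earlier agents' actions
            logits = logits + W_act[i, j, a_j]
        actions.append(int(np.argmax(logits)))
    return actions

obs = rng.normal(size=(n_agents, obs_dim))
print(decode_actions(obs))
```

The point of the autoregressive factorization is that each agent's choice can account for its teammates' already-committed actions, rather than all agents acting simultaneously and independently.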

Decision Making • Multi-agent Reinforcement Learning +2

Understanding Value Decomposition Algorithms in Deep Cooperative Multi-Agent Reinforcement Learning

no code implementations • 10 Feb 2022 • Zehao Dou, Jakub Grudzien Kuba, Yaodong Yang

Value function decomposition is becoming a popular rule of thumb for scaling up multi-agent reinforcement learning (MARL) in cooperative games.
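A hedged sketch of the simplest member of this family (VDN-style additive factoring, used here only as an example of the decompositions the paper analyzes): the team value is a sum of per-agent utilities, so decentralised greedy actions recover the joint maximiser.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n_agents, n_actions = 3, 4

# Per-agent utilities; Q_tot(a) = sum_i Q[i, a_i] (additive decomposition).
Q = rng.normal(size=(n_agents, n_actions))

def q_tot(joint_action):
    return sum(Q[i, a] for i, a in enumerate(joint_action))

# Decentralised greedy choice: each agent maximises its own utility...
greedy = tuple(int(np.argmax(Q[i])) for i in range(n_agents))

# ...which matches exhaustive maximisation over all 4^3 joint actions.
best = max(itertools.product(range(n_actions), repeat=n_agents), key=q_tot)
print(greedy, best)
```

This consistency between local and joint maximisation is what makes the factored value usable with per-agent greedy execution; richer decompositions trade expressiveness against preserving it.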

Multi-agent Reinforcement Learning • reinforcement-learning +1

Mirror Learning: A Unifying Framework of Policy Optimisation

1 code implementation • 7 Jan 2022 • Jakub Grudzien Kuba, Christian Schroeder de Witt, Jakob Foerster

In this paper we introduce a novel theoretical framework, named Mirror Learning, which provides theoretical guarantees for a large class of algorithms, including TRPO and PPO.
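Schematically (notation simplified and hedged; see the paper for the exact operator and conditions), a mirror-learning update maximises the expected advantage penalised by a nonnegative drift functional $\mathfrak{D}$ that vanishes at the current policy:

```latex
\pi_{k+1} \in \operatorname*{arg\,max}_{\pi \in \mathcal{N}(\pi_k)}
  \mathbb{E}_{s \sim \beta_{\pi_k}}\!\left[
    \mathbb{E}_{a \sim \pi}\!\big[ A_{\pi_k}(s, a) \big]
    \;-\; \mathfrak{D}_{\pi_k}\big(\pi \mid s\big)
  \right]
```

TRPO's KL trust region and PPO's clipped objective can then be read as particular choices of the neighbourhood $\mathcal{N}$ and the drift $\mathfrak{D}$, which is how one framework covers both algorithms.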

Reinforcement Learning (RL)

Multi-Agent Constrained Policy Optimisation

3 code implementations • 6 Oct 2021 • Shangding Gu, Jakub Grudzien Kuba, Munning Wen, Ruiqing Chen, Ziyan Wang, Zheng Tian, Jun Wang, Alois Knoll, Yaodong Yang

In this work, we formulate the safe MARL problem as a constrained Markov game and solve it with policy optimisation methods.
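In schematic form (notation simplified and hedged from the paper's formulation), a constrained Markov game asks the agents to maximise the joint return subject to per-agent cost constraints:

```latex
\max_{\pi} \; J(\pi) = \mathbb{E}\Big[\textstyle\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, \mathbf{a}_t)\Big]
\quad \text{s.t.} \quad
J_{c_j}^{i}(\pi) = \mathbb{E}\Big[\textstyle\sum_{t=0}^{\infty} \gamma^{t}\, c_{j}^{i}(s_t, a_t^{i})\Big] \le d_{j}^{i}
\quad \forall\, i, j,
```

where $c_{j}^{i}$ is the $j$-th cost function of agent $i$ and $d_{j}^{i}$ the corresponding safety budget; policy optimisation then has to improve $J$ while keeping every constrained return below its budget.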

Multi-agent Reinforcement Learning • reinforcement-learning +1

Settling the Variance of Multi-Agent Policy Gradients

1 code implementation • NeurIPS 2021 • Jakub Grudzien Kuba, Muning Wen, Yaodong Yang, Linghui Meng, Shangding Gu, Haifeng Zhang, David Henry Mguni, Jun Wang

In multi-agent RL (MARL), although the PG theorem can be naturally extended, the effectiveness of multi-agent PG (MAPG) methods degrades as the variance of gradient estimates increases rapidly with the number of agents.
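A toy numerical illustration of that claim (not the paper's analysis; the i.i.d. Gaussian model is an assumption made here for simplicity): with a shared team reward, a naive per-agent gradient estimate multiplies its own score function by a reward that carries noise from every other agent, so its variance grows with the number of agents.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_variance(n_agents, n_samples=20_000):
    # Team reward = sum of i.i.d. unit-variance per-agent terms; the score
    # function of one agent is independent unit-variance noise.  The naive
    # gradient sample is score * reward, whose variance scales with n_agents.
    reward = rng.normal(size=(n_samples, n_agents)).sum(axis=1)
    score = rng.normal(size=n_samples)
    return np.var(score * reward)

print(grad_variance(2), grad_variance(20))  # grows roughly linearly in n
```

In this toy model the variance is approximately the number of agents, which mirrors the qualitative degradation of MAPG methods that the paper quantifies and then mitigates.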

Reinforcement Learning (RL) • Starcraft
