Sample efficiency is crucial for imitation learning methods to be applicable in real-world settings.
High sample complexity remains a barrier to the application of reinforcement learning (RL), particularly in multi-agent systems.
A key challenge on the path to developing agents that learn complex human-like behavior is the need to quickly and accurately quantify human-likeness.
26 Jan 2021 • William H. Guss, Mario Ynocente Castro, Sam Devlin, Brandon Houghton, Noboru Sean Kuno, Crissman Loomis, Stephanie Milani, Sharada Mohanty, Keisuke Nakata, Ruslan Salakhutdinov, John Schulman, Shinya Shiroshita, Nicholay Topin, Avinash Ummadisingu, Oriol Vinyals
Although deep reinforcement learning has led to breakthroughs in many difficult domains, these successes have required an ever-increasing number of samples, affording only a shrinking segment of the AI community access to their development.
We apply this methodology to build a suite of unit tests for the Overcooked-AI environment, and use this test suite to evaluate three proposals for improving robustness.
The optimal adaptive behaviour under uncertainty over the other agents' strategies w.r.t.
Policy gradient methods have become one of the most popular classes of algorithms for multi-agent reinforcement learning.
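The policy-gradient idea referenced above can be illustrated with a minimal sketch. This is a generic single-agent REINFORCE update for a tabular softmax policy, not the specific multi-agent method any listed paper proposes; all names here are illustrative:

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def reinforce_update(theta, episode, lr=0.1, gamma=0.99):
    """One REINFORCE step: theta += lr * sum_t G_t * grad log pi(a_t | s_t).

    theta has shape (n_states, n_actions); episode is a list of
    (state, action, reward) tuples.
    """
    grad = np.zeros_like(theta)
    G = 0.0
    # Iterate backwards to accumulate discounted returns G_t.
    for state, action, reward in reversed(episode):
        G = reward + gamma * G
        probs = softmax(theta[state])
        one_hot = np.zeros(len(probs))
        one_hot[action] = 1.0
        grad[state] += G * (one_hot - probs)  # gradient of log-softmax
    return theta + lr * grad
```

Multi-agent variants typically apply an update of this form per agent, with the key difficulties arising from the non-stationarity each agent induces for the others.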
We compare with literature from the research community that addresses the identified challenges, and conclude by highlighting promising directions for future research supporting agent creation in the games industry.
Variational inference (VI) plays an essential role in approximate Bayesian inference due to its computational efficiency and broad applicability.
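The core of VI is maximising the evidence lower bound (ELBO) over a tractable family of distributions. As a toy sketch, assuming a standard-normal target so the ELBO reduces to minus the KL divergence between q and p, a Monte Carlo estimate with the reparameterisation trick looks like:

```python
import numpy as np

def log_normal(x, mu, sigma):
    """Log density of N(mu, sigma^2) at x."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - (x - mu)**2 / (2 * sigma**2)

def elbo(mu, sigma, n_samples=10000, seed=0):
    """Monte Carlo ELBO for q = N(mu, sigma) against target p = N(0, 1).

    With a normalised target this equals -KL(q || p), so it is maximised
    (at 0) exactly when q matches p.
    """
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n_samples)
    z = mu + sigma * eps  # reparameterisation trick: z ~ q(mu, sigma)
    return np.mean(log_normal(z, 0.0, 1.0) - log_normal(z, mu, sigma))
```

Gradient ascent on this estimate with respect to `mu` and `sigma` is the basic recipe behind modern stochastic VI.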
Our second contribution is a unifying mathematical formulation for learning latent relations.
In many partially observable scenarios, Reinforcement Learning (RL) agents must rely on long-term memory in order to learn an optimal policy.
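A standard way to give an RL agent such memory is a recurrent policy, whose hidden state summarises the observation history. The following is a minimal hand-rolled sketch (a plain tanh RNN with a softmax head; the class and weight names are illustrative, not from any listed paper):

```python
import numpy as np

class RecurrentPolicy:
    """Minimal recurrent policy: the hidden state carries information
    across timesteps, giving the agent a form of long-term memory under
    partial observability."""

    def __init__(self, obs_dim, hidden_dim, n_actions, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.normal(0, 0.1, (hidden_dim, obs_dim))
        self.W_h = rng.normal(0, 0.1, (hidden_dim, hidden_dim))
        self.W_out = rng.normal(0, 0.1, (n_actions, hidden_dim))
        self.h = np.zeros(hidden_dim)

    def step(self, obs):
        # Update memory from the current observation and previous hidden state.
        self.h = np.tanh(self.W_in @ obs + self.W_h @ self.h)
        logits = self.W_out @ self.h
        e = np.exp(logits - logits.max())
        return e / e.sum()  # action distribution
```

In practice the recurrent cell is usually an LSTM or GRU trained end-to-end with the RL objective, but the memory mechanism is the same.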
Game-playing Evolutionary Algorithms, specifically Rolling Horizon Evolutionary Algorithms, have recently managed to beat the state of the art in win rate across many video games.
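The Rolling Horizon idea can be sketched as follows: evolve fixed-length action sequences, score each by rolling it out through a forward model, and execute only the first action of the best sequence before replanning. This is a generic sketch assuming a user-supplied `forward_model(state, action) -> (next_state, reward)`, not a specific published implementation:

```python
import numpy as np

def rhea_plan(forward_model, state, n_actions, horizon=10,
              pop_size=20, generations=30, mut_rate=0.2, seed=0):
    """Rolling Horizon Evolution (sketch): return the first action of the
    fittest evolved action sequence."""
    rng = np.random.default_rng(seed)
    pop = rng.integers(0, n_actions, (pop_size, horizon))

    def fitness(seq):
        s, total = state, 0.0
        for a in seq:
            s, r = forward_model(s, a)
            total += r
        return total

    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        order = np.argsort(scores)[::-1]
        elite = pop[order[: pop_size // 2]]  # keep the best half
        # Offspring: mutated copies of the elite.
        children = elite.copy()
        mask = rng.random(children.shape) < mut_rate
        children[mask] = rng.integers(0, n_actions, mask.sum())
        pop = np.vstack([elite, children])

    best = max(pop, key=fitness)
    return int(best[0])
```

At each real game tick the agent calls the planner again from the new state, which is what makes the horizon "rolling".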
We discuss those differences and propose modifications to existing regularization techniques in order to better adapt them to RL.
These are learning time, scalability, and decentralised coordination, i.e. no communication between the learning agents.
Learning in multi-agent scenarios is a fruitful research direction, but current approaches still show scalability problems in multiple games with general reward settings and different opponent types.
In 2016, 2017, and 2018 at the IEEE Conference on Computational Intelligence in Games, the authors of this paper ran a competition for agents that can play classic text-based adventure games.
Many deep reinforcement learning approaches use graphical state representations, which means that visually distinct games sharing the same underlying structure cannot effectively share knowledge.
Given the comparatively limited supply of professional data, a key question is thus whether mixed-rank match datasets can be used to create data-driven models which predict winners in professional matches and provide a simple in-game statistic for viewers and broadcasters.
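A simple baseline for such a data-driven win predictor is logistic regression over per-match feature differences. The sketch below is a generic illustration (the feature meanings and function names are assumptions, not the model from the work above):

```python
import numpy as np

def train_win_model(X, y, lr=0.1, epochs=500):
    """Logistic-regression win predictor (sketch).

    X: per-match feature differences (e.g. gold or kill advantage of team A);
    y: 1.0 if team A won, else 0.0.
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted win probability
        # Gradient of the cross-entropy loss.
        grad_w = X.T @ (p - y) / len(y)
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def win_prob(w, b, x):
    """Win probability for a single match's feature vector x."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))
```

A model trained on plentiful mixed-rank matches could then be evaluated on held-out professional matches to test whether the learned relationship transfers across skill levels.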