Search Results for author: Shayegan Omidshafiei

Found 23 papers, 7 papers with code

Fast computation of Nash Equilibria in Imperfect Information Games

no code implementations ICML 2020 Remi Munos, Julien Perolat, Jean-Baptiste Lespiau, Mark Rowland, Bart De Vylder, Marc Lanctot, Finbarr Timbers, Daniel Hennes, Shayegan Omidshafiei, Audrunas Gruslys, Mohammad Gheshlaghi Azar, Edward Lockhart, Karl Tuyls

We introduce and analyze a class of algorithms, called Mirror Ascent against an Improved Opponent (MAIO), for computing Nash equilibria in two-player zero-sum games, both in normal form and in sequential imperfect information form.

Beyond Rewards: a Hierarchical Perspective on Offline Multiagent Behavioral Analysis

no code implementations 17 Jun 2022 Shayegan Omidshafiei, Andrei Kapishnikov, Yannick Assogba, Lucas Dixon, Been Kim

Each year, expert-level performance is attained in increasingly-complex multiagent domains, notable examples including Go, Poker, and StarCraft II.

Starcraft Starcraft II +1

Evolutionary Dynamics and $\Phi$-Regret Minimization in Games

no code implementations 28 Jun 2021 Georgios Piliouras, Mark Rowland, Shayegan Omidshafiei, Romuald Elie, Daniel Hennes, Jerome Connor, Karl Tuyls

Importantly, $\Phi$-regret enables learning agents to consider deviations from and to mixed strategies, generalizing several existing notions of regret such as external, internal, and swap regret, and thus broadening the insights gained from regret-based analysis of learning algorithms.

online learning

From Motor Control to Team Play in Simulated Humanoid Football

1 code implementation 25 May 2021 SiQi Liu, Guy Lever, Zhe Wang, Josh Merel, S. M. Ali Eslami, Daniel Hennes, Wojciech M. Czarnecki, Yuval Tassa, Shayegan Omidshafiei, Abbas Abdolmaleki, Noah Y. Siegel, Leonard Hasenclever, Luke Marris, Saran Tunyasuvunakool, H. Francis Song, Markus Wulfmeier, Paul Muller, Tuomas Haarnoja, Brendan D. Tracey, Karl Tuyls, Thore Graepel, Nicolas Heess

In a sequence of stages, players first learn to control a fully articulated body to perform realistic, human-like movements such as running and turning; they then acquire mid-level football skills such as dribbling and shooting; finally, they develop awareness of others and play as a team, bridging the gap between low-level motor control at a timescale of milliseconds, and coordinated goal-directed behaviour as a team at the timescale of tens of seconds.

Decision Making Imitation Learning +2

Navigating the Landscape of Multiplayer Games

no code implementations 4 May 2020 Shayegan Omidshafiei, Karl Tuyls, Wojciech M. Czarnecki, Francisco C. Santos, Mark Rowland, Jerome Connor, Daniel Hennes, Paul Muller, Julien Perolat, Bart De Vylder, Audrunas Gruslys, Remi Munos

Multiplayer games have long been used as testbeds in artificial intelligence research, aptly referred to as the Drosophila of artificial intelligence.

Multiagent Evaluation under Incomplete Information

1 code implementation NeurIPS 2019 Mark Rowland, Shayegan Omidshafiei, Karl Tuyls, Julien Perolat, Michal Valko, Georgios Piliouras, Remi Munos

This paper investigates the evaluation of learned multiagent strategies in the incomplete information setting, which plays a critical role in ranking and training of agents.

Neural Replicator Dynamics

1 code implementation 1 Jun 2019 Daniel Hennes, Dustin Morrill, Shayegan Omidshafiei, Remi Munos, Julien Perolat, Marc Lanctot, Audrunas Gruslys, Jean-Baptiste Lespiau, Paavo Parmas, Edgar Duenez-Guzman, Karl Tuyls

Policy gradient and actor-critic algorithms form the basis of many commonly used training techniques in deep reinforcement learning.

Policy Gradient Methods

Policy Distillation and Value Matching in Multiagent Reinforcement Learning

no code implementations 15 Mar 2019 Samir Wadhwania, Dong-Ki Kim, Shayegan Omidshafiei, Jonathan P. How

Multiagent reinforcement learning (MARL) algorithms have been demonstrated on complex tasks that require the coordination of a team of multiple agents to complete.

reinforcement-learning

α-Rank: Multi-Agent Evaluation by Evolution

1 code implementation 4 Mar 2019 Shayegan Omidshafiei, Christos Papadimitriou, Georgios Piliouras, Karl Tuyls, Mark Rowland, Jean-Baptiste Lespiau, Wojciech M. Czarnecki, Marc Lanctot, Julien Perolat, Remi Munos

We introduce α-Rank, a principled evolutionary dynamics methodology, for the evaluation and ranking of agents in large-scale multi-agent interactions, grounded in a novel dynamical game-theoretic solution concept called Markov-Conley chains (MCCs).

Mathematical Proofs

Learning to Teach in Cooperative Multiagent Reinforcement Learning

no code implementations 20 May 2018 Shayegan Omidshafiei, Dong-Ki Kim, Miao Liu, Gerald Tesauro, Matthew Riemer, Christopher Amato, Murray Campbell, Jonathan P. How

The problem of teaching to improve agent learning has been investigated by prior works, but these approaches make assumptions that prevent application of teaching to general multiagent problems, or require domain expertise for problems they can apply to.

reinforcement-learning

Crossmodal Attentive Skill Learner

no code implementations 28 Nov 2017 Shayegan Omidshafiei, Dong-Ki Kim, Jason Pazis, Jonathan P. How

This paper presents the Crossmodal Attentive Skill Learner (CASL), integrated with the recently-introduced Asynchronous Advantage Option-Critic (A2OC) architecture [Harb et al., 2017] to enable hierarchical reinforcement learning across multiple sensory inputs.

Atari Games Hierarchical Reinforcement Learning +1

Learning for Multi-robot Cooperation in Partially Observable Stochastic Environments with Macro-actions

no code implementations 24 Jul 2017 Miao Liu, Kavinayan Sivakumar, Shayegan Omidshafiei, Christopher Amato, Jonathan P. How

We implement two variants of multi-robot Search and Rescue (SAR) domains (with and without obstacles) on hardware to demonstrate the learned policies can effectively control a team of distributed robots to cooperate in a partially observable stochastic environment.

Decision Making Decision Making Under Uncertainty

Hierarchical Bayesian Noise Inference for Robust Real-time Probabilistic Object Classification

no code implementations 3 May 2016 Shayegan Omidshafiei, Brett T. Lopez, Jonathan P. How, John Vian

This paper presents an approach for filtering sequences of object classification probabilities using online modeling of the noise characteristics of the classifier outputs.

Classification Decision Making +3

Decentralized Control of Partially Observable Markov Decision Processes using Belief Space Macro-actions

no code implementations 20 Feb 2015 Shayegan Omidshafiei, Ali-akbar Agha-mohammadi, Christopher Amato, Jonathan P. How

To allow for a high-level representation that is natural for multi-robot problems and scalable to large discrete and continuous problems, this paper extends the Dec-POMDP model to the decentralized partially observable semi-Markov decision process (Dec-POSMDP).

Decision Making
