Search Results for author: Peter Sunehag

Found 12 papers, 4 papers with code

A Review of Cooperation in Multi-agent Learning

no code implementations · 8 Dec 2023 · Yali Du, Joel Z. Leibo, Usman Islam, Richard Willis, Peter Sunehag

Cooperation in multi-agent learning (MAL) is a topic at the intersection of numerous disciplines, including game theory, economics, social sciences, and evolutionary biology.

Decision Making

Melting Pot 2.0

2 code implementations · 24 Nov 2022 · John P. Agapiou, Alexander Sasha Vezhnevets, Edgar A. Duéñez-Guzmán, Jayd Matyas, Yiran Mao, Peter Sunehag, Raphael Köster, Udari Madhushani, Kavya Kopparapu, Ramona Comanescu, DJ Strouse, Michael B. Johanson, Sukhdeep Singh, Julia Haas, Igor Mordatch, Dean Mobbs, Joel Z. Leibo

Melting Pot is a research tool developed to facilitate work on multi-agent artificial intelligence, and provides an evaluation protocol that measures generalization to novel social partners in a set of canonical test scenarios.
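The protocol can be pictured as scoring a trained "focal" population against held-out background partners in each test scenario, so that the score reflects generalization to novel social partners. Below is a minimal sketch of that loop; all names (Scenario, run_episode, and so on) are illustrative stand-ins, not the actual Melting Pot API.

```python
# Hedged sketch of the Melting Pot evaluation idea: a trained focal population
# is scored against held-out background partners in canonical test scenarios.
# The Scenario class and run_episode callable are illustrative assumptions,
# not the real Melting Pot interface.
from dataclasses import dataclass
from typing import Callable, Dict, Sequence


@dataclass
class Scenario:
    name: str
    background_agents: Sequence[Callable]  # pretrained, held-out partners
    # Runs one episode with the given joint population and returns each
    # agent's episode return, in the order the agents were passed in.
    run_episode: Callable[[Sequence[Callable]], Sequence[float]]


def evaluate_focal_population(focal_agents: Sequence[Callable],
                              scenarios: Sequence[Scenario],
                              episodes: int = 10) -> Dict[str, float]:
    """Mean per-capita focal return in each scenario."""
    scores: Dict[str, float] = {}
    for scenario in scenarios:
        per_episode = []
        for _ in range(episodes):
            # The focal agents never trained with these background partners,
            # so this measures generalization to novel social partners.
            returns = scenario.run_episode(
                list(focal_agents) + list(scenario.background_agents))
            focal_returns = returns[:len(focal_agents)]  # focal agents listed first
            per_episode.append(sum(focal_returns) / len(focal_agents))
        scores[scenario.name] = sum(per_episode) / episodes
    return scores
```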

Artificial Life, Navigate

Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot

no code implementations · 14 Jul 2021 · Joel Z. Leibo, Edgar Duéñez-Guzmán, Alexander Sasha Vezhnevets, John P. Agapiou, Peter Sunehag, Raphael Koster, Jayd Matyas, Charles Beattie, Igor Mordatch, Thore Graepel

Existing evaluation suites for multi-agent reinforcement learning (MARL) do not assess generalization to novel situations as their primary objective (unlike supervised-learning benchmarks).

Multi-agent Reinforcement Learning, reinforcement-learning, +1

Learning to Incentivize Other Learning Agents

2 code implementations · NeurIPS 2020 · Jiachen Yang, Ang Li, Mehrdad Farajtabar, Peter Sunehag, Edward Hughes, Hongyuan Zha

The challenge of developing powerful and general Reinforcement Learning (RL) agents has received increasing attention in recent years.

General Reinforcement Learning, Reinforcement Learning (RL)

Malthusian Reinforcement Learning

no code implementations · 17 Dec 2018 · Joel Z. Leibo, Julien Perolat, Edward Hughes, Steven Wheelwright, Adam H. Marblestone, Edgar Duéñez-Guzmán, Peter Sunehag, Iain Dunning, Thore Graepel

Here we explore a new algorithmic framework for multi-agent reinforcement learning, called Malthusian reinforcement learning, which extends self-play to include fitness-linked population size dynamics that drive ongoing innovation.
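The fitness-linked population dynamics can be illustrated with a toy generational loop in which each lineage's share of the population grows or shrinks with the reward it accumulates. The proportional-growth rule below is an illustrative assumption for the sketch, not the exact update used in the paper.

```python
import random

# Toy illustration of a Malthusian-style loop: the population size of each
# "species" (policy lineage) is linked to its accumulated fitness (reward).
# The proportional-growth update here is an illustrative assumption.

def run_generations(species_fitness_fn, n_species=4, n_generations=20,
                    total_population=100):
    sizes = [total_population // n_species] * n_species
    for gen in range(n_generations):
        # Evaluate each species; in the full framework fitness would come
        # from multi-agent self-play episodes.
        fitness = [species_fitness_fn(s, sizes) for s in range(n_species)]
        # Malthusian step: next-generation shares are proportional to fitness,
        # so more successful species crowd out less successful ones.
        total_fitness = sum(fitness) or 1e-8
        sizes = [max(1, round(total_population * f / total_fitness))
                 for f in fitness]
        print(f"generation {gen}: sizes={sizes}")
    return sizes


if __name__ == "__main__":
    # Dummy fitness: species 0 benefits from being rare (density dependence).
    run_generations(lambda s, sizes: random.random()
                    + (1.0 / sizes[s] if s == 0 else 0.5))
```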

Multi-agent Reinforcement Learning, reinforcement-learning, +1

Deep Reinforcement Learning in Large Discrete Action Spaces

2 code implementations · 24 Dec 2015 · Gabriel Dulac-Arnold, Richard Evans, Hado van Hasselt, Peter Sunehag, Timothy Lillicrap, Jonathan Hunt, Timothy Mann, Theophane Weber, Thomas Degris, Ben Coppin

Being able to reason in an environment with a large number of discrete actions is essential to bringing reinforcement learning to a larger class of problems.

Recommendation Systems, reinforcement-learning, +1

Deep Reinforcement Learning with Attention for Slate Markov Decision Processes with High-Dimensional States and Actions

no code implementations · 3 Dec 2015 · Peter Sunehag, Richard Evans, Gabriel Dulac-Arnold, Yori Zwols, Daniel Visentin, Ben Coppin

Further, we use deep deterministic policy gradients to learn a policy that, for each position of the slate, guides attention towards the part of the action space in which the value is highest, and we evaluate actions only in this area.
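A minimal sketch of this per-position attention idea follows, with stand-in (random) actor, critic, and item embeddings in place of the paper's trained networks; only the nearest candidate items around each proposed attention point are evaluated.

```python
import numpy as np

# Sketch of the attention idea described above: for each slate position a
# DDPG-style actor proposes a point in a continuous item-embedding space,
# and the Q-function is evaluated only on the few nearest candidate items.
# The embedding, actor, and critic here are random stand-ins.

rng = np.random.default_rng(0)
n_items, embed_dim, slate_size, k = 10_000, 16, 3, 50

item_embeddings = rng.normal(size=(n_items, embed_dim))

def actor(state, position):
    """Stand-in actor: returns an attention point in embedding space."""
    return rng.normal(size=embed_dim)

def q_value(state, item_id, position):
    """Stand-in critic: scores one item for one slate position."""
    return float(rng.normal())

def build_slate(state):
    slate = []
    for position in range(slate_size):
        proto = actor(state, position)
        # Attend to the region around the proposed point: take the k nearest items...
        dists = np.linalg.norm(item_embeddings - proto, axis=1)
        candidates = np.argpartition(dists, k)[:k]
        # ...and only these candidates are evaluated by the critic.
        best = max(candidates, key=lambda i: q_value(state, i, position))
        slate.append(int(best))
    return slate

print(build_slate(state=None))
```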

Q-Learning, Recommendation Systems

The Sample-Complexity of General Reinforcement Learning

no code implementations · 22 Aug 2013 · Tor Lattimore, Marcus Hutter, Peter Sunehag

We present a new algorithm for general reinforcement learning where the true environment is known to belong to a finite class of N arbitrary models.

General Reinforcement Learning, reinforcement-learning, +1

On Nicod's Condition, Rules of Induction and the Raven Paradox

no code implementations · 12 Jul 2013 · Hadi Mohasel Afshar, Peter Sunehag

For us, "(objective) background knowledge" is restricted to information that can be expressed as probability events.

Concentration and Confidence for Discrete Bayesian Sequence Predictors

no code implementations · 29 Jun 2013 · Tor Lattimore, Marcus Hutter, Peter Sunehag

We prove tight high-probability bounds on the cumulative error, which is measured in terms of the Kullback-Leibler (KL) divergence.
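As context for the quantity being bounded (the definitions below are the standard ones, not quoted from the paper): for a true measure $\mu$ and a discrete Bayesian mixture predictor $\xi$ that assigns $\mu$ prior weight $w_\mu$, the cumulative KL error and its classical expectation bound are

```latex
% Cumulative KL error of a Bayesian mixture predictor \xi against the true
% measure \mu, with the classical expectation bound.
\[
  D_n \;=\; \sum_{t=1}^{n}
  \mathrm{KL}\!\left(\mu(\cdot \mid x_{<t}) \,\middle\|\, \xi(\cdot \mid x_{<t})\right),
  \qquad
  \mathbb{E}_{\mu}\!\left[D_n\right] \;\le\; \ln w_{\mu}^{-1}.
\]
```

Per the snippet, the paper's contribution is tight high-probability rather than expectation bounds on this cumulative error.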
