no code implementations • ICML 2020 • Alexander Vezhnevets, Yuhuai Wu, Maria Eckstein, Rémi Leblond, Joel Z. Leibo
This paper investigates generalisation in multi-agent games, where the generality of the agent can be evaluated by playing against opponents it hasn't seen during training.
Multi-agent Reinforcement Learning • reinforcement-learning • +1
no code implementations • 6 Sep 2023 • Aliya Amirova, Theodora Fteropoulli, Nafiso Ahmed, Martin R. Cowie, Joel Z. Leibo
Here we consider whether artificial "silicon participants" generated by LLMs may be productively studied using qualitative methods aiming to produce insights that could generalize to real human populations.
no code implementations • 29 May 2023 • Yiran Mao, Madeline G. Reinecke, Markus Kunesch, Edgar A. Duéñez-Guzmán, Ramona Comanescu, Julia Haas, Joel Z. Leibo
Is it possible to evaluate the moral cognition of complex artificial agents?
no code implementations • 1 May 2023 • Udari Madhushani, Kevin R. McKee, John P. Agapiou, Joel Z. Leibo, Richard Everett, Thomas Anthony, Edward Hughes, Karl Tuyls, Edgar A. Duéñez-Guzmán
In social psychology, Social Value Orientation (SVO) describes an individual's propensity to allocate resources between themself and others.
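As a concrete illustration of the construct described above: in social psychology, an individual's SVO is often summarized as the angle of their mean allocation to self versus other. A minimal sketch of that common formulation (the exact reward transformation used in this paper may differ):

```python
import math

def svo_angle(reward_to_self, reward_to_other):
    """Illustrative Social Value Orientation angle (degrees).

    A common convention from the social-psychology literature: the angle of
    the allocation vector (self, other). ~0 degrees is purely selfish,
    ~45 degrees is perfectly prosocial, negative angles are competitive.
    This is a generic formulation, not necessarily the exact reward
    transformation used in the paper above.
    """
    return math.degrees(math.atan2(reward_to_other, reward_to_self))

# Hypothetical allocations: an agent that weighs its partner's payoff
# about half as much as its own.
print(svo_angle(reward_to_self=10.0, reward_to_other=5.0))  # ~26.6 degrees
```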
no code implementations • 2 Feb 2023 • Peter Sunehag, Alexander Sasha Vezhnevets, Edgar Duéñez-Guzmán, Igor Mordatch, Joel Z. Leibo
The algorithm we propose consists of two parts: an agent architecture and a learning rule.
1 code implementation • 24 Nov 2022 • John P. Agapiou, Alexander Sasha Vezhnevets, Edgar A. Duéñez-Guzmán, Jayd Matyas, Yiran Mao, Peter Sunehag, Raphael Köster, Udari Madhushani, Kavya Kopparapu, Ramona Comanescu, DJ Strouse, Michael B. Johanson, Sukhdeep Singh, Julia Haas, Igor Mordatch, Dean Mobbs, Joel Z. Leibo
Melting Pot is a research tool developed to facilitate work on multi-agent artificial intelligence, and provides an evaluation protocol that measures generalization to novel social partners in a set of canonical test scenarios.
no code implementations • 10 Aug 2022 • Anil Yaman, Joel Z. Leibo, Giovanni Iacca, Sang Wan Lee
Here we show that by introducing a model of social norms, which we regard as emergent patterns of decentralized social sanctioning, it becomes possible for groups of self-interested individuals to learn a productive division of labor involving all critical roles.
no code implementations • 13 May 2022 • Michael Bradley Johanson, Edward Hughes, Finbarr Timbers, Joel Z. Leibo
Agents learn to produce resources in a spatially complex world, trade them with one another, and consume those that they prefer.
Multi-agent Reinforcement Learning • reinforcement-learning • +1
no code implementations • 5 Jan 2022 • Kavya Kopparapu, Edgar A. Duéñez-Guzmán, Jayd Matyas, Alexander Sasha Vezhnevets, John P. Agapiou, Kevin R. McKee, Richard Everett, Janusz Marecki, Joel Z. Leibo, Thore Graepel
A key challenge in the study of multiagent cooperation is the need for individual agents not only to cooperate effectively, but to decide with whom to cooperate.
no code implementations • 21 Oct 2021 • Edgar A. Duéñez-Guzmán, Kevin R. McKee, Yiran Mao, Ben Coppin, Silvia Chiappa, Alexander Sasha Vezhnevets, Michiel A. Bakker, Yoram Bachrach, Suzanne Sadedin, William Isaac, Karl Tuyls, Joel Z. Leibo
Undesired bias afflicts both human and algorithmic decision making, and may be especially prevalent when information processing trade-offs incentivize the use of heuristics.
no code implementations • 14 Jul 2021 • Joel Z. Leibo, Edgar Duéñez-Guzmán, Alexander Sasha Vezhnevets, John P. Agapiou, Peter Sunehag, Raphael Koster, Jayd Matyas, Charles Beattie, Igor Mordatch, Thore Graepel
Existing evaluation suites for multi-agent reinforcement learning (MARL) do not assess generalization to novel situations as their primary objective (unlike supervised-learning benchmarks).
Multi-agent Reinforcement Learning • reinforcement-learning • +1
1 code implementation • 18 Jun 2021 • Anil Yaman, Nicolas Bredeche, Onur Çaylak, Joel Z. Leibo, Sang Wan Lee
Based on these findings, we hypothesized that meta-control of individual and social learning strategies provides effective and sample-efficient learning in volatile and uncertain environments.
no code implementations • 8 Mar 2021 • Kevin R. McKee, Edward Hughes, Tina O. Zhu, Martin J. Chadwick, Raphael Koster, Antonio Garcia Castaneda, Charlie Beattie, Thore Graepel, Matt Botvinick, Joel Z. Leibo
Collective action demands that individuals efficiently coordinate how much, where, and when to cooperate.
Multi-agent Reinforcement Learning • reinforcement-learning • +1
no code implementations • 16 Feb 2021 • Kevin R. McKee, Joel Z. Leibo, Charlie Beattie, Richard Everett
Generalization is a major challenge for multi-agent reinforcement learning.
Multi-agent Reinforcement Learning • reinforcement-learning • +1
no code implementations • 13 Feb 2021 • Michiel A. Bakker, Richard Everett, Laura Weidinger, Iason Gabriel, William S. Isaac, Joel Z. Leibo, Edward Hughes
Such systems have local incentives for individuals, whose behavior has an impact on the global outcome for the group.
no code implementations • 15 Dec 2020 • Allan Dafoe, Edward Hughes, Yoram Bachrach, Tantum Collins, Kevin R. McKee, Joel Z. Leibo, Kate Larson, Thore Graepel
We see opportunity to more explicitly focus on the problem of cooperation, to construct unified theory and vocabulary, and to build bridges with adjacent communities working on cooperation, including in the natural, social, and behavioural sciences.
1 code implementation • 13 Nov 2020 • Charles Beattie, Thomas Köppe, Edgar A. Duéñez-Guzmán, Joel Z. Leibo
We present DeepMind Lab2D, a scalable environment simulator for artificial intelligence research that facilitates researcher-led experimentation with environment design.
no code implementations • ICLR 2019 • Yoram Bachrach, Richard Everett, Edward Hughes, Angeliki Lazaridou, Joel Z. Leibo, Marc Lanctot, Michael Johanson, Wojciech M. Czarnecki, Thore Graepel
When autonomous agents interact in the same environment, they must often cooperate to achieve their goals.
no code implementations • 27 Feb 2020 • Edward Hughes, Thomas W. Anthony, Tom Eccles, Joel Z. Leibo, David Balduzzi, Yoram Bachrach
Here we argue that a systematic study of many-player zero-sum games is a crucial element of artificial intelligence research.
Multi-agent Reinforcement Learning • reinforcement-learning • +1
no code implementations • 6 Feb 2020 • Kevin R. McKee, Ian Gemp, Brian McWilliams, Edgar A. Duéñez-Guzmán, Edward Hughes, Joel Z. Leibo
Recent research on reinforcement learning in pure-conflict and pure-common interest games has emphasized the importance of population heterogeneity.
2 code implementations • 25 Jan 2020 • Raphael Köster, Dylan Hadfield-Menell, Gillian K. Hadfield, Joel Z. Leibo
How can societies learn to enforce and comply with social norms?
no code implementations • ICLR 2020 • David Balduzzi, Wojciech M. Czarnecki, Thomas W. Anthony, Ian M Gemp, Edward Hughes, Joel Z. Leibo, Georgios Piliouras, Thore Graepel
With the success of modern machine learning, it is becoming increasingly important to understand and control how learning algorithms interact.
1 code implementation • NeurIPS 2019 • Meire Fortunato, Melissa Tan, Ryan Faulkner, Steven Hansen, Adrià Puigdomènech Badia, Gavin Buttimore, Charlie Deck, Joel Z. Leibo, Charles Blundell
In this paper, we aim to develop a comprehensive methodology to test different kinds of memory in an agent and assess how well the agent can apply what it learns in training to a holdout set that differs from the training set along dimensions that we suggest are relevant for evaluating memory-specific generalization.
no code implementations • 4 Jun 2019 • Alexander Sasha Vezhnevets, Yuhuai Wu, Remi Leblond, Joel Z. Leibo
This paper investigates generalisation in multi-agent games, where the generality of the agent can be evaluated by playing against opponents it hasn't seen during training.
Multi-agent Reinforcement Learning • Reinforcement Learning (RL)
1 code implementation • NeurIPS 2019 • Ben Deverett, Ryan Faulkner, Meire Fortunato, Greg Wayne, Joel Z. Leibo
The measurement of time is central to intelligent behavior.
no code implementations • ICLR 2019 • Natasha Jaques, Angeliki Lazaridou, Edward Hughes, Caglar Gulcehre, Pedro A. Ortega, DJ Strouse, Joel Z. Leibo, Nando de Freitas
Therefore, we also employ influence to train agents to use an explicit communication channel, and find that it leads to more effective communication and higher collective reward.
no code implementations • 19 Mar 2019 • Tom Eccles, Edward Hughes, János Kramár, Steven Wheelwright, Joel Z. Leibo
We analyse the resulting policies to show that the reciprocating agents are strongly influenced by their co-players' behavior.
no code implementations • 2 Mar 2019 • Joel Z. Leibo, Edward Hughes, Marc Lanctot, Thore Graepel
Evolution has produced a multi-scale mosaic of interacting adaptive units.
no code implementations • 17 Dec 2018 • Joel Z. Leibo, Julien Perolat, Edward Hughes, Steven Wheelwright, Adam H. Marblestone, Edgar Duéñez-Guzmán, Peter Sunehag, Iain Dunning, Thore Graepel
Here we explore a new algorithmic framework for multi-agent reinforcement learning, called Malthusian reinforcement learning, which extends self-play to include fitness-linked population size dynamics that drive ongoing innovation.
Multi-agent Reinforcement Learning • reinforcement-learning • +1
no code implementations • 14 Nov 2018 • Jane. X. Wang, Edward Hughes, Chrisantha Fernando, Wojciech M. Czarnecki, Edgar A. Duenez-Guzman, Joel Z. Leibo
Multi-agent cooperation is an important feature of the natural world.
Multiagent Systems
3 code implementations • ICLR 2019 • Natasha Jaques, Angeliki Lazaridou, Edward Hughes, Caglar Gulcehre, Pedro A. Ortega, DJ Strouse, Joel Z. Leibo, Nando de Freitas
We propose a unified mechanism for achieving coordination and communication in Multi-Agent Reinforcement Learning (MARL), through rewarding agents for having causal influence over other agents' actions.
Multi-agent Reinforcement Learning • reinforcement-learning • +1
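The abstract above names the paper's mechanism: an intrinsic reward for having causal influence over other agents' actions. A minimal tabular sketch of one way to compute such a reward, assuming access to a model of the other agent's conditional policy (all names here are illustrative, not the paper's API):

```python
import numpy as np

def influence_reward(cond_policy, influencer_policy):
    """Illustrative causal-influence intrinsic reward.

    cond_policy[i, j]    = p(other agent takes action j | influencer took action i)
    influencer_policy[i] = p(influencer takes action i)

    The reward for having taken action a is the KL divergence between the
    other agent's policy conditioned on a and its counterfactual marginal
    policy (averaging over what the influencer might have done instead).
    A simplified tabular sketch of the idea described above, not the
    paper's implementation.
    """
    marginal = influencer_policy @ cond_policy  # counterfactual marginal over the other's actions

    def kl(p, q):
        return float(np.sum(p * np.log((p + 1e-12) / (q + 1e-12))))

    return np.array([kl(cond_policy[a], marginal)
                     for a in range(len(influencer_policy))])

# Toy example: two influencer actions, three possible responses.
cond = np.array([[0.8, 0.1, 0.1],
                 [0.1, 0.1, 0.8]])
print(influence_reward(cond, influencer_policy=np.array([0.5, 0.5])))
```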
no code implementations • 3 Jul 2018 • Max Jaderberg, Wojciech M. Czarnecki, Iain Dunning, Luke Marris, Guy Lever, Antonio Garcia Castaneda, Charles Beattie, Neil C. Rabinowitz, Ari S. Morcos, Avraham Ruderman, Nicolas Sonnerat, Tim Green, Louise Deason, Joel Z. Leibo, David Silver, Demis Hassabis, Koray Kavukcuoglu, Thore Graepel
Recent progress in artificial intelligence through reinforcement learning (RL) has shown great success on increasingly complex single-agent environments and two-player turn-based games.
1 code implementation • ICLR 2018 • Kris Cao, Angeliki Lazaridou, Marc Lanctot, Joel Z. Leibo, Karl Tuyls, Stephen Clark
We also study communication behaviour in a setting where one agent interacts with agents in a community with different levels of prosociality and show how agent identifiability can aid negotiation.
1 code implementation • 28 Mar 2018 • Greg Wayne, Chia-Chun Hung, David Amos, Mehdi Mirza, Arun Ahuja, Agnieszka Grabska-Barwinska, Jack Rae, Piotr Mirowski, Joel Z. Leibo, Adam Santoro, Mevlana Gemici, Malcolm Reynolds, Tim Harley, Josh Abramson, Shakir Mohamed, Danilo Rezende, David Saxton, Adam Cain, Chloe Hillier, David Silver, Koray Kavukcuoglu, Matt Botvinick, Demis Hassabis, Timothy Lillicrap
Animals execute goal-directed behaviours despite the limited range and scope of their sensors.
3 code implementations • NeurIPS 2018 • Edward Hughes, Joel Z. Leibo, Matthew G. Phillips, Karl Tuyls, Edgar A. Duéñez-Guzmán, Antonio García Castañeda, Iain Dunning, Tina Zhu, Kevin R. McKee, Raphael Koster, Heather Roff, Thore Graepel
Groups of humans are often able to find ways to cooperate with one another in complex, temporally extended social dilemmas.
no code implementations • 10 Mar 2018 • Simon Schmitt, Jonathan J. Hudson, Augustin Zidek, Simon Osindero, Carl Doersch, Wojciech M. Czarnecki, Joel Z. Leibo, Heinrich Kuttler, Andrew Zisserman, Karen Simonyan, S. M. Ali Eslami
Our method places no constraints on the architecture of the teacher or student agents, and it regulates itself to allow the students to surpass their teachers in performance.
1 code implementation • 24 Jan 2018 • Joel Z. Leibo, Cyprien de Masson d'Autume, Daniel Zoran, David Amos, Charles Beattie, Keith Anderson, Antonio García Castañeda, Manuel Sanchez, Simon Green, Audrunas Gruslys, Shane Legg, Demis Hassabis, Matthew M. Botvinick
Psychlab is a simulated psychology laboratory inside the first-person 3D game world of DeepMind Lab (Beattie et al. 2016).
4 code implementations • NeurIPS 2017 • Julien Perolat, Joel Z. Leibo, Vinicius Zambaldi, Charles Beattie, Karl Tuyls, Thore Graepel
Here we show that deep reinforcement learning can be used instead.
Multi-agent Reinforcement Learning • reinforcement-learning • +1
7 code implementations • 16 Jun 2017 • Peter Sunehag, Guy Lever, Audrunas Gruslys, Wojciech Marian Czarnecki, Vinicius Zambaldi, Max Jaderberg, Marc Lanctot, Nicolas Sonnerat, Joel Z. Leibo, Karl Tuyls, Thore Graepel
We study the problem of cooperative multi-agent reinforcement learning with a single joint reward signal.
Ranked #1 on SMAC+ (Off_Superhard_parallel)
Multi-agent Reinforcement Learning • reinforcement-learning • +2
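This is the value-decomposition networks (VDN) paper; its central idea is to learn per-agent utility functions whose sum is trained against the single joint reward. A minimal tabular sketch of that additive decomposition (the paper itself uses deep networks; names and constants here are illustrative):

```python
import numpy as np

# Illustrative setup: two agents with small discrete observation/action spaces.
N_OBS, N_ACT = 4, 3
q1 = np.zeros((N_OBS, N_ACT))  # per-agent utility table for agent 1
q2 = np.zeros((N_OBS, N_ACT))  # per-agent utility table for agent 2
alpha, gamma = 0.1, 0.95

def td_update(o1, a1, o2, a2, team_reward, o1_next, o2_next):
    """One Q-learning step on the additive decomposition
    Q_tot(o, a) = q1[o1, a1] + q2[o2, a2], trained only on the joint reward."""
    q_tot = q1[o1, a1] + q2[o2, a2]
    # Each agent bootstraps greedily on its own component.
    target = team_reward + gamma * (q1[o1_next].max() + q2[o2_next].max())
    td_error = target - q_tot
    # The gradient of Q_tot with respect to each component is 1,
    # so the shared TD error updates both tables.
    q1[o1, a1] += alpha * td_error
    q2[o2, a2] += alpha * td_error

# Hypothetical transition with a shared team reward.
td_update(o1=0, a1=1, o2=2, a2=0, team_reward=1.0, o1_next=1, o2_next=3)
```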
5 code implementations • 12 Apr 2017 • Todd Hester, Matej Vecerik, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Dan Horgan, John Quan, Andrew Sendonaris, Gabriel Dulac-Arnold, Ian Osband, John Agapiou, Joel Z. Leibo, Audrunas Gruslys
We present an algorithm, Deep Q-learning from Demonstrations (DQfD), that leverages even relatively small sets of demonstration data to massively accelerate the learning process, and that automatically assesses the necessary ratio of demonstration data while learning, thanks to a prioritized replay mechanism.
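The sentence above describes the mechanism at a high level: a prioritized replay buffer that mixes demonstration transitions with the agent's own experience, keeps demonstrations permanently, and gives them a priority bonus. A rough sketch of that sampling logic under those assumptions (class names and constants are illustrative, not the published implementation):

```python
import random

class MixedReplayBuffer:
    """Sketch of a replay buffer mixing demonstration and agent data.

    Demonstration transitions are never evicted and receive a larger
    priority bonus so they keep being sampled; agent transitions live in a
    bounded FIFO. Priorities are |TD error| plus a constant, with
    eps_demo > eps_agent.
    """

    def __init__(self, capacity, eps_agent=0.001, eps_demo=1.0):
        self.capacity = capacity
        self.eps_agent, self.eps_demo = eps_agent, eps_demo
        self.demos, self.agent = [], []  # lists of [transition, priority]

    def add_demo(self, transition, td_error=1.0):
        self.demos.append([transition, abs(td_error) + self.eps_demo])

    def add_agent(self, transition, td_error=1.0):
        self.agent.append([transition, abs(td_error) + self.eps_agent])
        if len(self.agent) > self.capacity:  # demonstrations are never overwritten
            self.agent.pop(0)

    def sample(self, batch_size):
        pool = self.demos + self.agent
        weights = [priority for _, priority in pool]
        return random.choices(pool, weights=weights, k=batch_size)
```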
4 code implementations • 10 Feb 2017 • Joel Z. Leibo, Vinicius Zambaldi, Marc Lanctot, Janusz Marecki, Thore Graepel
We introduce sequential social dilemmas that share the mixed incentive structure of matrix game social dilemmas but also require agents to learn policies that implement their strategic intentions.
Multi-agent Reinforcement Learning • reinforcement-learning • +1
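For context, the matrix-game social dilemmas this paper generalizes are usually characterized by inequalities over the payoffs R (mutual cooperation), S (sucker), T (temptation), and P (mutual defection). A small check of the textbook conditions (the paper may use a slightly different variant):

```python
def is_social_dilemma(R, S, T, P):
    """Check the standard matrix-game social dilemma conditions.

    The commonly used definition requires:
      1. R > P           -- mutual cooperation beats mutual defection
      2. R > S           -- mutual cooperation beats being exploited
      3. 2R > T + S      -- mutual cooperation beats alternating exploitation
      4. T > R or P > S  -- either greed or fear motivates defection
    This is the textbook formulation; see the paper for the exact variant used there.
    """
    return R > P and R > S and 2 * R > T + S and (T > R or P > S)

# Prisoner's Dilemma payoffs satisfy all four conditions.
print(is_social_dilemma(R=3, S=0, T=4, P=1))  # True
```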
4 code implementations • 12 Dec 2016 • Charles Beattie, Joel Z. Leibo, Denis Teplyashin, Tom Ward, Marcus Wainwright, Heinrich Küttler, Andrew Lefrancq, Simon Green, Víctor Valdés, Amir Sadik, Julian Schrittwieser, Keith Anderson, Sarah York, Max Cant, Adam Cain, Adrian Bolton, Stephen Gaffney, Helen King, Demis Hassabis, Shane Legg, Stig Petersen
DeepMind Lab is a first-person 3D game platform designed for research and development of general artificial intelligence and machine learning systems.
8 code implementations • 17 Nov 2016 • Jane. X. Wang, Zeb Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z. Leibo, Remi Munos, Charles Blundell, Dharshan Kumaran, Matt Botvinick
We unpack these points in a series of seven proof-of-concept experiments, each of which examines a key aspect of deep meta-RL.
3 code implementations • 16 Nov 2016 • Max Jaderberg, Volodymyr Mnih, Wojciech Marian Czarnecki, Tom Schaul, Joel Z. Leibo, David Silver, Koray Kavukcuoglu
We also introduce a novel mechanism for focusing this representation upon extrinsic rewards, so that learning can rapidly adapt to the most relevant aspects of the actual task.
4 code implementations • NeurIPS 2016 • Jimmy Ba, Geoffrey Hinton, Volodymyr Mnih, Joel Z. Leibo, Catalin Ionescu
Until recently, research on artificial neural networks was largely restricted to systems with only two types of variable: Neural activities that represent the current or recent input and weights that learn to capture regularities among inputs, outputs and payoffs.
3 code implementations • 14 Jun 2016 • Charles Blundell, Benigno Uria, Alexander Pritzel, Yazhe Li, Avraham Ruderman, Joel Z. Leibo, Jack Rae, Daan Wierstra, Demis Hassabis
State of the art deep reinforcement learning algorithms take many millions of interactions to attain human-level performance.
no code implementations • 5 Jun 2016 • Joel Z. Leibo, Qianli Liao, Winrich Freiwald, Fabio Anselmi, Tomaso Poggio
The primate brain contains a hierarchy of visual areas, dubbed the ventral stream, which rapidly computes object representations that are both specific for object identity and relatively robust against identity-preserving transformations like depth-rotations.
no code implementations • 28 Dec 2015 • Joel Z. Leibo, Julien Cornebise, Sergio Gómez, Demis Hassabis
This paper describes a framework for modeling the interface between perception and memory on the algorithmic level of analysis.
2 code implementations • 17 Oct 2015 • Qianli Liao, Joel Z. Leibo, Tomaso Poggio
Gradient backpropagation (BP) requires symmetric feedforward and feedback connections -- the same weights must be used for forward and backward passes.
Ranked #1 on Handwritten Digit Recognition on MNIST (Percentage error metric)
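The abstract states the symmetry requirement of standard backprop. A tiny one-layer sketch contrasting the standard backward pass, which reuses the transposed forward weights, with an asymmetric variant that uses a separate fixed feedback matrix, the kind of relaxation studied in this line of work (illustrative, not the paper's experimental setup):

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 8, 4
W = rng.normal(size=(n_out, n_in)) * 0.1  # forward weights
B = rng.normal(size=(n_out, n_in)) * 0.1  # fixed, random feedback weights

def backward_error(delta_out, symmetric=True):
    """Propagate an output error back to the input layer.

    Standard backprop uses the transpose of the forward weights
    (symmetric feedforward/feedback connections). The asymmetric variant
    replaces W.T with a separate fixed matrix B.T.
    """
    feedback = W.T if symmetric else B.T
    return feedback @ delta_out

delta = rng.normal(size=n_out)
print(backward_error(delta, symmetric=True))   # exact backprop error signal
print(backward_error(delta, symmetric=False))  # approximate, asymmetric signal
```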
no code implementations • 12 Sep 2014 • Qianli Liao, Joel Z. Leibo, Tomaso Poggio
Populations of neurons in inferotemporal cortex (IT) maintain an explicit code for object identity that also tolerates transformations of object appearance, e.g., position, scale, and viewing angle [1, 2, 3].
no code implementations • NeurIPS 2013 • Qianli Liao, Joel Z. Leibo, Tomaso Poggio
Next, we apply the model to non-affine transformations: as expected, it performs well on face verification tasks requiring invariance to the relatively smooth transformations of 3D rotation-in-depth and changes in illumination direction.
no code implementations • 17 Nov 2013 • Fabio Anselmi, Joel Z. Leibo, Lorenzo Rosasco, Jim Mutch, Andrea Tacchetti, Tomaso Poggio
It also suggests that the main computational goal of the ventral stream of visual cortex is to provide a hierarchical representation of new objects/images which is invariant to transformations, stable, and discriminative for recognition---and that this representation may be continuously learned in an unsupervised way during development and visual experience.
no code implementations • 16 Nov 2013 • Qianli Liao, Joel Z. Leibo, Youssef Mroueh, Tomaso Poggio
The standard approach to unconstrained face recognition in natural photographs is via a detection, alignment, recognition pipeline.
no code implementations • NeurIPS 2011 • Joel Z. Leibo, Jim Mutch, Tomaso Poggio
Many studies have uncovered evidence that visual cortex contains specialized regions involved in processing faces but not other object classes.