Search Results for author: Joel Z. Leibo

Found 48 papers, 20 papers with code

OPtions as REsponses: Grounding behavioural hierarchies in multi-agent reinforcement learning

no code implementations ICML 2020 Alexander Vezhnevets, Yuhuai Wu, Maria Eckstein, Rémi Leblond, Joel Z. Leibo

This paper investigates generalisation in multi-agent games, where the generality of the agent can be evaluated by playing against opponents it hasn't seen during training.

Multi-agent Reinforcement Learning reinforcement-learning

Emergent Bartering Behaviour in Multi-Agent Reinforcement Learning

no code implementations 13 May 2022 Michael Bradley Johanson, Edward Hughes, Finbarr Timbers, Joel Z. Leibo

Agents learn to produce resources in a spatially complex world, trade them with one another, and consume those that they prefer.

Multi-agent Reinforcement Learning reinforcement-learning

Hidden Agenda: a Social Deduction Game with Diverse Learned Equilibria

no code implementations 5 Jan 2022 Kavya Kopparapu, Edgar A. Duéñez-Guzmán, Jayd Matyas, Alexander Sasha Vezhnevets, John P. Agapiou, Kevin R. McKee, Richard Everett, Janusz Marecki, Joel Z. Leibo, Thore Graepel

A key challenge in the study of multiagent cooperation is the need for individual agents not only to cooperate effectively, but to decide with whom to cooperate.

reinforcement-learning

Statistical discrimination in learning agents

no code implementations 21 Oct 2021 Edgar A. Duéñez-Guzmán, Kevin R. McKee, Yiran Mao, Ben Coppin, Silvia Chiappa, Alexander Sasha Vezhnevets, Michiel A. Bakker, Yoram Bachrach, Suzanne Sadedin, William Isaac, Karl Tuyls, Joel Z. Leibo

Undesired bias afflicts both human and algorithmic decision making, and may be especially prevalent when information processing trade-offs incentivize the use of heuristics.

Decision Making Multi-agent Reinforcement Learning

Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot

no code implementations 14 Jul 2021 Joel Z. Leibo, Edgar Duéñez-Guzmán, Alexander Sasha Vezhnevets, John P. Agapiou, Peter Sunehag, Raphael Koster, Jayd Matyas, Charles Beattie, Igor Mordatch, Thore Graepel

Existing evaluation suites for multi-agent reinforcement learning (MARL) do not assess generalization to novel situations as their primary objective (unlike supervised-learning benchmarks).

Multi-agent Reinforcement Learning reinforcement-learning

Meta-control of social learning strategies

no code implementations 18 Jun 2021 Anil Yaman, Nicolas Bredeche, Onur Çaylak, Joel Z. Leibo, Sang Wan Lee

Based on these findings, we hypothesized that meta-control of individual and social learning strategies provides effective and sample-efficient learning in volatile and uncertain environments.

Open Problems in Cooperative AI

no code implementations 15 Dec 2020 Allan Dafoe, Edward Hughes, Yoram Bachrach, Tantum Collins, Kevin R. McKee, Joel Z. Leibo, Kate Larson, Thore Graepel

We see opportunity to more explicitly focus on the problem of cooperation, to construct unified theory and vocabulary, and to build bridges with adjacent communities working on cooperation, including in the natural, social, and behavioural sciences.

DeepMind Lab2D

1 code implementation 13 Nov 2020 Charles Beattie, Thomas Köppe, Edgar A. Duéñez-Guzmán, Joel Z. Leibo

We present DeepMind Lab2D, a scalable environment simulator for artificial intelligence research that facilitates researcher-led experimentation with environment design.

reinforcement-learning

Social diversity and social preferences in mixed-motive reinforcement learning

no code implementations 6 Feb 2020 Kevin R. McKee, Ian Gemp, Brian McWilliams, Edgar A. Duéñez-Guzmán, Edward Hughes, Joel Z. Leibo

Recent research on reinforcement learning in pure-conflict and pure-common interest games has emphasized the importance of population heterogeneity.

reinforcement-learning

Smooth markets: A basic mechanism for organizing gradient-based learners

no code implementations ICLR 2020 David Balduzzi, Wojciech M. Czarnecki, Thomas W. Anthony, Ian M Gemp, Edward Hughes, Joel Z. Leibo, Georgios Piliouras, Thore Graepel

With the success of modern machine learning, it is becoming increasingly important to understand and control how learning algorithms interact.

Generalization of Reinforcement Learners with Working and Episodic Memory

1 code implementation NeurIPS 2019 Meire Fortunato, Melissa Tan, Ryan Faulkner, Steven Hansen, Adrià Puigdomènech Badia, Gavin Buttimore, Charlie Deck, Joel Z. Leibo, Charles Blundell

In this paper, we aim to develop a comprehensive methodology to test different kinds of memory in an agent and assess how well the agent can apply what it learns in training to a holdout set that differs from the training set along dimensions that we suggest are relevant for evaluating memory-specific generalization.

Options as responses: Grounding behavioural hierarchies in multi-agent RL

1 code implementation 4 Jun 2019 Alexander Sasha Vezhnevets, Yuhuai Wu, Remi Leblond, Joel Z. Leibo

This paper investigates generalisation in multi-agent games, where the generality of the agent can be evaluated by playing against opponents it hasn't seen during training.

Multi-agent Reinforcement Learning reinforcement-learning

Intrinsic Social Motivation via Causal Influence in Multi-Agent RL

no code implementations ICLR 2019 Natasha Jaques, Angeliki Lazaridou, Edward Hughes, Caglar Gulcehre, Pedro A. Ortega, DJ Strouse, Joel Z. Leibo, Nando de Freitas

Therefore, we also employ influence to train agents to use an explicit communication channel, and find that it leads to more effective communication and higher collective reward.

Multi-agent Reinforcement Learning

Learning Reciprocity in Complex Sequential Social Dilemmas

no code implementations 19 Mar 2019 Tom Eccles, Edward Hughes, János Kramár, Steven Wheelwright, Joel Z. Leibo

We analyse the resulting policies to show that the reciprocating agents are strongly influenced by their co-players' behavior.

reinforcement-learning

Malthusian Reinforcement Learning

no code implementations 17 Dec 2018 Joel Z. Leibo, Julien Perolat, Edward Hughes, Steven Wheelwright, Adam H. Marblestone, Edgar Duéñez-Guzmán, Peter Sunehag, Iain Dunning, Thore Graepel

Here we explore a new algorithmic framework for multi-agent reinforcement learning, called Malthusian reinforcement learning, which extends self-play to include fitness-linked population size dynamics that drive ongoing innovation.

Multi-agent Reinforcement Learning reinforcement-learning
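
The fitness-linked population size dynamics can be pictured with a toy update. The sketch below is purely illustrative and is not the paper's actual rule: subpopulation sizes are reallocated roughly in proportion to accumulated reward, and the function name and fixed agent budget are assumptions.

```python
import numpy as np

def resize_subpopulations(fitness, total_agents=12):
    """Toy fitness-linked population update (illustrative only, not the
    paper's exact dynamics): each subpopulation's next size is proportional
    to the reward its members accumulated, approximately keeping the overall
    total fixed (up to rounding) and every subpopulation alive.
    """
    fitness = np.maximum(np.asarray(fitness, dtype=float), 1e-8)
    shares = fitness / fitness.sum()
    return np.maximum(1, np.round(shares * total_agents).astype(int))

# A high-reward subpopulation grows while the others shrink.
print(resize_subpopulations(fitness=[10.0, 3.0, 1.0]))   # -> [9 3 1]
```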

Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning

3 code implementations ICLR 2019 Natasha Jaques, Angeliki Lazaridou, Edward Hughes, Caglar Gulcehre, Pedro A. Ortega, DJ Strouse, Joel Z. Leibo, Nando de Freitas

We propose a unified mechanism for achieving coordination and communication in Multi-Agent Reinforcement Learning (MARL), through rewarding agents for having causal influence over other agents' actions.

Multi-agent Reinforcement Learning reinforcement-learning
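
The causal-influence reward can be illustrated with a small counterfactual computation. This is a minimal sketch with hand-written toy policies; the function and variable names are hypothetical, but the structure, comparing the other agent's conditional policy against its counterfactual marginal while marginalising over the influencer's actions, follows the paper's description.

```python
import numpy as np

def kl(p, q):
    """KL divergence between two discrete distributions."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sum(p * np.log(p / q)))

def influence_reward(policy_b_given_a, policy_a, a_taken):
    """Causal influence of agent A on agent B (toy, assumed names).

    policy_b_given_a[i] is B's action distribution when A takes action i,
    policy_a is A's own action distribution, and a_taken is the action A
    actually took.  The reward is the KL divergence between B's conditional
    policy and B's counterfactual marginal policy.
    """
    conditional = policy_b_given_a[a_taken]
    marginal = np.einsum("i,ij->j", policy_a, policy_b_given_a)
    return kl(conditional, marginal)

# Toy example: A has 2 actions, B has 3; B reacts strongly to A's choice.
policy_b_given_a = np.array([[0.8, 0.1, 0.1],
                             [0.1, 0.1, 0.8]])
policy_a = np.array([0.5, 0.5])
print(influence_reward(policy_b_given_a, policy_a, a_taken=0))
```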

Emergent Communication through Negotiation

1 code implementation ICLR 2018 Kris Cao, Angeliki Lazaridou, Marc Lanctot, Joel Z. Leibo, Karl Tuyls, Stephen Clark

We also study communication behaviour in a setting where one agent interacts with agents in a community with different levels of prosociality and show how agent identifiability can aid negotiation.

Multi-agent Reinforcement Learning reinforcement-learning

Kickstarting Deep Reinforcement Learning

no code implementations 10 Mar 2018 Simon Schmitt, Jonathan J. Hudson, Augustin Zidek, Simon Osindero, Carl Doersch, Wojciech M. Czarnecki, Joel Z. Leibo, Heinrich Kuttler, Andrew Zisserman, Karen Simonyan, S. M. Ali Eslami

Our method places no constraints on the architecture of the teacher or student agents, and it regulates itself to allow the students to surpass their teachers in performance.

reinforcement-learning
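
A minimal sketch of the distillation term that kickstarting adds to the student's RL loss, assuming discrete actions; the helper names are hypothetical, and the automatic adaptation of the weighting (which the paper handles, e.g. via population-based training) is omitted.

```python
import numpy as np

def kickstarting_aux_loss(student_logits, teacher_probs, weight):
    """Auxiliary distillation term added to the student's RL loss
    (a sketch under assumed names; the schedule/adaptation of `weight`
    is omitted).  Cross-entropy between the teacher's policy and the
    student's policy on states visited by the student.
    """
    student_log_probs = student_logits - np.log(np.sum(np.exp(student_logits)))
    return -weight * float(np.sum(teacher_probs * student_log_probs))

student_logits = np.array([2.0, 0.5, 0.1])   # hypothetical student policy logits
teacher_probs = np.array([0.7, 0.2, 0.1])    # hypothetical teacher policy
print(kickstarting_aux_loss(student_logits, teacher_probs, weight=1.0))
```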

Deep Q-learning from Demonstrations

5 code implementations 12 Apr 2017 Todd Hester, Matej Vecerik, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Dan Horgan, John Quan, Andrew Sendonaris, Gabriel Dulac-Arnold, Ian Osband, John Agapiou, Joel Z. Leibo, Audrunas Gruslys

We present Deep Q-learning from Demonstrations (DQfD), an algorithm that leverages small sets of demonstration data to massively accelerate learning and that automatically assesses the necessary ratio of demonstration data while learning, thanks to a prioritized replay mechanism.

Decision Making +2
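
The supervised component applied to demonstration transitions can be sketched as a large-margin loss; this toy version assumes discrete actions and omits the TD, n-step and regularisation terms that the full DQfD loss combines.

```python
import numpy as np

def large_margin_loss(q_values, expert_action, margin=0.8):
    """Large-margin loss on a demonstration transition (sketch only).

    Penalises any action whose margin-augmented Q-value exceeds the
    Q-value of the demonstrated action:
    max_a [Q(s, a) + m(a, a_E)] - Q(s, a_E).
    """
    margins = np.full_like(q_values, margin, dtype=float)
    margins[expert_action] = 0.0          # no margin for the expert action
    return float(np.max(q_values + margins) - q_values[expert_action])

q = np.array([2.0, 0.9, 1.1])                  # hypothetical Q(s, .)
print(large_margin_loss(q, expert_action=0))   # 0.0: expert action already dominates
print(large_margin_loss(q, expert_action=1))   # positive: pushes Q(s, a_E) up
```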

Multi-agent Reinforcement Learning in Sequential Social Dilemmas

3 code implementations 10 Feb 2017 Joel Z. Leibo, Vinicius Zambaldi, Marc Lanctot, Janusz Marecki, Thore Graepel

We introduce sequential social dilemmas that share the mixed incentive structure of matrix game social dilemmas but also require agents to learn policies that implement their strategic intentions.

Multi-agent Reinforcement Learning reinforcement-learning
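
The mixed incentive structure referred to here is that of matrix game social dilemmas. A small check of the standard conditions makes the definition concrete, using the conventional payoff labels R, P, S, T; the function name is an assumption.

```python
def is_social_dilemma(R, P, S, T):
    """Matrix game social dilemma conditions on the payoffs
    R (mutual cooperation), P (mutual defection), S (sucker), T (temptation):
    cooperation is collectively preferred, yet defection is individually
    tempting through greed (T > R) or fear (P > S).
    """
    return R > P and R > S and 2 * R > T + S and (T > R or P > S)

print(is_social_dilemma(R=3, P=1, S=0, T=5))   # Prisoner's Dilemma -> True
print(is_social_dilemma(R=3, P=0, S=1, T=2))   # Harmony game -> False (no greed, no fear)
```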

DeepMind Lab

4 code implementations 12 Dec 2016 Charles Beattie, Joel Z. Leibo, Denis Teplyashin, Tom Ward, Marcus Wainwright, Heinrich Küttler, Andrew Lefrancq, Simon Green, Víctor Valdés, Amir Sadik, Julian Schrittwieser, Keith Anderson, Sarah York, Max Cant, Adam Cain, Adrian Bolton, Stephen Gaffney, Helen King, Demis Hassabis, Shane Legg, Stig Petersen

DeepMind Lab is a first-person 3D game platform designed for research and development of general artificial intelligence and machine learning systems.

Learning to reinforcement learn

7 code implementations 17 Nov 2016 Jane X. Wang, Zeb Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z. Leibo, Remi Munos, Charles Blundell, Dharshan Kumaran, Matt Botvinick

We unpack these points in a series of seven proof-of-concept experiments, each of which examines a key aspect of deep meta-RL.

Meta-Learning Meta Reinforcement Learning +1
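
In deep meta-RL the recurrent agent receives the previous action and reward alongside the current observation, so its recurrent state can implement a fast, within-task learning algorithm on top of slowly learned weights. Below is a minimal sketch of that input construction; the function name and shapes are assumptions.

```python
import numpy as np

def meta_rl_input(observation, prev_action, prev_reward, num_actions):
    """Build the per-step input fed to a recurrent meta-RL agent (sketch).

    The previous action (one-hot) and previous reward are concatenated to
    the observation, which is the key architectural ingredient of deep
    meta-RL.
    """
    prev_action_onehot = np.zeros(num_actions)
    prev_action_onehot[prev_action] = 1.0
    return np.concatenate([observation, prev_action_onehot, [prev_reward]])

obs = np.array([0.3, -1.2])
print(meta_rl_input(obs, prev_action=1, prev_reward=0.5, num_actions=3))
# -> [ 0.3 -1.2  0.   1.   0.   0.5]
```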

Reinforcement Learning with Unsupervised Auxiliary Tasks

3 code implementations 16 Nov 2016 Max Jaderberg, Volodymyr Mnih, Wojciech Marian Czarnecki, Tom Schaul, Joel Z. Leibo, David Silver, Koray Kavukcuoglu

We also introduce a novel mechanism for focusing this representation upon extrinsic rewards, so that learning can rapidly adapt to the most relevant aspects of the actual task.

reinforcement-learning
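
The mechanism for focusing the representation on extrinsic rewards is a reward-prediction auxiliary task trained on sequences sampled in a skewed way, so rare rewarding events are over-represented. A minimal sketch of that sampling under an assumed replay format; the prediction network itself is omitted.

```python
import random

def sample_reward_prediction_batch(replay, batch_size=32, p_rewarding=0.5):
    """Skewed sampling for a reward-prediction auxiliary task (sketch;
    `replay` is assumed to be a list of (frames, reward) pairs).

    Rewarding and non-rewarding sequences are drawn in roughly equal
    proportion so rare rewards still shape the learned representation.
    """
    rewarding = [t for t in replay if t[1] != 0.0]
    neutral = [t for t in replay if t[1] == 0.0]
    batch = []
    for _ in range(batch_size):
        pool = rewarding if (rewarding and random.random() < p_rewarding) else neutral
        pool = pool or rewarding or neutral   # fall back if one list is empty
        batch.append(random.choice(pool))
    return batch

replay = [(f"frames_{i}", 1.0 if i % 10 == 0 else 0.0) for i in range(100)]
batch = sample_reward_prediction_batch(replay)
print(sum(1 for _, r in batch if r != 0.0), "rewarding samples out of", len(batch))
```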

Using Fast Weights to Attend to the Recent Past

4 code implementations NeurIPS 2016 Jimmy Ba, Geoffrey Hinton, Volodymyr Mnih, Joel Z. Leibo, Catalin Ionescu

Until recently, research on artificial neural networks was largely restricted to systems with only two types of variable: Neural activities that represent the current or recent input and weights that learn to capture regularities among inputs, outputs and payoffs.
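
Fast weights add a third kind of variable: an associative matrix that changes quickly via outer-product updates and is consulted in an inner loop when computing the next hidden state. A minimal sketch with assumed names; layer normalisation and the full recurrent network are omitted.

```python
import numpy as np

def fast_weight_update(A, h, decay=0.95, lr=0.5):
    """Fast associative memory update: A <- decay * A + lr * h h^T."""
    return decay * A + lr * np.outer(h, h)

def attend_recent_past(A, h_preliminary, steps=3):
    """Iteratively refine the next hidden state with the fast weights,
    which amounts to attending to recently stored hidden states (sketch)."""
    h = h_preliminary.copy()
    for _ in range(steps):
        h = np.tanh(h_preliminary + A @ h)
    return h

h_t = np.array([0.2, -0.4, 0.7])       # previous hidden state
A = fast_weight_update(np.zeros((3, 3)), h_t)
print(attend_recent_past(A, np.array([0.1, 0.3, -0.2])))
```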

Model-Free Episodic Control

3 code implementations 14 Jun 2016 Charles Blundell, Benigno Uria, Alexander Pritzel, Yazhe Li, Avraham Ruderman, Joel Z. Leibo, Jack Rae, Daan Wierstra, Demis Hassabis

State of the art deep reinforcement learning algorithms take many millions of interactions to attain human-level performance.

Decision Making Hippocampus +1
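
Episodic control sidesteps slow gradient-based value learning by remembering the best returns obtained so far and querying them by nearest-neighbour search in an embedding space. The class below is a minimal sketch with an assumed API; the paper additionally keeps only the maximum return per exact state, bounds memory with least-recently-used eviction, and uses random projections or learned embeddings, all omitted here.

```python
import numpy as np

class EpisodicController:
    """Minimal sketch of episodic control (assumed, simplified API)."""

    def __init__(self, num_actions, k=3):
        self.num_actions = num_actions
        self.k = k
        self.keys = [[] for _ in range(num_actions)]     # stored state embeddings
        self.returns = [[] for _ in range(num_actions)]  # return observed for each key

    def update(self, embedding, action, episodic_return):
        """Store the return obtained after taking `action` in this state."""
        self.keys[action].append(np.asarray(embedding, float))
        self.returns[action].append(float(episodic_return))

    def estimate(self, embedding, action):
        """Average the returns of the k nearest stored neighbours."""
        if not self.keys[action]:
            return 0.0
        dists = np.linalg.norm(np.stack(self.keys[action]) - embedding, axis=1)
        nearest = np.argsort(dists)[: self.k]
        return float(np.mean([self.returns[action][i] for i in nearest]))

    def act(self, embedding):
        """Act greedily with respect to the episodic value estimates."""
        return int(np.argmax([self.estimate(embedding, a) for a in range(self.num_actions)]))

ec = EpisodicController(num_actions=2)
ec.update(np.array([0.1, 0.2]), action=0, episodic_return=5.0)
ec.update(np.array([0.9, 0.8]), action=1, episodic_return=1.0)
print(ec.act(np.array([0.15, 0.25])))   # -> 0: the nearby stored experience paid off
```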

View-tolerant face recognition and Hebbian learning imply mirror-symmetric neural tuning to head orientation

no code implementations 5 Jun 2016 Joel Z. Leibo, Qianli Liao, Winrich Freiwald, Fabio Anselmi, Tomaso Poggio

The primate brain contains a hierarchy of visual areas, dubbed the ventral stream, which rapidly computes object representations that are both specific for object identity and relatively robust against identity-preserving transformations like depth-rotations.

Face Recognition Object Recognition

Approximate Hubel-Wiesel Modules and the Data Structures of Neural Computation

no code implementations 28 Dec 2015 Joel Z. Leibo, Julien Cornebise, Sergio Gómez, Demis Hassabis

This paper describes a framework for modeling the interface between perception and memory on the algorithmic level of analysis.

Hippocampus

How Important is Weight Symmetry in Backpropagation?

2 code implementations 17 Oct 2015 Qianli Liao, Joel Z. Leibo, Tomaso Poggio

Gradient backpropagation (BP) requires symmetric feedforward and feedback connections -- the same weights must be used for forward and backward passes.

 Ranked #1 on Handwritten Digit Recognition on MNIST (PERCENTAGE ERROR metric)

Handwritten Digit Recognition Image Classification
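
The question is what happens when the backward pass uses feedback weights other than the transpose of the forward weights. Below is a minimal one-hidden-layer sketch contrasting symmetric feedback (standard BP) with a fixed random feedback matrix; the names are assumptions, and the sign-concordant feedback variants the paper focuses on are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# One hidden layer: x -> h = tanh(W1 x) -> y = W2 h
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(2, 4))
B2 = rng.normal(size=(2, 4))     # fixed random feedback matrix (replaces W2.T)

def forward(x):
    h = np.tanh(W1 @ x)
    return h, W2 @ h

def hidden_error(delta_out, h, symmetric=True):
    """Backpropagate the output error to the hidden layer.

    symmetric=True  -> standard BP: the feedback path uses W2.T.
    symmetric=False -> asymmetric feedback: a fixed random matrix B2 is
                       used instead (sketch of the regime studied here).
    """
    feedback = W2.T if symmetric else B2.T
    return (1.0 - h ** 2) * (feedback @ delta_out)   # tanh' = 1 - h^2

x = rng.normal(size=3)
h, y = forward(x)
delta_out = y - np.array([1.0, -1.0])        # error w.r.t. a hypothetical target
print(hidden_error(delta_out, h, symmetric=True))
print(hidden_error(delta_out, h, symmetric=False))
```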

Unsupervised learning of clutter-resistant visual representations from natural videos

no code implementations 12 Sep 2014 Qianli Liao, Joel Z. Leibo, Tomaso Poggio

Populations of neurons in inferotemporal cortex (IT) maintain an explicit code for object identity that also tolerates transformations of object appearance, e.g., position, scale, viewing angle [1, 2, 3].

Face Recognition

Learning invariant representations and applications to face verification

no code implementations NeurIPS 2013 Qianli Liao, Joel Z. Leibo, Tomaso Poggio

Next, we apply the model to non-affine transformations: as expected, it performs well on face verification tasks requiring invariance to the relatively smooth transformations of 3D rotation-in-depth and changes in illumination direction.

Face Verification Object Recognition

Unsupervised Learning of Invariant Representations in Hierarchical Architectures

no code implementations 17 Nov 2013 Fabio Anselmi, Joel Z. Leibo, Lorenzo Rosasco, Jim Mutch, Andrea Tacchetti, Tomaso Poggio

It also suggests that the main computational goal of the ventral stream of visual cortex is to provide a hierarchical representation of new objects/images which is invariant to transformations, stable, and discriminative for recognition -- and that this representation may be continuously learned in an unsupervised way during development and visual experience.

Object Recognition Speech Recognition
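
The invariant-and-discriminative representation can be illustrated in one dimension: pooling the dot products of an image with all group-transformed versions of a stored template yields a signature that is unchanged when the same transformation is applied to the image. A minimal sketch for circular shifts; real models use many templates, several pooling nonlinearities and hierarchical layers, and the names here are assumptions.

```python
import numpy as np

def invariant_signature(image, template, pooling=np.mean):
    """Invariant signature from one template under the translation group
    (1-D sketch of the pooled dot-product construction).

    The signature pools <image, g.template> over all group elements g
    (here: circular shifts), so shifting the image leaves it unchanged.
    """
    shifts = [np.roll(template, s) for s in range(len(template))]
    dots = np.array([image @ t for t in shifts])
    return pooling(np.abs(dots))

rng = np.random.default_rng(0)
image, template = rng.normal(size=8), rng.normal(size=8)
print(invariant_signature(image, template))
print(invariant_signature(np.roll(image, 3), template))   # same value: shift-invariant
```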

Can a biologically-plausible hierarchy effectively replace face detection, alignment, and recognition pipelines?

no code implementations 16 Nov 2013 Qianli Liao, Joel Z. Leibo, Youssef Mroueh, Tomaso Poggio

The standard approach to unconstrained face recognition in natural photographs is via a detection, alignment, recognition pipeline.

Face Detection Face Recognition

Why The Brain Separates Face Recognition From Object Recognition

no code implementations NeurIPS 2011 Joel Z. Leibo, Jim Mutch, Tomaso Poggio

Many studies have uncovered evidence that visual cortex contains specialized regions involved in processing faces but not other object classes.

Face Identification Face Recognition +1
