Search Results for author: DJ Strouse

Found 12 papers, 6 papers with code

Melting Pot 2.0

1 code implementation • 24 Nov 2022 • John P. Agapiou, Alexander Sasha Vezhnevets, Edgar A. Duéñez-Guzmán, Jayd Matyas, Yiran Mao, Peter Sunehag, Raphael Köster, Udari Madhushani, Kavya Kopparapu, Ramona Comanescu, DJ Strouse, Michael B. Johanson, Sukhdeep Singh, Julia Haas, Igor Mordatch, Dean Mobbs, Joel Z. Leibo

Melting Pot is a research tool developed to facilitate work on multi-agent artificial intelligence, and provides an evaluation protocol that measures generalization to novel social partners in a set of canonical test scenarios.

Artificial Life, Navigate

In-context Reinforcement Learning with Algorithm Distillation

no code implementations • 25 Oct 2022 • Michael Laskin, Luyu Wang, Junhyuk Oh, Emilio Parisotto, Stephen Spencer, Richie Steigerwald, DJ Strouse, Steven Hansen, Angelos Filos, Ethan Brooks, Maxime Gazeau, Himanshu Sahni, Satinder Singh, Volodymyr Mnih

We propose Algorithm Distillation (AD), a method for distilling reinforcement learning (RL) algorithms into neural networks by modeling their training histories with a causal sequence model.


Collaborating with Humans without Human Data

1 code implementation • NeurIPS 2021 • DJ Strouse, Kevin R. McKee, Matt Botvinick, Edward Hughes, Richard Everett

Here, we study the problem of how to train agents that collaborate well with human partners without using human data.

Multi-agent Reinforcement Learning

Learning more skills through optimistic exploration

no code implementations • ICLR 2022 • DJ Strouse, Kate Baumli, David Warde-Farley, Vlad Mnih, Steven Hansen

However, an inherent exploration problem lingers: when a novel state is actually encountered, the discriminator will necessarily not have seen enough training data to produce accurate and confident skill classifications, leading to low intrinsic reward for the agent and effective penalization of the sort of exploration needed to actually maximize the objective.

Learning Truthful, Efficient, and Welfare Maximizing Auction Rules

no code implementations • 11 Jul 2019 • Andrea Tacchetti, DJ Strouse, Marta Garnelo, Thore Graepel, Yoram Bachrach

From social networks to supply chains, more and more aspects of how humans, firms, and organizations interact are mediated by artificial learning agents.

Transfer and Exploration via the Information Bottleneck

no code implementations • ICLR 2019 • Anirudh Goyal, Riashat Islam, DJ Strouse, Zafarali Ahmed, Hugo Larochelle, Matthew Botvinick, Yoshua Bengio, Sergey Levine

In new environments, this model can then identify novel subgoals for further exploration, guiding the agent through a sequence of potential decision states and through new regions of the state space.

Intrinsic Social Motivation via Causal Influence in Multi-Agent RL

no code implementations • ICLR 2019 • Natasha Jaques, Angeliki Lazaridou, Edward Hughes, Caglar Gulcehre, Pedro A. Ortega, DJ Strouse, Joel Z. Leibo, Nando de Freitas

Therefore, we also employ influence to train agents to use an explicit communication channel, and find that it leads to more effective communication and higher collective reward.

Inductive Bias, Multi-agent Reinforcement Learning

Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning

3 code implementations • ICLR 2019 • Natasha Jaques, Angeliki Lazaridou, Edward Hughes, Caglar Gulcehre, Pedro A. Ortega, DJ Strouse, Joel Z. Leibo, Nando de Freitas

We propose a unified mechanism for achieving coordination and communication in Multi-Agent Reinforcement Learning (MARL), through rewarding agents for having causal influence over other agents' actions.

Multi-agent Reinforcement Learning, Reinforcement Learning

The information bottleneck and geometric clustering

1 code implementation • 27 Dec 2017 • DJ Strouse, David J. Schwab

The information bottleneck (IB) approach to clustering takes a joint distribution $P\!\left(X, Y\right)$ and maps the data $X$ to cluster labels $T$ which retain maximal information about $Y$ (Tishby et al., 1999).

Clustering, Model Selection
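
As a rough sketch of the objective behind the IB approach described in this entry: the symbols $X$, $Y$, and $T$ follow the abstract, while the trade-off parameter $\beta$ and the encoder notation $q(t \mid x)$ do not appear in the snippet and are assumed here from the standard formulation (Tishby et al., 1999).

```latex
\min_{q(t \mid x)} \; I(X; T) \;-\; \beta \, I(T; Y)
```

Here $I(X; T)$ measures how much the cluster label retains about the data (the compression cost) and $I(T; Y)$ measures how much it retains about $Y$. Larger values of $\beta$ favor informativeness over compression.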

The deterministic information bottleneck

2 code implementations • 1 Apr 2016 • DJ Strouse, David J. Schwab

Here, we introduce an alternative formulation that replaces mutual information with entropy, which we call the deterministic information bottleneck (DIB), that we argue better captures this notion of compression.
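
For orientation, the replacement described in this entry can be sketched by writing the two objectives side by side. The notation is an assumption, since the snippet gives no formulas: $T$ is the compressed representation of $X$, $Y$ is the relevance variable, and $\beta$ is a trade-off parameter.

```latex
L_{\mathrm{IB}}  = I(X; T) - \beta \, I(T; Y),
\qquad
L_{\mathrm{DIB}} = H(T) - \beta \, I(T; Y)
```

Because $I(X; T) = H(T) - H(T \mid X)$, the two objectives differ only by the term $H(T \mid X)$. Dropping it removes the reward for noisy encoders, which is consistent with the "deterministic" in the name.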

