no code implementations • 20 Feb 2025 • Deepak Nathani, Lovish Madaan, Nicholas Roberts, Nikolay Bashlykov, Ajay Menon, Vincent Moens, Amar Budhiraja, Despoina Magka, Vladislav Vorotilov, Gaurav Chaurasia, Dieuwke Hupkes, Ricardo Silveira Cabral, Tatiana Shavrina, Jakob Foerster, Yoram Bachrach, William Yang Wang, Roberta Raileanu
Our MLGym framework makes it easy to add new tasks, integrate and evaluate models or agents, generate synthetic data at scale, as well as develop new learning algorithms for training agents on AI research tasks.
no code implementations • 11 Feb 2025 • Anna C. M. Thöni, William E. Robinson, Yoram Bachrach, Wilhelm T. S. Huck, Tal Kachman
In chemical reaction network theory, ordinary differential equations are used to model the temporal change of chemical species concentration.
no code implementations • 31 Oct 2024 • Marc Lanctot, Kate Larson, Michael Kaisers, Quentin Berthet, Ian Gemp, Manfred Diaz, Roberto-Rafael Maura-Rivero, Yoram Bachrach, Anna Koop, Doina Precup
This optimal ranking is the maximum likelihood estimate when evaluation data (which we view as votes) are interpreted as noisy samples from a ground truth ranking, a solution to Condorcet's original voting system criteria.
1 code implementation • 24 Jan 2024 • Ian Gemp, Roma Patel, Yoram Bachrach, Marc Lanctot, Vibhavari Dasagi, Luke Marris, Georgios Piliouras, SiQi Liu, Karl Tuyls
Specifically, by modelling the players, strategies and payoffs in a "game" of dialogue, we create a binding from natural language interactions to the conventional symbolic logic of game theory.
1 code implementation • 5 Dec 2023 • Marc Lanctot, Kate Larson, Yoram Bachrach, Luke Marris, Zun Li, Avishkar Bhoopchand, Thomas Anthony, Brian Tanner, Anna Koop
We argue that many general evaluation problems can be viewed through the lens of voting theory.
no code implementations • 17 Nov 2023 • Mauricio Diaz-Ortiz Jr, Benjamin Kempinski, Daphne Cornelisse, Yoram Bachrach, Tal Kachman
We show how solution concepts from cooperative game theory can be used to tackle the problem of pruning neural networks.
no code implementations • 16 Oct 2023 • Zhe Wang, Petar Veličković, Daniel Hennes, Nenad Tomašev, Laurel Prince, Michael Kaisers, Yoram Bachrach, Romuald Elie, Li Kevin Wenliang, Federico Piccinini, William Spearman, Ian Graham, Jerome Connor, Yi Yang, Adrià Recasens, Mina Khan, Nathalie Beauguerlange, Pablo Sprechmann, Pol Moreno, Nicolas Heess, Michael Bowling, Demis Hassabis, Karl Tuyls
The utility of TacticAI is validated by a qualitative study conducted with football domain experts at Liverpool FC.
1 code implementation • 25 May 2023 • Stefan Hödl, William Robinson, Yoram Bachrach, Wilhelm Huck, Tal Kachman
Explainability techniques are crucial in gaining insights into the reasons behind the predictions of deep learning models, but they have not yet been applied to chemical language models.
1 code implementation • NeurIPS 2023 • Marco Jiralerspong, Avishek Joey Bose, Ian Gemp, Chongli Qin, Yoram Bachrach, Gauthier Gidel
The past few years have seen impressive progress in the development of deep generative models capable of producing high-dimensional, complex, and photo-realistic data.
no code implementations • 1 Feb 2023 • Zun Li, Marc Lanctot, Kevin R. McKee, Luke Marris, Ian Gemp, Daniel Hennes, Paul Muller, Kate Larson, Yoram Bachrach, Michael P. Wellman
Multiagent reinforcement learning (MARL) has benefited significantly from population-based and game-theoretic training regimes.
no code implementations • 22 Sep 2022 • Ian Gemp, Thomas Anthony, Yoram Bachrach, Avishkar Bhoopchand, Kalesha Bullard, Jerome Connor, Vibhavari Dasagi, Bart De Vylder, Edgar Duenez-Guzman, Romuald Elie, Richard Everett, Daniel Hennes, Edward Hughes, Mina Khan, Marc Lanctot, Kate Larson, Guy Lever, SiQi Liu, Luke Marris, Kevin R. McKee, Paul Muller, Julien Perolat, Florian Strub, Andrea Tacchetti, Eugene Tarassov, Zhe Wang, Karl Tuyls
The Game Theory & Multi-Agent team at DeepMind studies several aspects of multi-agent learning ranging from computing approximations to fundamental concepts in game theory to simulating social dilemmas in rich spatial environments and training 3-d humanoids in difficult team coordination tasks.
no code implementations • 18 Aug 2022 • Daphne Cornelisse, Thomas Rood, Mateusz Malinowski, Yoram Bachrach, Tal Kachman
Cooperative game theory offers solution concepts identifying distribution schemes, such as the Shapley value, which fairly reflects the contribution of individuals to the performance of the team, or the Core, which reduces the incentive of agents to abandon their team.
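As a rough illustration of the Shapley value mentioned in this excerpt, the sketch below averages each player's marginal contribution over all orders in which the team could be assembled. The function name `shapley_values` and the three-player game `toy_value` are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, value):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: Fraction(0) for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += Fraction(value(with_p)) - Fraction(value(coalition))
            coalition = with_p
    return {p: total / len(orders) for p, total in totals.items()}


def toy_value(coalition):
    """Hypothetical game: a team is worth 1 only if it has "a" plus "b" or "c"."""
    return 1 if "a" in coalition and ("b" in coalition or "c" in coalition) else 0


phi = shapley_values(["a", "b", "c"], toy_value)
print(phi)  # "a" receives 2/3; "b" and "c" receive 1/6 each
```

Exact enumeration visits all n! orders, so it is only practical for a handful of players; larger games are usually handled by sampling orders instead.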
no code implementations • 29 Jul 2022 • Elise van der Pol, Ian Gemp, Yoram Bachrach, Richard Everett
A core step of spectral clustering is performing an eigendecomposition of the corresponding graph Laplacian matrix (or equivalently, a singular value decomposition, SVD, of the incidence matrix).
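The equivalence noted in this excerpt can be checked on a small example. For an unweighted graph with oriented incidence matrix B, the Laplacian satisfies L = B Bᵀ, so if B = U Σ Vᵀ then L = U Σ² Uᵀ: the left singular vectors of B are eigenvectors of L, with eigenvalues equal to the squared singular values. The sketch below only verifies the identity L = B Bᵀ on a hypothetical triangle graph, and assumes the unweighted, oriented-incidence convention, which may differ from the one used in the paper.

```python
# Hypothetical toy graph: a triangle on nodes 0, 1, 2.
edges = [(0, 1), (1, 2), (0, 2)]
n = 3
m = len(edges)

# Oriented incidence matrix B (n x m): +1 at an edge's first endpoint,
# -1 at its second. The choice of orientation does not affect B B^T.
B = [[0] * m for _ in range(n)]
for j, (u, v) in enumerate(edges):
    B[u][j] = 1
    B[v][j] = -1

# Laplacian L = D - A, built directly from degrees and adjacency.
L = [[0] * n for _ in range(n)]
for u, v in edges:
    L[u][u] += 1
    L[v][v] += 1
    L[u][v] -= 1
    L[v][u] -= 1

# B B^T, computed entry by entry.
BBt = [[sum(B[i][k] * B[j][k] for k in range(m)) for j in range(n)]
       for i in range(n)]

matches = BBt == L
print(matches)
```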
1 code implementation • 13 Dec 2021 • Elizabeth Bondi, Raphael Koster, Hannah Sheahan, Martin Chadwick, Yoram Bachrach, Taylan Cemgil, Ulrich Paquet, Krishnamurthy Dvijotham
Using real-world conservation data and a selective prediction system that improves expected accuracy over that of the human or AI system working individually, we show that this messaging has a significant impact on the accuracy of human judgements.
no code implementations • 21 Oct 2021 • Edgar A. Duéñez-Guzmán, Kevin R. McKee, Yiran Mao, Ben Coppin, Silvia Chiappa, Alexander Sasha Vezhnevets, Michiel A. Bakker, Yoram Bachrach, Suzanne Sadedin, William Isaac, Karl Tuyls, Joel Z. Leibo
Undesired bias afflicts both human and algorithmic decision making, and may be especially prevalent when information processing trade-offs incentivize the use of heuristics.
no code implementations • NAACL 2021 • Roma Patel, Marta Garnelo, Ian Gemp, Chris Dyer, Yoram Bachrach
We propose a vocabulary selection method that views words as members of a team trying to maximize the model's performance.
no code implementations • 15 Dec 2020 • Allan Dafoe, Edward Hughes, Yoram Bachrach, Tantum Collins, Kevin R. McKee, Joel Z. Leibo, Kate Larson, Thore Graepel
We see opportunity to more explicitly focus on the problem of cooperation, to construct unified theory and vocabulary, and to build bridges with adjacent communities working on cooperation, including in the natural, social, and behavioural sciences.
no code implementations • ICLR 2019 • Yoram Bachrach, Richard Everett, Edward Hughes, Angeliki Lazaridou, Joel Z. Leibo, Marc Lanctot, Michael Johanson, Wojciech M. Czarnecki, Thore Graepel
When autonomous agents interact in the same environment, they must often cooperate to achieve their goals.
2 code implementations • NeurIPS 2020 • Thomas Anthony, Tom Eccles, Andrea Tacchetti, János Kramár, Ian Gemp, Thomas C. Hudson, Nicolas Porcel, Marc Lanctot, Julien Pérolat, Richard Everett, Roman Werpachowski, Satinder Singh, Thore Graepel, Yoram Bachrach
It also features a large combinatorial action space and simultaneous moves, which are challenging for RL algorithms.
no code implementations • 27 Feb 2020 • Edward Hughes, Thomas W. Anthony, Tom Eccles, Joel Z. Leibo, David Balduzzi, Yoram Bachrach
Here we argue that a systematic study of many-player zero-sum games is a crucial element of artificial intelligence research.
Multi-agent Reinforcement Learning
reinforcement-learning
no code implementations • 14 Feb 2020 • Gauthier Gidel, David Balduzzi, Wojciech Marian Czarnecki, Marta Garnelo, Yoram Bachrach
Adversarial training, a special case of multi-objective optimization, is an increasingly prevalent machine learning technique: some of its most notable applications include GAN-based generative modeling and self-play techniques in reinforcement learning which have been applied to complex games such as Go or Poker.
no code implementations • NeurIPS 2019 • Tom Eccles, Yoram Bachrach, Guy Lever, Angeliki Lazaridou, Thore Graepel
We study the problem of emergent communication, in which language arises because speakers and listeners must communicate information in order to solve tasks.
Multi-agent Reinforcement Learning
reinforcement-learning
no code implementations • 25 Sep 2019 • Thomas Anthony, Ian Gemp, Janos Kramar, Tom Eccles, Andrea Tacchetti, Yoram Bachrach
In contrast to auctions designed manually by economists, our method searches the possible design space using a simulation of the multi-agent learning process, and can thus handle settings where a game-theoretic equilibrium analysis is not tractable.
no code implementations • 25 Sep 2019 • Yoram Bachrach, Tor Lattimore, Marta Garnelo, Julien Perolat, David Balduzzi, Thomas Anthony, Satinder Singh, Thore Graepel
We show that MARL converges to the desired outcome if the rewards are designed so that exerting effort is the iterated dominance solution, but fails if it is merely a Nash equilibrium.
no code implementations • 11 Jul 2019 • Andrea Tacchetti, DJ Strouse, Marta Garnelo, Thore Graepel, Yoram Bachrach
From social networks to supply chains, more and more aspects of how humans, firms and organizations interact are mediated by artificial learning agents.
no code implementations • 23 Jan 2019 • David Balduzzi, Marta Garnelo, Yoram Bachrach, Wojciech M. Czarnecki, Julien Perolat, Max Jaderberg, Thore Graepel
Zero-sum games such as chess and poker are, abstractly, functions that evaluate pairs of agents, for example labeling them 'winner' and 'loser'.
1 code implementation • ACL 2018 • Andrej Žukov-Gregorič, Yoram Bachrach, Sam Coope
We present a new architecture for named entity recognition.
no code implementations • ICLR 2018 • Sam Coope, Andrej Zukov-Gregoric, Yoram Bachrach
We propose a neural clustering model that jointly learns both latent features and how they cluster.
no code implementations • 5 Jul 2017 • Yoram Bachrach, Andrej Zukov-Gregoric, Sam Coope, Ed Tovell, Bogdan Maksak, Jose Rodriguez, Conan McMurtie
We propose a new attention mechanism for neural-based question answering, which depends on varying granularities of the input.
no code implementations • 10 Feb 2017 • Kirthevasan Kandasamy, Yoram Bachrach, Ryota Tomioka, Daniel Tarlow, David Carter
We study reinforcement learning of chatbots with recurrent neural network architectures when the rewards are noisy and expensive to obtain.
no code implementations • 29 May 2016 • Yoad Lewenberg, Yoram Bachrach, Sukrit Shankar, Antonio Criminisi
We consider the task of predicting various traits of a person given an image of their face.