no code implementations • ICML 2020 • Sai Ganesh Nagarajan, David Balduzzi, Georgios Piliouras
Even in simple games, learning dynamics can yield chaotic behavior.
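A minimal illustration of the claim (a toy simulation, not the paper's analysis): multiplicative-weights learning in Matching Pennies spirals away from the unique Nash equilibrium instead of converging.

```python
import numpy as np

# Matching Pennies: the row player's payoffs; the column player gets the negative.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

def mwu(p, payoff, eta=0.1):
    """One multiplicative-weights update of a mixed strategy p."""
    w = p * np.exp(eta * payoff)
    return w / w.sum()

p, q = np.array([0.6, 0.4]), np.array([0.4, 0.6])
for t in range(2001):
    if t % 500 == 0:
        r = ((p[0] - 0.5) ** 2 + (q[0] - 0.5) ** 2) ** 0.5
        print(t, round(r, 4))               # distance from the Nash equilibrium grows
    p, q = mwu(p, A @ q), mwu(q, -A.T @ p)  # simultaneous updates (RHS uses old p, q)
# The joint strategy spirals outward toward the boundary of the simplex:
# simple learning dynamics in a trivially simple game never settle down.
```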
no code implementations • 8 Oct 2021 • Marta Garnelo, Wojciech Marian Czarnecki, SiQi Liu, Dhruva Tirumala, Junhyuk Oh, Gauthier Gidel, Hado van Hasselt, David Balduzzi
Strategic diversity is often essential in games: in multi-player games, for example, evaluating a player against a diverse set of strategies will yield a more accurate estimate of its performance.
1 code implementation • NeurIPS 2020 • Wojciech Marian Czarnecki, Gauthier Gidel, Brendan Tracey, Karl Tuyls, Shayegan Omidshafiei, David Balduzzi, Max Jaderberg
This paper investigates the geometrical properties of real-world games (e.g. Tic-Tac-Toe, Go, StarCraft II).
no code implementations • 27 Feb 2020 • Edward Hughes, Thomas W. Anthony, Tom Eccles, Joel Z. Leibo, David Balduzzi, Yoram Bachrach
Here we argue that a systematic study of many-player zero-sum games is a crucial element of artificial intelligence research.
Multi-agent Reinforcement Learning • Reinforcement Learning
no code implementations • 19 Feb 2020 • Julien Perolat, Remi Munos, Jean-Baptiste Lespiau, Shayegan Omidshafiei, Mark Rowland, Pedro Ortega, Neil Burch, Thomas Anthony, David Balduzzi, Bart De Vylder, Georgios Piliouras, Marc Lanctot, Karl Tuyls
In this paper we investigate the Follow the Regularized Leader dynamics in sequential imperfect information games (IIG).
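As a concrete reference point (a normal-form sketch, not the paper's sequential imperfect-information setting): FTRL with an entropic regularizer has a closed form, playing the softmax of each player's cumulative payoffs.

```python
import numpy as np

def ftrl_entropic(cum_payoff, eta=0.1):
    """FTRL with entropic regularizer: argmax_p <p, G> - (1/eta) * sum_i p_i log p_i,
    whose closed form is a softmax over the cumulative payoff vector G."""
    z = eta * cum_payoff
    w = np.exp(z - z.max())                 # subtract the max for numerical stability
    return w / w.sum()

A = np.array([[1.0, -1.0], [-1.0, 1.0]])    # zero-sum payoffs for player 1
G1, G2 = np.zeros(2), np.zeros(2)
avg_p = np.zeros(2)
for t in range(1, 5001):
    p, q = ftrl_entropic(G1), ftrl_entropic(G2)
    G1, G2 = G1 + A @ q, G2 - A.T @ p       # accumulate each player's payoff vector
    avg_p += (p - avg_p) / t
print(avg_p)  # the iterates cycle, but the time-average approaches Nash (0.5, 0.5)
```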
no code implementations • 14 Feb 2020 • Gauthier Gidel, David Balduzzi, Wojciech Marian Czarnecki, Marta Garnelo, Yoram Bachrach
Adversarial training, a special case of multi-objective optimization, is an increasingly prevalent machine learning technique: some of its most notable applications include GAN-based generative modeling and self-play techniques in reinforcement learning, which have been applied to complex games such as Go or Poker.
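A two-line toy example of why these dynamics are delicate (illustrative, not drawn from the paper): simultaneous gradient descent-ascent on the bilinear problem min_x max_y xy spirals away from the saddle point at the origin.

```python
# Simultaneous gradient descent-ascent on f(x, y) = x * y.
# The unique saddle point is (0, 0), yet the iterates spiral outward:
# each step multiplies the radius by sqrt(1 + lr^2) > 1.
x, y, lr = 1.0, 1.0, 0.1
for t in range(101):
    if t % 25 == 0:
        print(t, round((x ** 2 + y ** 2) ** 0.5, 3))  # radius grows monotonically
    x, y = x - lr * y, y + lr * x   # x descends on f, y ascends, simultaneously
```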
no code implementations • ICLR 2020 • David Balduzzi, Wojciech M. Czarnecki, Thomas W. Anthony, Ian M Gemp, Edward Hughes, Joel Z. Leibo, Georgios Piliouras, Thore Graepel
With the success of modern machine learning, it is becoming increasingly important to understand and control how learning algorithms interact.
1 code implementation • 2 Dec 2019 • Yan Wu, Jeff Donahue, David Balduzzi, Karen Simonyan, Timothy Lillicrap
Training generative adversarial networks requires balancing of delicate adversarial dynamics.
no code implementations • 25 Sep 2019 • Yoram Bachrach, Tor Lattimore, Marta Garnelo, Julien Perolat, David Balduzzi, Thomas Anthony, Satinder Singh, Thore Graepel
We show that MARL converges to the desired outcome if the rewards are designed so that exerting effort is the iterated dominance solution, but fails if it is merely a Nash equilibrium.
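The distinction is easy to see in code (hypothetical payoffs, not the paper's environment): iterated elimination of strictly dominated strategies singles out a unique outcome, whereas a mere Nash equilibrium offers no such guarantee to independent learners.

```python
import numpy as np

def iterated_dominance(A, B):
    """Iteratively eliminate strictly dominated pure strategies in a bimatrix
    game (A: row player's payoffs, B: column player's). Returns survivors."""
    rows, cols = list(range(A.shape[0])), list(range(A.shape[1]))
    changed = True
    while changed:
        changed = False
        for i in list(rows):   # row i is dominated if some row j beats it everywhere
            if any(all(A[j, c] > A[i, c] for c in cols) for j in rows if j != i):
                rows.remove(i); changed = True
        for c in list(cols):   # the same test for columns, using B
            if any(all(B[r, d] > B[r, c] for r in rows) for d in cols if d != c):
                cols.remove(c); changed = True
    return rows, cols

# Hypothetical effort game: action 0 = shirk, action 1 = work.
A = np.array([[1, 0],
              [3, 2]])   # for the row player, working strictly dominates shirking
B = A.T                  # symmetric payoffs for the column player
print(iterated_dominance(A, B))   # ([1], [1]): 'work' survives for both players
```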
1 code implementation • 13 May 2019 • Alistair Letcher, David Balduzzi, Sebastien Racaniere, James Martens, Jakob Foerster, Karl Tuyls, Thore Graepel
The decomposition motivates Symplectic Gradient Adjustment (SGA), a new algorithm for finding stable fixed points in differentiable games.
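A minimal sketch of the SGA update on a toy game (the Jacobian is hand-coded here for the losses f1 = xy and f2 = -xy; a general implementation would obtain it by automatic differentiation):

```python
import numpy as np

def sga_step(v, lam=1.0, lr=0.05):
    """One SGA step on the game f1(x, y) = x*y (player 1 minimises over x),
    f2(x, y) = -x*y (player 2 minimises over y)."""
    x, y = v
    xi = np.array([y, -x])          # simultaneous gradient: each player's own gradient
    H = np.array([[0.0, 1.0],
                  [-1.0, 0.0]])     # Jacobian of xi, hand-coded for this game
    A = 0.5 * (H - H.T)             # antisymmetric part of the Jacobian
    return v - lr * (xi + lam * A.T @ xi)   # descend the adjusted gradient

v = np.array([1.0, 1.0])
for _ in range(200):
    v = sga_step(v)
print(v)   # heads to the stable fixed point (0, 0); the unadjusted
           # simultaneous gradient would merely rotate around it
```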
no code implementations • 23 Jan 2019 • David Balduzzi, Marta Garnelo, Yoram Bachrach, Wojciech M. Czarnecki, Julien Perolat, Max Jaderberg, Thore Graepel
Zero-sum games such as chess and poker are, abstractly, functions that evaluate pairs of agents, for example labeling them 'winner' and 'loser'.
no code implementations • ICLR 2019 • Alistair Letcher, Jakob Foerster, David Balduzzi, Tim Rocktäschel, Shimon Whiteson
A growing number of learning methods are actually differentiable games whose players optimise multiple, interdependent objectives in parallel -- from GANs and intrinsic curiosity to multi-agent RL.
2 code implementations • NeurIPS 2018 • David Balduzzi, Karl Tuyls, Julien Perolat, Thore Graepel
Progress in machine learning is measured by careful evaluation on problems of outstanding common interest.
1 code implementation • ICML 2018 • David Balduzzi, Sebastien Racaniere, James Martens, Jakob Foerster, Karl Tuyls, Thore Graepel
The first is related to potential games, which reduce to gradient descent on an implicit function; the second relates to Hamiltonian games, a new class of games that obey a conservation law, akin to conservation laws in classical mechanical systems.
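The conservation law can be checked numerically on the simplest Hamiltonian game, the bilinear losses f1 = xy and f2 = -xy (a toy example, not taken from the paper): simultaneous gradient flow conserves H = (x^2 + y^2)/2.

```python
# Simultaneous gradient flow for f1 = x*y and f2 = -x*y is
# dx/dt = -y, dy/dt = x: a pure rotation that conserves H = (x^2 + y^2) / 2.
x, y, dt = 1.0, 0.0, 1e-3
H0 = 0.5 * (x ** 2 + y ** 2)
for _ in range(10_000):
    x, y = x - dt * y, y + dt * x   # small forward-Euler steps along the flow
print(H0, 0.5 * (x ** 2 + y ** 2))  # nearly equal; the tiny drift is pure
                                    # discretisation error -- the flow conserves H
```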
1 code implementation • ICML 2017 • David Balduzzi, Marcus Frean, Lennox Leary, JP Lewis, Kurt Wan-Duo Ma, Brian McWilliams
A long-standing obstacle to progress in deep learning is the problem of vanishing and exploding gradients.
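The vanishing half of the problem is easy to reproduce (a toy deep tanh network with illustrative sizes, unrelated to the paper's experiments): backpropagated gradient norms shrink exponentially with depth.

```python
import numpy as np

rng = np.random.default_rng(0)
depth, width = 50, 100

# Forward pass through a deep tanh network with under-scaled random weights.
x = rng.standard_normal(width)
Ws, acts = [], [x]
for _ in range(depth):
    W = 0.5 * rng.standard_normal((width, width)) / np.sqrt(width)
    Ws.append(W)
    acts.append(np.tanh(W @ acts[-1]))

# Backward pass: push a unit gradient back through every layer.
g = np.ones(width)
for W, a_prev in zip(reversed(Ws), reversed(acts[:-1])):
    g = W.T @ (g * (1.0 - np.tanh(W @ a_prev) ** 2))  # chain rule through tanh(Wa)
print(np.linalg.norm(g))   # ~1e-14: each layer shrinks the gradient by roughly half
```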
no code implementations • ICML 2017 • David Balduzzi
As artificial agents proliferate, it is becoming increasingly important to ensure that their interactions with one another are well-behaved.
no code implementations • ICML 2017 • David Balduzzi, Brian McWilliams, Tony Butler-Yeoman
Modern convolutional networks, incorporating rectifiers and max-pooling, are neither smooth nor convex; standard guarantees therefore do not apply.
2 code implementations • 12 Jul 2016 • Muhammad Ghifary, W. Bastiaan Kleijn, Mengjie Zhang, David Balduzzi, Wen Li
In this paper, we propose a novel unsupervised domain adaptation algorithm based on deep learning for visual object recognition.
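A minimal sketch of the reconstruction-classification idea in PyTorch (hypothetical layer sizes and loss weight, not the paper's exact architecture): a shared encoder feeds a classifier trained on labelled source data and a decoder that reconstructs unlabelled target data.

```python
import torch.nn as nn

class ReconstructionClassificationNet(nn.Module):
    def __init__(self, in_dim=784, hid=256, n_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hid), nn.ReLU())
        self.classifier = nn.Linear(hid, n_classes)   # supervised head (source)
        self.decoder = nn.Linear(hid, in_dim)         # unsupervised head (target)

    def forward(self, x):
        z = self.encoder(x)
        return self.classifier(z), self.decoder(z)

model = ReconstructionClassificationNet()
ce, mse = nn.CrossEntropyLoss(), nn.MSELoss()

def joint_loss(x_src, y_src, x_tgt, lam=0.7):
    """Weighted sum of source classification loss and target reconstruction loss."""
    logits, _ = model(x_src)
    _, recon = model(x_tgt)
    return lam * ce(logits, y_src) + (1.0 - lam) * mse(recon, x_tgt)
```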
no code implementations • 7 Apr 2016 • David Balduzzi
Corollaries of the main result include: (i) a game-theoretic description of the representations learned by a neural network; (ii) a logarithmic-regret algorithm for training neural nets; and (iii) a formal setting for analyzing conditional computation in neural nets that can be applied to recently developed models of attention.
no code implementations • 9 Feb 2016 • Nicolás Della Penna, Mark D. Reid, David Balduzzi
Motivated by clinical trials, we study bandits with observable non-compliance.
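A minimal sketch of the setting (illustrative only; neither the compliance model nor the update rule is the paper's algorithm): the learner prescribes an arm, the patient may take a different one, and both the arm actually taken and the reward are observed.

```python
import numpy as np

rng = np.random.default_rng(1)
true_means = [0.3, 0.6]                  # hypothetical treatment success rates
counts, values = np.ones(2), np.zeros(2)

def patient(prescribed):
    """Hypothetical compliance model: the prescription is followed 70% of the time."""
    return prescribed if rng.random() < 0.7 else 1 - prescribed

for t in range(5000):
    prescribed = int(rng.integers(2)) if rng.random() < 0.1 else int(values.argmax())
    taken = patient(prescribed)          # non-compliance is observable
    reward = float(rng.random() < true_means[taken])
    counts[taken] += 1                   # credit the arm actually taken,
    values[taken] += (reward - values[taken]) / counts[taken]   # not the prescribed one
print(values)   # running estimates of each treatment's mean reward
```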
no code implementations • 6 Feb 2016 • David Balduzzi, Muhammad Ghifary
This paper imports ideas from physics and functional programming into RNN design to provide guiding principles.
no code implementations • 15 Oct 2015 • Muhammad Ghifary, David Balduzzi, W. Bastiaan Kleijn, Mengjie Zhang
We propose Scatter Component Analysis (SCA), a fast representation learning algorithm that can be applied to both domain adaptation and domain generalization (a minimal sketch of the scatter computation follows below).
Ranked #7 on Domain Adaptation on Office-Caltech
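A minimal sketch of the central quantity (the full algorithm solves a generalized eigenproblem, omitted here): the "scatter" of a sample is the trace of its covariance, i.e. the mean squared distance of the points from their centroid.

```python
import numpy as np

def scatter(X):
    """Scatter of a sample (rows of X are points): trace of the covariance,
    i.e. the mean squared distance of the points from their centroid."""
    Z = X - X.mean(axis=0)
    return float(np.mean(np.sum(Z ** 2, axis=1)))

# SCA trades these quantities off: keep between-class scatter high while
# shrinking within-class scatter and the scatter of the domain means.
rng = np.random.default_rng(0)
domain_a = rng.standard_normal((100, 5))          # toy samples from two domains
domain_b = rng.standard_normal((100, 5)) + 2.0
domain_means = np.vstack([domain_a.mean(axis=0), domain_b.mean(axis=0)])
print(scatter(domain_a), scatter(domain_means))   # within-domain vs domain scatter
```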
no code implementations • 29 Sep 2015 • David Balduzzi
Deep learning is currently the subject of intensive study.
no code implementations • 10 Sep 2015 • David Balduzzi, Muhammad Ghifary
Firstly, we present a temporal-difference based method for learning the gradient of the value-function.
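For orientation, here is plain tabular TD(0) for the value function on a toy chain (the paper's method learns the value gradient with an analogous bootstrapped target; this sketch shows only the vanilla case):

```python
import numpy as np

n_states, gamma, alpha = 5, 0.9, 0.1
V = np.zeros(n_states)                  # tabular value estimates; state 4 is terminal
rng = np.random.default_rng(0)

for episode in range(2000):
    s = 0
    while s < n_states - 1:
        s_next = s + 1 if rng.random() < 0.9 else s        # drift rightward
        r = 1.0 if s_next == n_states - 1 else 0.0         # reward on reaching the goal
        V[s] += alpha * (r + gamma * V[s_next] - V[s])     # TD(0): bootstrap from V(s')
        s = s_next
print(V)   # values increase toward the goal state, as expected
```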
no code implementations • 6 Sep 2015 • David Balduzzi
The main result is that error backpropagation on a convolutional network is equivalent to playing out a circadian game.
3 code implementations • ICCV 2015 • Muhammad Ghifary, W. Bastiaan Kleijn, Mengjie Zhang, David Balduzzi
The problem of domain generalization is to take knowledge acquired from a number of related domains where training data is available, and to then successfully apply it to previously unseen domains.
no code implementations • 23 Nov 2014 • David Balduzzi, Hastagiri Vanchinathan, Joachim Buhmann
Using these results, we derive a new credit assignment algorithm for nonparametric regression, Kickback, that is significantly simpler than Backprop.
no code implementations • 28 Aug 2014 • David Balduzzi
The paper demonstrates that falsifiability is fundamental to learning.
no code implementations • 7 Jan 2014 • David Balduzzi
We investigate cortical learning from the perspective of mechanism design.
no code implementations • 24 Oct 2013 • David Balduzzi
Despite its size and complexity, the human cortex exhibits striking anatomical regularities, suggesting there may be simple meta-algorithms underlying cortical learning and computation.
1 code implementation • Proceedings of Machine Learning Research 2013 • Krikamol Muandet, David Balduzzi, Bernhard Schölkopf
This paper investigates domain generalization: How to take knowledge acquired from an arbitrary number of related domains and apply it to previously unseen domains?
no code implementations • NeurIPS 2013 • Brian McWilliams, David Balduzzi, Joachim M. Buhmann
Random views are justified by recent theoretical and empirical work showing that regression with random features closely approximates kernel regression, implying that random views can be expected to contain accurate estimators.
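The approximation is easy to verify directly with random Fourier features, a standard construction for the RBF kernel (the sizes here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, D, gamma = 200, 5, 2000, 0.5      # D random features approximating an RBF kernel
X = rng.standard_normal((n, d))

# Random Fourier features for k(x, y) = exp(-gamma * ||x - y||^2):
# draw w ~ N(0, 2*gamma*I) and b ~ U[0, 2*pi], map z(x) = sqrt(2/D) * cos(Wx + b).
W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(D, d))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)
Z = np.sqrt(2.0 / D) * np.cos(X @ W.T + b)

K_exact = np.exp(-gamma * np.sum((X[:, None] - X[None]) ** 2, axis=-1))
K_approx = Z @ Z.T
print(np.abs(K_exact - K_approx).max())   # small: the random view ~ kernel regression
```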
no code implementations • NeurIPS 2012 • David Balduzzi, Michel Besserve
This paper suggests a learning-theoretic perspective on how synaptic plasticity benefits global brain functioning.
no code implementations • NeurIPS 2012 • Pedro Ortega, Jordi Grau-Moya, Tim Genewein, David Balduzzi, Daniel Braun
We propose a novel Bayesian approach to solve stochastic optimization problems that involve finding extrema of noisy, nonlinear functions.
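A minimal Thompson-sampling sketch of the same task (not the paper's nonparametric conjugate prior; the Gaussian posteriors and the discretised domain are simplifying assumptions): sample a value for each candidate point from its posterior and evaluate where the sample is largest.

```python
import numpy as np

rng = np.random.default_rng(0)
xs = np.linspace(-2.0, 2.0, 41)              # discretised search domain
f = lambda x: 1.0 - (x - 0.7) ** 2           # true underlying function, maximised at 0.7
mu, n_obs = np.zeros_like(xs), np.zeros_like(xs)

for t in range(2000):
    theta = rng.normal(mu, 1.0 / np.sqrt(n_obs + 1.0))   # posterior sample per point
    i = int(theta.argmax())                              # evaluate the sampled maximiser
    y = f(xs[i]) + 0.3 * rng.standard_normal()           # noisy function evaluation
    n_obs[i] += 1
    mu[i] += (y - mu[i]) / n_obs[i]                      # running posterior mean
print(xs[mu.argmax()])   # concentrates near the true maximiser x = 0.7
```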
no code implementations • 29 Mar 2012 • Dominik Janzing, David Balduzzi, Moritz Grosse-Wentrup, Bernhard Schölkopf
Here we propose a set of natural, intuitive postulates that a measure of causal strength should satisfy.
Statistics Theory
no code implementations • 3 May 2011 • Manuel Gomez Rodriguez, David Balduzzi, Bernhard Schölkopf
Time plays an essential role in the diffusion of information, influence and disease over networks.
no code implementations • 1 May 2011 • David Balduzzi
Many natural processes occur over characteristic spatial and temporal scales.
Information Theory • Cellular Automata and Lattice Gases • Neurons and Cognition