no code implementations • ICML 2020 • Sai Ganesh Nagarajan, David Balduzzi, Georgios Piliouras

Even in simple games, learning dynamics can yield chaotic behavior.

no code implementations • 8 Oct 2021 • Marta Garnelo, Wojciech Marian Czarnecki, SiQi Liu, Dhruva Tirumala, Junhyuk Oh, Gauthier Gidel, Hado van Hasselt, David Balduzzi

Strategic diversity is often essential in games: in multi-player games, for example, evaluating a player against a diverse set of strategies will yield a more accurate estimate of its performance.

1 code implementation • NeurIPS 2020 • Wojciech Marian Czarnecki, Gauthier Gidel, Brendan Tracey, Karl Tuyls, Shayegan Omidshafiei, David Balduzzi, Max Jaderberg

This paper investigates the geometrical properties of real-world games (e.g. Tic-Tac-Toe, Go, StarCraft II).

no code implementations • 27 Feb 2020 • Edward Hughes, Thomas W. Anthony, Tom Eccles, Joel Z. Leibo, David Balduzzi, Yoram Bachrach

Here we argue that a systematic study of many-player zero-sum games is a crucial element of artificial intelligence research.

Multi-agent Reinforcement Learning
Reinforcement Learning

no code implementations • 19 Feb 2020 • Julien Perolat, Remi Munos, Jean-Baptiste Lespiau, Shayegan Omidshafiei, Mark Rowland, Pedro Ortega, Neil Burch, Thomas Anthony, David Balduzzi, Bart De Vylder, Georgios Piliouras, Marc Lanctot, Karl Tuyls

In this paper we investigate the Follow the Regularized Leader dynamics in sequential imperfect information games (IIG).

no code implementations • 14 Feb 2020 • Gauthier Gidel, David Balduzzi, Wojciech Marian Czarnecki, Marta Garnelo, Yoram Bachrach

Adversarial training, a special case of multi-objective optimization, is an increasingly prevalent machine learning technique: some of its most notable applications include GAN-based generative modeling and self-play techniques in reinforcement learning which have been applied to complex games such as Go or Poker.

no code implementations • ICLR 2020 • David Balduzzi, Wojciech M. Czarnecki, Thomas W. Anthony, Ian M Gemp, Edward Hughes, Joel Z. Leibo, Georgios Piliouras, Thore Graepel

With the success of modern machine learning, it is becoming increasingly important to understand and control how learning algorithms interact.

1 code implementation • 2 Dec 2019 • Yan Wu, Jeff Donahue, David Balduzzi, Karen Simonyan, Timothy Lillicrap

Training generative adversarial networks requires balancing of delicate adversarial dynamics.

no code implementations • 25 Sep 2019 • Yoram Bachrach, Tor Lattimore, Marta Garnelo, Julien Perolat, David Balduzzi, Thomas Anthony, Satinder Singh, Thore Graepel

We show that MARL converges to the desired outcome if the rewards are designed so that exerting effort is the iterated dominance solution, but fails if it is merely a Nash equilibrium.
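The distinction above hinges on iterated dominance: a strategy profile that survives iterated elimination of strictly dominated strategies is a much stronger solution concept than a mere Nash equilibrium. As a toy illustration (not the paper's MARL setup), the hypothetical helper below eliminates strictly dominated pure strategies from a 2x2 game where "exert effort" dominates "shirk" for both players:

```python
import numpy as np

def iterated_dominance(payoff_row, payoff_col):
    """Iteratively remove strictly dominated pure strategies (illustrative helper)."""
    rows = list(range(payoff_row.shape[0]))
    cols = list(range(payoff_row.shape[1]))
    changed = True
    while changed:
        changed = False
        # Remove any row strictly dominated by another surviving row.
        for i in list(rows):
            if any(all(payoff_row[j, c] > payoff_row[i, c] for c in cols)
                   for j in rows if j != i):
                rows.remove(i)
                changed = True
        # Remove any column strictly dominated by another surviving column.
        for c in list(cols):
            if any(all(payoff_col[r, d] > payoff_col[r, c] for r in rows)
                   for d in cols if d != c):
                cols.remove(c)
                changed = True
    return rows, cols

# "Exert effort" (index 0) strictly dominates "shirk" (index 1) for both players.
R = np.array([[3, 2], [1, 0]])   # row player's rewards
C = np.array([[3, 1], [2, 0]])   # column player's rewards
print(iterated_dominance(R, C))  # only (effort, effort) survives
```

When effort survives iterated elimination like this, learning dynamics are pushed toward it from any starting point; a Nash equilibrium without dominance offers no such guarantee.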

1 code implementation • 13 May 2019 • Alistair Letcher, David Balduzzi, Sebastien Racaniere, James Martens, Jakob Foerster, Karl Tuyls, Thore Graepel

The decomposition motivates Symplectic Gradient Adjustment (SGA), a new algorithm for finding stable fixed points in differentiable games.
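SGA adjusts the simultaneous gradient using the antisymmetric part of the game Jacobian. Here is a minimal sketch on the bilinear min-max game min_x max_y xy, with a fixed adjustment weight lam (the paper selects lam via an alignment criterion; the constant here is a simplifying assumption):

```python
import numpy as np

def sga_step(v, lr=0.05, lam=1.0):
    x, y = v
    # Simultaneous gradient xi for min_x max_y x*y:
    # player x descends d(xy)/dx = y, player y descends d(-xy)/dy = -x.
    xi = np.array([y, -x])
    # The Jacobian of xi is [[0, 1], [-1, 0]]; here its antisymmetric part A
    # equals the full Jacobian, since the symmetric part vanishes.
    A = np.array([[0.0, 1.0], [-1.0, 0.0]])
    # Symplectic Gradient Adjustment: follow xi + lam * A^T xi.
    adjusted = xi + lam * (A.T @ xi)
    return v - lr * adjusted

v = np.array([1.0, 1.0])
for _ in range(200):
    v = sga_step(v)
print(np.linalg.norm(v))  # contracts toward the fixed point (0, 0)
```

Plain simultaneous gradient descent spirals outward on this game; the adjustment term lam * A^T xi points toward the fixed point, which is what makes SGA converge where naive descent cycles.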

no code implementations • 23 Jan 2019 • David Balduzzi, Marta Garnelo, Yoram Bachrach, Wojciech M. Czarnecki, Julien Perolat, Max Jaderberg, Thore Graepel

Zero-sum games such as chess and poker are, abstractly, functions that evaluate pairs of agents, for example labeling them 'winner' and 'loser'.

no code implementations • ICLR 2019 • Alistair Letcher, Jakob Foerster, David Balduzzi, Tim Rocktäschel, Shimon Whiteson

A growing number of learning methods are actually differentiable games whose players optimise multiple, interdependent objectives in parallel -- from GANs and intrinsic curiosity to multi-agent RL.

2 code implementations • NeurIPS 2018 • David Balduzzi, Karl Tuyls, Julien Perolat, Thore Graepel

Progress in machine learning is measured by careful evaluation on problems of outstanding common interest.

1 code implementation • ICML 2018 • David Balduzzi, Sebastien Racaniere, James Martens, Jakob Foerster, Karl Tuyls, Thore Graepel

The first is related to potential games, which reduce to gradient descent on an implicit function; the second relates to Hamiltonian games, a new class of games that obey a conservation law, akin to conservation laws in classical mechanical systems.
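The split described above is a generalized Helmholtz decomposition of the Jacobian of the simultaneous gradient: the symmetric part corresponds to the potential component, the antisymmetric part to the Hamiltonian component. A minimal numerical sketch (illustrative, not code from the paper):

```python
import numpy as np

def decompose(J):
    """Split a game Jacobian into symmetric (potential) and
    antisymmetric (Hamiltonian) components."""
    S = 0.5 * (J + J.T)
    A = 0.5 * (J - J.T)
    return S, A

# For the bilinear zero-sum game min_x max_y x*y, the simultaneous gradient
# is xi(x, y) = (y, -x), whose Jacobian is purely antisymmetric: a
# Hamiltonian game, which conserves H = 0.5 * ||xi||^2 along its flow.
J_bilinear = np.array([[0.0, 1.0], [-1.0, 0.0]])
S, A = decompose(J_bilinear)
print(S)  # zero matrix: no potential component
print(A)  # equals J_bilinear: the game is purely Hamiltonian
```

A purely symmetric Jacobian instead means the dynamics reduce to gradient descent on an implicit potential function; general games mix both components, which is what makes their dynamics hard.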

1 code implementation • ICML 2017 • David Balduzzi, Marcus Frean, Lennox Leary, JP Lewis, Kurt Wan-Duo Ma, Brian McWilliams

A long-standing obstacle to progress in deep learning is the problem of vanishing and exploding gradients.
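The classic mechanism is easy to reproduce numerically: backpropagation multiplies the gradient by one Jacobian per layer, so when those factors are contractive the gradient norm decays exponentially with depth. A small illustration (a generic vanishing-gradient demo, not the paper's shattered-gradients analysis):

```python
import numpy as np

rng = np.random.default_rng(1)
n, depth = 64, 50
grad = np.ones(n)
norms = []
# Backprop through `depth` linear layers: each step multiplies the gradient
# by W^T. With weight scale well below 1/sqrt(n), every layer contracts the
# gradient, so its norm shrinks exponentially with depth.
for _ in range(depth):
    W = rng.normal(scale=0.5 / np.sqrt(n), size=(n, n))
    grad = W.T @ grad
    norms.append(np.linalg.norm(grad))
print(norms[0], norms[-1])  # the final norm is many orders of magnitude smaller
```

Doubling the weight scale flips the picture: the same product then grows exponentially, which is the exploding-gradient regime.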

no code implementations • ICML 2017 • David Balduzzi

As artificial agents proliferate, it is becoming increasingly important to ensure that their interactions with one another are well-behaved.

no code implementations • ICML 2017 • David Balduzzi, Brian McWilliams, Tony Butler-Yeoman

Modern convolutional networks, incorporating rectifiers and max-pooling, are neither smooth nor convex; standard guarantees therefore do not apply.

2 code implementations • 12 Jul 2016 • Muhammad Ghifary, W. Bastiaan Kleijn, Mengjie Zhang, David Balduzzi, Wen Li

In this paper, we propose a novel unsupervised domain adaptation algorithm based on deep learning for visual object recognition.

no code implementations • 7 Apr 2016 • David Balduzzi

Corollaries of the main result include: (i) a game-theoretic description of the representations learned by a neural network; (ii) a logarithmic-regret algorithm for training neural nets; and (iii) a formal setting for analyzing conditional computation in neural nets that can be applied to recently developed models of attention.

no code implementations • 9 Feb 2016 • Nicolás Della Penna, Mark D. Reid, David Balduzzi

Motivated by clinical trials, we study bandits with observable non-compliance.

no code implementations • 6 Feb 2016 • David Balduzzi, Muhammad Ghifary

This paper imports ideas from physics and functional programming into RNN design to provide guiding principles.

no code implementations • 15 Oct 2015 • Muhammad Ghifary, David Balduzzi, W. Bastiaan Kleijn, Mengjie Zhang

We propose Scatter Component Analysis (SCA), a fast representation learning algorithm that can be applied to both domain adaptation and domain generalization.

Ranked #7 on Domain Adaptation on Office-Caltech

no code implementations • 29 Sep 2015 • David Balduzzi

Deep learning is currently the subject of intensive study.

no code implementations • 10 Sep 2015 • David Balduzzi, Muhammad Ghifary

Firstly, we present a temporal-difference based method for learning the gradient of the value-function.

no code implementations • 6 Sep 2015 • David Balduzzi

The main result is that error backpropagation on a convolutional network is equivalent to playing out a circadian game.

3 code implementations • ICCV 2015 • Muhammad Ghifary, W. Bastiaan Kleijn, Mengjie Zhang, David Balduzzi

The problem of domain generalization is to take knowledge acquired from a number of related domains where training data is available, and to then successfully apply it to previously unseen domains.

no code implementations • 23 Nov 2014 • David Balduzzi, Hastagiri Vanchinathan, Joachim Buhmann

Using these results, we derive a new credit assignment algorithm for nonparametric regression, Kickback, that is significantly simpler than Backprop.

no code implementations • 28 Aug 2014 • David Balduzzi

The paper demonstrates that falsifiability is fundamental to learning.

no code implementations • 7 Jan 2014 • David Balduzzi

We investigate cortical learning from the perspective of mechanism design.

no code implementations • 24 Oct 2013 • David Balduzzi

Despite its size and complexity, the human cortex exhibits striking anatomical regularities, suggesting there may be simple meta-algorithms underlying cortical learning and computation.

1 code implementation • Proceedings of Machine Learning Research 2013 • Krikamol Muandet, David Balduzzi, Bernhard Schölkopf

This paper investigates domain generalization: How to take knowledge acquired from an arbitrary number of related domains and apply it to previously unseen domains?

no code implementations • NeurIPS 2013 • Brian McWilliams, David Balduzzi, Joachim M. Buhmann

Random views are justified by recent theoretical and empirical work showing that regression with random features closely approximates kernel regression, implying that random views can be expected to contain accurate estimators.
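The random-features approximation referred to here can be checked directly: with random Fourier features (Rahimi and Recht), the inner product of two feature vectors approximates an RBF kernel evaluation. A self-contained sketch, assuming a unit-bandwidth Gaussian kernel:

```python
import numpy as np

rng = np.random.default_rng(0)

def rff(X, D=2000, rng=rng):
    """Random Fourier features: z(x) @ z(y) ~ exp(-||x - y||^2 / 2)."""
    d = X.shape[1]
    W = rng.normal(size=(d, D))            # frequencies ~ N(0, I) for unit bandwidth
    b = rng.uniform(0, 2 * np.pi, size=D)  # random phases
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

x = np.array([[0.0, 0.0]])
y = np.array([[1.0, 0.0]])
# Featurize both points together so they share the same random frequencies.
Z = rff(np.vstack([x, y]))
approx = Z[0] @ Z[1]
exact = np.exp(-0.5 * np.linalg.norm(x - y) ** 2)
print(approx, exact)  # close, with error shrinking like 1/sqrt(D)
```

Because the feature map is explicit and finite-dimensional, linear regression on these random features approximates kernel regression at a fraction of the cost, which is the property the random-views argument relies on.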

no code implementations • NeurIPS 2012 • David Balduzzi, Michel Besserve

This paper suggests a learning-theoretic perspective on how synaptic plasticity benefits global brain functioning.

no code implementations • NeurIPS 2012 • Pedro Ortega, Jordi Grau-Moya, Tim Genewein, David Balduzzi, Daniel Braun

We propose a novel Bayesian approach to solve stochastic optimization problems that involve finding extrema of noisy, nonlinear functions.

no code implementations • 29 Mar 2012 • Dominik Janzing, David Balduzzi, Moritz Grosse-Wentrup, Bernhard Schölkopf

Here we propose a set of natural, intuitive postulates that a measure of causal strength should satisfy.

Statistics Theory

no code implementations • 3 May 2011 • Manuel Gomez Rodriguez, David Balduzzi, Bernhard Schölkopf

Time plays an essential role in the diffusion of information, influence and disease over networks.

no code implementations • 1 May 2011 • David Balduzzi

Many natural processes occur over characteristic spatial and temporal scales.

Information Theory · Cellular Automata and Lattice Gases · Neurons and Cognition

Papers With Code is a free resource with all data licensed under CC-BY-SA.