Search Results for author: David Balduzzi

Found 37 papers, 9 papers with code

Pick Your Battles: Interaction Graphs as Population-Level Objectives for Strategic Diversity

no code implementations 8 Oct 2021 Marta Garnelo, Wojciech Marian Czarnecki, SiQi Liu, Dhruva Tirumala, Junhyuk Oh, Gauthier Gidel, Hado van Hasselt, David Balduzzi

Strategic diversity is often essential in games: in multi-player games, for example, evaluating a player against a diverse set of strategies will yield a more accurate estimate of its performance.

A Limited-Capacity Minimax Theorem for Non-Convex Games or: How I Learned to Stop Worrying about Mixed-Nash and Love Neural Nets

no code implementations 14 Feb 2020 Gauthier Gidel, David Balduzzi, Wojciech Marian Czarnecki, Marta Garnelo, Yoram Bachrach

Adversarial training, a special case of multi-objective optimization, is an increasingly prevalent machine learning technique: some of its most notable applications include GAN-based generative modeling and self-play techniques in reinforcement learning which have been applied to complex games such as Go or Poker.

Starcraft, Starcraft II

Smooth markets: A basic mechanism for organizing gradient-based learners

no code implementations ICLR 2020 David Balduzzi, Wojciech M. Czarnecki, Thomas W. Anthony, Ian M Gemp, Edward Hughes, Joel Z. Leibo, Georgios Piliouras, Thore Graepel

With the success of modern machine learning, it is becoming increasingly important to understand and control how learning algorithms interact.

Multiagent Reinforcement Learning in Games with an Iterated Dominance Solution

no code implementations 25 Sep 2019 Yoram Bachrach, Tor Lattimore, Marta Garnelo, Julien Perolat, David Balduzzi, Thomas Anthony, Satinder Singh, Thore Graepel

We show that MARL converges to the desired outcome if the rewards are designed so that exerting effort is the iterated dominance solution, but fails if it is merely a Nash equilibrium.

reinforcement-learning

Differentiable Game Mechanics

1 code implementation 13 May 2019 Alistair Letcher, David Balduzzi, Sebastien Racaniere, James Martens, Jakob Foerster, Karl Tuyls, Thore Graepel

The decomposition motivates Symplectic Gradient Adjustment (SGA), a new algorithm for finding stable fixed points in differentiable games.

Open-ended Learning in Symmetric Zero-sum Games

no code implementations 23 Jan 2019 David Balduzzi, Marta Garnelo, Yoram Bachrach, Wojciech M. Czarnecki, Julien Perolat, Max Jaderberg, Thore Graepel

Zero-sum games such as chess and poker are, abstractly, functions that evaluate pairs of agents, for example labeling them 'winner' and 'loser'.

Stable Opponent Shaping in Differentiable Games

no code implementations ICLR 2019 Alistair Letcher, Jakob Foerster, David Balduzzi, Tim Rocktäschel, Shimon Whiteson

A growing number of learning methods are actually differentiable games whose players optimise multiple, interdependent objectives in parallel -- from GANs and intrinsic curiosity to multi-agent RL.

Re-evaluating Evaluation

1 code implementation NeurIPS 2018 David Balduzzi, Karl Tuyls, Julien Perolat, Thore Graepel

Progress in machine learning is measured by careful evaluation on problems of outstanding common interest.

The Mechanics of n-Player Differentiable Games

1 code implementation ICML 2018 David Balduzzi, Sebastien Racaniere, James Martens, Jakob Foerster, Karl Tuyls, Thore Graepel

The first is related to potential games, which reduce to gradient descent on an implicit function; the second relates to Hamiltonian games, a new class of games that obey a conservation law, akin to conservation laws in classical mechanical systems.

Strongly-Typed Agents are Guaranteed to Interact Safely

no code implementations ICML 2017 David Balduzzi

As artificial agents proliferate, it is becoming increasingly important to ensure that their interactions with one another are well-behaved.

Common Sense Reasoning Tensor Decomposition

Neural Taylor Approximations: Convergence and Exploration in Rectifier Networks

no code implementations ICML 2017 David Balduzzi, Brian McWilliams, Tony Butler-Yeoman

Modern convolutional networks, incorporating rectifiers and max-pooling, are neither smooth nor convex; standard guarantees therefore do not apply.

Deep Reconstruction-Classification Networks for Unsupervised Domain Adaptation

2 code implementations 12 Jul 2016 Muhammad Ghifary, W. Bastiaan Kleijn, Mengjie Zhang, David Balduzzi, Wen Li

In this paper, we propose a novel unsupervised domain adaptation algorithm based on deep learning for visual object recognition.

Classification General Classification +2

Deep Online Convex Optimization with Gated Games

no code implementations 7 Apr 2016 David Balduzzi

Corollaries of the main result include: (i) a game-theoretic description of the representations learned by a neural network; (ii) a logarithmic-regret algorithm for training neural nets; and (iii) a formal setting for analyzing conditional computation in neural nets that can be applied to recently developed models of attention.

Compliance-Aware Bandits

no code implementations 9 Feb 2016 Nicolás Della Penna, Mark D. Reid, David Balduzzi

Motivated by clinical trials, we study bandits with observable non-compliance.

Strongly-Typed Recurrent Neural Networks

no code implementations 6 Feb 2016 David Balduzzi, Muhammad Ghifary

This paper imports ideas from physics and functional programming into RNN design to provide guiding principles.

Scatter Component Analysis: A Unified Framework for Domain Adaptation and Domain Generalization

no code implementations 15 Oct 2015 Muhammad Ghifary, David Balduzzi, W. Bastiaan Kleijn, Mengjie Zhang

We propose Scatter Component Analysis (SCA), a fast representation learning algorithm that can be applied to both domain adaptation and domain generalization.

Domain Generalization General Classification +2

Compatible Value Gradients for Reinforcement Learning of Continuous Deep Policies

no code implementations 10 Sep 2015 David Balduzzi, Muhammad Ghifary

Firstly, we present a temporal-difference based method for learning the gradient of the value-function.

reinforcement-learning

Deep Online Convex Optimization by Putting Forecaster to Sleep

no code implementations 6 Sep 2015 David Balduzzi

The main result is that error backpropagation on a convolutional network is equivalent to playing out a circadian game.

Model Selection

Domain Generalization for Object Recognition with Multi-task Autoencoders

3 code implementations ICCV 2015 Muhammad Ghifary, W. Bastiaan Kleijn, Mengjie Zhang, David Balduzzi

The problem of domain generalization is to take knowledge acquired from a number of related domains where training data is available, and to then successfully apply it to previously unseen domains.

Denoising Domain Generalization +1

Kickback cuts Backprop's red-tape: Biologically plausible credit assignment in neural networks

no code implementations 23 Nov 2014 David Balduzzi, Hastagiri Vanchinathan, Joachim Buhmann

Using these results, we derive a new credit assignment algorithm for nonparametric regression, Kickback, that is significantly simpler than Backprop.

Falsifiable implies Learnable

no code implementations 28 Aug 2014 David Balduzzi

The paper demonstrates that falsifiability is fundamental to learning.

Cortical prediction markets

no code implementations 7 Jan 2014 David Balduzzi

We investigate cortical learning from the perspective of mechanism design.

Randomized co-training: from cortical neurons to machine learning and back again

no code implementations 24 Oct 2013 David Balduzzi

Despite its size and complexity, the human cortex exhibits striking anatomical regularities, suggesting there may be simple meta-algorithms underlying cortical learning and computation.

Domain generalization via invariant feature representation

1 code implementation Proceedings of Machine Learning Research 2013 Krikamol Muandet, David Balduzzi, Bernhard Schölkopf

This paper investigates domain generalization: How to take knowledge acquired from an arbitrary number of related domains and apply it to previously unseen domains?

Domain Generalization

Correlated random features for fast semi-supervised learning

no code implementations NeurIPS 2013 Brian McWilliams, David Balduzzi, Joachim M. Buhmann

Random views are justified by recent theoretical and empirical work showing that regression with random features closely approximates kernel regression, implying that random views can be expected to contain accurate estimators.

Towards a learning-theoretic analysis of spike-timing dependent plasticity

no code implementations NeurIPS 2012 David Balduzzi, Michel Besserve

This paper suggests a learning-theoretic perspective on how synaptic plasticity benefits global brain functioning.

A Nonparametric Conjugate Prior Distribution for the Maximizing Argument of a Noisy Function

no code implementations NeurIPS 2012 Pedro Ortega, Jordi Grau-Moya, Tim Genewein, David Balduzzi, Daniel Braun

We propose a novel Bayesian approach to solve stochastic optimization problems that involve finding extrema of noisy, nonlinear functions.

Stochastic Optimization

Quantifying causal influences

no code implementations 29 Mar 2012 Dominik Janzing, David Balduzzi, Moritz Grosse-Wentrup, Bernhard Schölkopf

Here we propose a set of natural, intuitive postulates that a measure of causal strength should satisfy.

Statistics Theory

Uncovering the Temporal Dynamics of Diffusion Networks

no code implementations 3 May 2011 Manuel Gomez Rodriguez, David Balduzzi, Bernhard Schölkopf

Time plays an essential role in the diffusion of information, influence and disease over networks.

Detecting emergent processes in cellular automata with excess information

no code implementations 1 May 2011 David Balduzzi

Many natural processes occur over characteristic spatial and temporal scales.

Information Theory, Cellular Automata and Lattice Gases, Neurons and Cognition
