no code implementations • 27 Feb 2020 • Jonathan Baxter
A machine can only learn if it is biased in some way.
no code implementations • 18 Nov 2019 • Jonathan Baxter
A form of generalisation error known as Off Training Set (OTS) error was recently introduced in [Wolpert, 1996b], along with a theorem showing that small training set error does not guarantee small OTS error, unless assumptions are made about the target function.
no code implementations • 18 Nov 2019 • Douglas Aberdeen, Jonathan Baxter
Generalised matrix-matrix multiplication forms the kernel of many mathematical algorithms.
no code implementations • 17 Nov 2019 • Peter L. Bartlett, Jonathan Baxter
In this paper, we derive a new model of synaptic plasticity, based on recent algorithms for reinforcement learning (in which an agent attempts to learn appropriate actions to maximize its long-term average reward).
no code implementations • 14 Nov 2019 • Jonathan Baxter
In this paper the problem of {\em learning} appropriate domain-specific bias is addressed.
no code implementations • 14 Nov 2019 • Jonathan Baxter
In this paper the problem of learning appropriate bias for an environment of related tasks is examined from a Bayesian perspective.
no code implementations • 14 Nov 2019 • Jonathan Baxter
In this paper it is shown how an {\em environment} of functions on an input space $X$ induces a {\em canonical distortion measure} (CDM) on $X$.
no code implementations • 13 Nov 2019 • Jonathan Baxter
It is proved that the number of examples $m$ {\em per task} required to ensure good generalisation from a representation learner obeys $m = O(a+b/n)$ where $n$ is the number of tasks being learnt and $a$ and $b$ are constants.
no code implementations • 12 Nov 2019 • Douglas Aberdeen, Jonathan Baxter, Robert Edwards
The training runs with an average performance of 163.3 GFlops/s (single precision).
no code implementations • 9 Nov 2019 • Jonathan Baxter
In this thesis it is argued that in general there is insufficient information in a single task for a learner to generalise well and that what is required for good generalisation is information about many similar learning tasks.
no code implementations • 3 Jun 2011 • Jonathan Baxter, Peter L. Bartlett
In this paper we introduce GPOMDP, a simulation-based algorithm for generating a {\em biased} estimate of the gradient of the {\em average reward} in Partially Observable Markov Decision Processes (POMDPs) controlled by parameterized stochastic policies.
no code implementations • 10 Jan 1999 • Jonathan Baxter, Andrew Tridgell, Lex Weaver
In this paper we present TDLeaf(lambda), a variation on the TD(lambda) algorithm that enables it to be used in conjunction with game-tree search.