Hebbian Synaptic Modifications in Spiking Neurons that Learn

17 Nov 2019 · Peter L. Bartlett, Jonathan Baxter ·

In this paper, we derive a new model of synaptic plasticity, based on recent algorithms for reinforcement learning (in which an agent attempts to learn appropriate actions to maximize its long-term average reward). We show that these direct reinforcement learning algorithms also give locally optimal performance for the problem of reinforcement learning with multiple agents, without any explicit communication between agents. By considering a network of spiking neurons as a collection of agents attempting to maximize the long-term average of a reward signal, we derive a synaptic update rule that is qualitatively similar to Hebb's postulate. This rule requires only simple computations, such as addition and leaky integration, and involves only quantities that are available in the vicinity of the synapse. Furthermore, it leads to synaptic connection strengths that give locally optimal values of the long term average reward. The reinforcement learning paradigm is sufficiently broad to encompass many learning problems that are solved by the brain. We illustrate, with simulations, that the approach is effective for simple pattern classification and motor learning tasks.

PDF Abstract