no code implementations • 13 May 2020 • Dominik Thalmeier, Hilbert J. Kappen, Simone Totaro, Vicenç Gómez
We identify PICE as the infinite smoothing limit of such a technique and show that the sample efficiency problems that PICE suffers from disappear for finite levels of smoothing.
no code implementations • 26 Oct 2017 • Carlo Baldassi, Federica Gerace, Hilbert J. Kappen, Carlo Lucibello, Luca Saglietti, Enzo Tartaglione, Riccardo Zecchina
Stochasticity and limited precision of synaptic weights in neural network models are key aspects of both biological and hardware modeling of learning processes.
no code implementations • NeurIPS 2011 • Mohammad Ghavamzadeh, Hilbert J. Kappen, Mohammad G. Azar, Rémi Munos
We introduce a new convergent variant of Q-learning, called speedy Q-learning, to address the problem of slow convergence in the standard form of the Q-learning algorithm.
no code implementations • NeurIPS 2008 • Joris M. Mooij, Hilbert J. Kappen
We propose a novel bound on single-variable marginal probability distributions in factor graphs with discrete variables.