no code implementations • 14 Nov 2023 • David Janz, Alexander E. Litvak, Csaba Szepesvári
We provide the first useful and rigorous analysis of ensemble sampling for the stochastic linear bandit setting.
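A minimal sketch of the general ensemble-sampling recipe, not the paper's exact algorithm or analysis: keep m regularised least-squares fits, each trained on its own independently perturbed copy of the observed rewards, and at every round act greedily under a member chosen uniformly at random. The names `ensemble_sampling`, `actions` and `reward_fn` below are illustrative placeholders.

```python
import numpy as np

def ensemble_sampling(actions, reward_fn, T=1000, m=10, lam=1.0, noise=0.1):
    """Minimal ensemble-sampling sketch for a stochastic linear bandit.

    actions: (K, d) array of arm feature vectors; reward_fn: arm -> reward.
    Each of the m ensemble members maintains a regularised least-squares
    fit trained on independently perturbed copies of the observed rewards.
    """
    rng = np.random.default_rng(0)
    d = actions.shape[1]
    V = lam * np.eye(d)                 # shared regularised Gram matrix
    b = np.zeros((m, d))                # per-member perturbed reward sums
    rewards = []
    for _ in range(T):
        theta = np.linalg.solve(V, b[rng.integers(m)])  # pick a member at random
        a = actions[np.argmax(actions @ theta)]         # act greedily under it
        r = reward_fn(a)
        rewards.append(r)
        V += np.outer(a, a)
        # each member sees the reward plus its own fresh Gaussian perturbation
        b += a * (r + noise * rng.standard_normal(m))[:, None]
    return rewards
```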
no code implementations • 13 Nov 2023 • David Janz, Shuai Liu, Alex Ayoub, Csaba Szepesvári
We show that, for the case of generalised linear bandits, EVILL reduces to perturbed history exploration (PHE), a method where exploration is done by training on randomly perturbed rewards.
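A rough sketch of the PHE recipe the abstract refers to, in the simpler linear-reward case rather than the paper's generalised linear or EVILL formulation: refit ridge regression on a history whose rewards carry fresh i.i.d. Gaussian perturbations, then play the greedy action. All names here are illustrative.

```python
import numpy as np

def phe_step(A_hist, r_hist, actions, a_scale=1.0, lam=1.0, rng=None):
    """One round of perturbed-history exploration (PHE), linear-reward sketch.

    A_hist: (t, d) past action features; r_hist: (t,) past rewards.
    Refit ridge regression with i.i.d. Gaussian noise added to every past
    reward, then play the greedy action under the perturbed fit.
    """
    rng = rng or np.random.default_rng()
    d = actions.shape[1]
    z = a_scale * rng.standard_normal(len(r_hist))       # reward perturbations
    V = A_hist.T @ A_hist + lam * np.eye(d)
    theta = np.linalg.solve(V, A_hist.T @ (r_hist + z))  # perturbed ridge fit
    return actions[np.argmax(actions @ theta)]
```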
1 code implementation • 31 Oct 2023 • Jihao Andreas Lin, Shreyas Padhy, Javier Antorán, Austin Tripp, Alexander Terenin, Csaba Szepesvári, José Miguel Hernández-Lobato, David Janz
We study the optimisation problem associated with Gaussian process regression using squared loss.
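For intuition, a minimal sketch of that optimisation problem (not the paper's method): the representer weights minimise a quadratic whose unique minimiser solves the usual GP linear system, so plain gradient descent recovers the posterior mean.

```python
import numpy as np

def gp_mean_by_gd(K, y, noise=0.1, lr=None, steps=500):
    """Sketch: obtain GP posterior-mean weights by gradient descent.

    The representer weights alpha minimise the quadratic
        L(alpha) = 0.5 * alpha^T (K + noise^2 I) alpha - alpha^T y,
    whose unique minimiser solves (K + noise^2 I) alpha = y.
    """
    n = len(y)
    A = K + noise**2 * np.eye(n)
    lr = lr or 1.0 / np.linalg.norm(A, 2)   # step size from the top eigenvalue
    alpha = np.zeros(n)
    for _ in range(steps):
        alpha -= lr * (A @ alpha - y)       # gradient step on L
    return alpha                            # posterior mean at X is K @ alpha
```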
1 code implementation • NeurIPS 2023 • Jihao Andreas Lin, Javier Antorán, Shreyas Padhy, David Janz, José Miguel Hernández-Lobato, Alexander Terenin
Gaussian processes are a powerful framework for quantifying uncertainty and for sequential decision-making but are limited by the requirement of solving linear systems.
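A minimal sketch of the linear-system bottleneck and one standard iterative workaround, conjugate gradients, which needs only matrix-vector products with the kernel matrix (the function and argument names are illustrative):

```python
import numpy as np
from scipy.sparse.linalg import cg

def gp_posterior_mean_cg(K_train, K_star, y, noise=0.1):
    """Sketch: GP posterior mean via conjugate gradients instead of Cholesky.

    Solves (K + noise^2 I) v = y iteratively; only matrix-vector
    products with the kernel matrix are required.
    """
    A = K_train + noise**2 * np.eye(len(y))
    v, info = cg(A, y)              # iterative linear solve
    assert info == 0, "CG did not converge"
    return K_star @ v               # predictive mean at the test points
```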
1 code implementation • 10 Oct 2022 • Javier Antorán, Shreyas Padhy, Riccardo Barbano, Eric Nalisnick, David Janz, José Miguel Hernández-Lobato
Large-scale linear models are ubiquitous throughout machine learning, with contemporary application as surrogate models for neural network uncertainty quantification; that is, the linearised Laplace method.
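A minimal sketch of the linearised Laplace predictive variance for regression, assuming the network's Jacobians at the MAP weights are given as plain matrices; this is the generic recipe, not the paper's specific contribution.

```python
import numpy as np

def linearised_laplace_variance(J_train, J_test, noise=0.1, prior_prec=1.0):
    """Sketch of linearised-Laplace predictive variance for regression.

    J_train, J_test: Jacobians of the network output w.r.t. its weights,
    evaluated at the MAP estimate (here treated as given feature matrices).
    """
    d = J_train.shape[1]
    H = J_train.T @ J_train / noise**2 + prior_prec * np.eye(d)  # GGN + prior
    Sigma = np.linalg.inv(H)                                     # posterior cov
    return np.einsum("nd,de,ne->n", J_test, Sigma, J_test)       # per-point var
```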
no code implementations • 17 Jun 2022 • Javier Antorán, David Janz, James Urquhart Allingham, Erik Daxberger, Riccardo Barbano, Eric Nalisnick, José Miguel Hernández-Lobato
The linearised Laplace method for estimating model uncertainty has received renewed attention in the Bayesian deep learning community.
no code implementations • AABI Symposium 2022 • Javier Antorán, James Urquhart Allingham, David Janz, Erik Daxberger, Eric Nalisnick, José Miguel Hernández-Lobato
We show that for neural networks (NNs) with normalisation layers, i.e. batch norm, layer norm, or group norm, the Laplace model evidence does not approximate the volume of a posterior mode and is thus unsuitable for model selection.
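One way to see the issue, as a tiny numerical check under an assumed layer-norm example: scaling the pre-normalisation weights leaves the output unchanged, so the loss is flat along such directions and the Hessian determinant entering the Laplace evidence is degenerate there.

```python
import numpy as np

def layer_norm(z, eps=1e-5):
    return (z - z.mean(-1, keepdims=True)) / np.sqrt(z.var(-1, keepdims=True) + eps)

rng = np.random.default_rng(0)
x, W = rng.standard_normal(8), rng.standard_normal((8, 8))
out = layer_norm(x @ W)
out_scaled = layer_norm(x @ (10.0 * W))          # rescale pre-norm weights
print(np.allclose(out, out_scaled, atol=1e-4))   # True: loss flat along scaling
```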
no code implementations • 28 Jan 2020 • David Janz, David R. Burt, Javier González
We consider the problem of optimising functions in the reproducing kernel Hilbert space (RKHS) of a Matérn kernel with smoothness parameter $\nu$ over the domain $[0, 1]^d$ under noisy bandit feedback.
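For context, a minimal GP-UCB-style sketch on [0, 1] with a Matérn-5/2 kernel, which is one standard baseline for this setting rather than the paper's algorithm; all parameter choices below are illustrative.

```python
import numpy as np

def matern52(a, b, ls=0.2):
    """Matérn kernel with smoothness parameter nu = 5/2 on a 1-D domain."""
    r = np.abs(a[:, None] - b[None, :]) / ls
    return (1 + np.sqrt(5) * r + 5 * r**2 / 3) * np.exp(-np.sqrt(5) * r)

def gp_ucb(f, T=30, noise=0.1, beta=2.0, seed=0):
    """GP-UCB sketch on [0, 1]: query the maximiser of mean + beta * std."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 200)
    X = [rng.uniform()]
    y = [f(X[0]) + noise * rng.standard_normal()]
    for _ in range(T):
        Xa, ya = np.array(X), np.array(y)
        K = matern52(Xa, Xa) + noise**2 * np.eye(len(Xa))
        k = matern52(grid, Xa)                      # (200, t) cross-covariances
        Kinv = np.linalg.inv(K)
        mu = k @ Kinv @ ya                          # posterior mean on the grid
        var = 1.0 - np.einsum("ij,jk,ik->i", k, Kinv, k)  # prior var at r=0 is 1
        x = grid[np.argmax(mu + beta * np.sqrt(np.maximum(var, 0.0)))]
        X.append(x)
        y.append(f(x) + noise * rng.standard_normal())
    return X[int(np.argmax(y))]
```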
2 code implementations • NeurIPS 2019 • David Janz, Jiri Hron, Przemysław Mazur, Katja Hofmann, José Miguel Hernández-Lobato, Sebastian Tschiatschek
Posterior sampling for reinforcement learning (PSRL) is an effective method for balancing exploration and exploitation in reinforcement learning.
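A minimal tabular sketch of the PSRL loop the abstract refers to, assuming known rewards and a Dirichlet posterior over transitions; the paper's own estimator is not reproduced here.

```python
import numpy as np

def psrl_episode(counts, R, gamma=0.95, rng=None):
    """One PSRL step for a tabular MDP with known rewards R[s, a].

    counts[s, a, s'] are observed transition counts; the posterior over each
    transition distribution is Dirichlet(1 + counts). Sample an MDP from the
    posterior, solve it by value iteration, and return the greedy policy.
    """
    rng = rng or np.random.default_rng()
    S, A, _ = counts.shape
    P = np.array([[rng.dirichlet(1 + counts[s, a]) for a in range(A)]
                  for s in range(S)])                 # sampled transitions
    V = np.zeros(S)
    for _ in range(200):                              # value iteration
        Q = R + gamma * P @ V                         # Q[s, a]
        V = Q.max(axis=1)
    return Q.argmax(axis=1)                           # act greedily this episode
```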
7 code implementations • 1 Jul 2018 • Alex Kendall, Jeffrey Hawke, David Janz, Przemyslaw Mazur, Daniele Reda, John-Mark Allen, Vinh-Dieu Lam, Alex Bewley, Amar Shah
We demonstrate the first application of deep reinforcement learning to autonomous driving.
1 code implementation • ICLR 2018 • David Janz, Jos van der Westhuizen, Brooks Paige, Matt J. Kusner, José Miguel Hernández-Lobato
This validator provides insight into how individual sequence elements influence the validity of the overall sequence, and can be used to constrain sequence-based models to generate valid sequences and thus faithfully model discrete objects.
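A minimal sketch of how such a validator can constrain sampling, with `step_probs` (the sequence model's next-token distribution) and `is_valid_prefix` (the validator) as assumed placeholder callables:

```python
import numpy as np

def constrained_sample(step_probs, is_valid_prefix, vocab, max_len=20, seed=0):
    """Sketch: use a validator to mask invalid continuations during sampling.

    step_probs(prefix) -> the model's next-token distribution over vocab;
    is_valid_prefix(prefix) -> whether the prefix can extend to a valid
    sequence (the role the learned validator plays). Both are placeholders.
    """
    rng = np.random.default_rng(seed)
    prefix = []
    for _ in range(max_len):
        p = np.asarray(step_probs(prefix), float)
        mask = np.array([is_valid_prefix(prefix + [v]) for v in vocab], float)
        p = p * mask                       # zero out invalid next tokens
        if p.sum() == 0.0:                 # no valid continuation: stop early
            break
        prefix.append(vocab[rng.choice(len(vocab), p=p / p.sum())])
    return prefix
```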
no code implementations • 15 Aug 2017 • David Janz, Jos van der Westhuizen, José Miguel Hernández-Lobato
As a step towards solving the problem of generating valid discrete sequences, we propose to learn a deep recurrent validator model.
no code implementations • 21 Nov 2016 • David Janz, Brooks Paige, Tom Rainforth, Jan-Willem van de Meent, Frank Wood
Existing methods for structure discovery in time series data construct interpretable, compositional kernels for Gaussian process regression models.
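A minimal sketch of one common structure-discovery recipe in this family, greedy search over sums and products of base kernels scored by the GP log marginal likelihood, here written with scikit-learn; details differ from the paper's approach.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ExpSineSquared, RationalQuadratic

def greedy_kernel_search(X, y, depth=2):
    """Sketch of greedy compositional kernel search for GP regression.

    Starting from base kernels, repeatedly extend the best kernel found so
    far by adding or multiplying in another base kernel, scoring each
    candidate by the fitted GP's log marginal likelihood.
    """
    bases = [RBF(), ExpSineSquared(), RationalQuadratic()]

    def score(k):
        gp = GaussianProcessRegressor(kernel=k, normalize_y=True).fit(X, y)
        return gp.log_marginal_likelihood_value_

    best = max(bases, key=score)
    for _ in range(depth):
        candidates = [best + b for b in bases] + [best * b for b in bases]
        best = max(candidates + [best], key=score)
    return best
```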