no code implementations • NeurIPS 2012 • Thomas Furmston, David Barber
This analysis leads naturally to the consideration of this approximate Newton method as an alternative gradient-based method for Markov Decision Processes.
no code implementations • NeurIPS 2012 • Edward Challis, David Barber
We present a method for approximate inference for a broad class of non-conjugate probabilistic models.
no code implementations • 17 Aug 2014 • David Barber
We describe a set of Gaussian Process based approaches that can be used to solve non-linear Ordinary Differential Equations.
no code implementations • 22 Jun 2016 • David Barber, Aleksandar Botev
We consider training probabilistic classifiers in the case of a large number of classes.
no code implementations • 7 Jul 2016 • Aleksandar Botev, Guy Lever, David Barber
We present a unifying framework for adapting the update direction in gradient-based iterative optimization methods.
4 code implementations • NeurIPS 2017 • Thomas Anthony, Zheng Tian, David Barber
Sequential decision making problems, such as structured prediction, robotic control, and game playing, require a combination of planning policies and generalisation of those plans.
no code implementations • ICML 2017 • Aleksandar Botev, Hippolyt Ritter, David Barber
We present an efficient block-diagonal approximation to the Gauss-Newton matrix for feedforward neural networks.
no code implementations • NeurIPS 2017 • Zhen He, Shao-Bing Gao, Liang Xiao, Daxue Liu, Hangen He, David Barber
The capacity of an LSTM network can be increased by widening and adding layers.
1 code implementation • ICLR 2018 • Hippolyt Ritter, Aleksandar Botev, David Barber
PyTorch implementations of Bayes By Backprop, MC Dropout, SGLD, the Local Reparametrization Trick, KF-Laplace and more
no code implementations • NeurIPS 2018 • Hippolyt Ritter, Aleksandar Botev, David Barber
In order to make our method scalable, we leverage recent block-diagonal Kronecker factored approximations to the curvature.
no code implementations • 12 Jun 2018 • Benoit Gaujac, Ilya Feige, David Barber
Generative models with both discrete and continuous latent variables are highly motivated by the structure of many real-world data sets.
no code implementations • 12 Jun 2018 • Alex Mansbridge, Roberto Fierimonte, Ilya Feige, David Barber
Powerful generative models, particularly in Natural Language Modelling, are commonly trained by maximizing a variational lower bound on the data log likelihood.
no code implementations • 13 Jun 2018 • Harshil Shah, Bowen Zheng, David Barber
We introduce the Attentive Unsupervised Text (W)riter (AUTR), which is a word level generative model for natural language.
no code implementations • NeurIPS 2018 • Harshil Shah, David Barber
We introduce Generative Neural Machine Translation (GNMT), a latent variable architecture which is designed to model the semantics of the source and target sentences.
1 code implementation • CVPR 2019 • Zhen He, Jian Li, Daxue Liu, Hangen He, David Barber
To achieve both label-free and end-to-end learning of MOT, we propose a Tracking-by-Animation framework, where a differentiable neural model first tracks objects from input frames and then animates these objects into reconstructed frames.
no code implementations • 13 Sep 2018 • Thomas Bird, Julius Kunze, David Barber
These approaches are of particular interest because they are parallelizable.
no code implementations • 27 Sep 2018 • Mingtian Zhang, Thomas Bird, Raza Habib, Tianlin Xu, David Barber
Probabilistic models are often trained by maximum likelihood, which corresponds to minimizing a specific form of f-divergence between the model and data distribution.
no code implementations • 27 Sep 2018 • Julius Kunze, Louis Kirsch, Hippolyt Ritter, David Barber
We propose Noisy Information Bottlenecks (NIB) to limit mutual information between learned parameters and the data through noise.
no code implementations • NeurIPS 2018 • Louis Kirsch, Julius Kunze, David Barber
Scaling model capacity has been vital in the success of deep learning.
no code implementations • 21 Nov 2018 • Mingtian Zhang, Peter Hayes, Tom Bird, Raza Habib, David Barber
For distributions $\mathbb{P}$ and $\mathbb{Q}$ with different supports or undefined densities, the divergence $\textrm{D}(\mathbb{P}||\mathbb{Q})$ may not exist.
2 code implementations • ICLR 2019 • James Townsend, Tom Bird, David Barber
Deep latent variable models have seen recent success in many data domains.
no code implementations • 12 Feb 2019 • Julius Kunze, Louis Kirsch, Hippolyt Ritter, David Barber
Variational inference with a factorized Gaussian posterior estimate is a widely used approach for learning parameters and hidden variables.
1 code implementation • ICLR 2019 • Raza Habib, David Barber
We introduce Auxiliary Variational MCMC, a novel framework for learning MCMC kernels that combines recent advances in variational inference with insights drawn from traditional auxiliary variable MCMC methods such as Hamiltonian Monte Carlo.
no code implementations • 27 Jul 2019 • Mingtian Zhang, Thomas Bird, Raza Habib, Tianlin Xu, David Barber
Probabilistic models are often trained by maximum likelihood, which corresponds to minimizing a specific f-divergence between the model and data distribution.
1 code implementation • ICLR 2020 • James Townsend, Thomas Bird, Julius Kunze, David Barber
We make the following striking observation: fully convolutional VAE models trained on 32x32 ImageNet can generalize well, not just to 64x64 but also to far larger photographs, with no changes to the model.
no code implementations • 14 Jan 2020 • David Barber
We introduce a general learning framework for private machine learning based on randomised response.
1 code implementation • 30 Apr 2020 • Pauching Yap, Hippolyt Ritter, David Barber
We demonstrate that the popular gradient-based model-agnostic meta-learning algorithm (MAML) indeed suffers from catastrophic forgetting and introduce a Bayesian online meta-learning framework that tackles this problem.
no code implementations • 28 Sep 2020 • Pauching Yap, Hippolyt Ritter, David Barber
This work introduces a Bayesian online meta-learning framework to tackle the catastrophic forgetting and the sequential few-shot tasks problems.
no code implementations • 7 Oct 2020 • Benoit Gaujac, Ilya Feige, David Barber
Probabilistic models with hierarchical-latent-variable structures provide state-of-the-art results amongst non-autoregressive, unsupervised density-based models.
no code implementations • 7 Oct 2020 • Benoit Gaujac, Ilya Feige, David Barber
We further study the trade-off between disentanglement and reconstruction on more difficult data sets with unknown generative factors, where the flexibility of the WAE paradigm in the reconstruction term improves reconstructions.
no code implementations • 23 Oct 2020 • Alex Mansbridge, Gregory Barbour, Davide Piras, Michael Murray, Christopher Frye, Ilya Feige, David Barber
In this work, our contributions are two-fold: first, by adapting state-of-the-art techniques from representation learning, we introduce a novel approach to learning LDP mechanisms.
no code implementations • ICLR 2021 • Thomas Bird, Friso H. Kingma, David Barber
In this work we show, for the first time, that we can successfully train generative models which utilize binary neural networks.
no code implementations • 1 Jan 2021 • Benoit Gaujac, Ilya Feige, David Barber
We further study the trade-off between disentanglement and reconstruction on more difficult data sets with unknown generative factors, where we expect improved reconstructions due to the flexibility of the WAE paradigm.
no code implementations • 1 Jan 2021 • Harshil Shah, David Barber
However, active learning methods usually use supervised training and ignore the data points which have not yet been labelled.
no code implementations • ICLR Workshop SSL-RL 2021 • Mingtian Zhang, Peter Noel Hayes, Tim Z. Xiao, Andi Zhang, David Barber
We introduce a new model-based reinforcement learning framework that aims to tackle environments with high dimensional state spaces.
no code implementations • 30 Mar 2021 • Harshil Shah, Tim Xiao, David Barber
Linear chain conditional random fields (CRFs) combined with contextual word embeddings have achieved state of the art performance on sequence labeling tasks.
no code implementations • 24 Sep 2021 • Emine Yilmaz, Peter Hayes, Raza Habib, Jordan Burgess, David Barber
Labelling data is a major practical bottleneck in training and testing classifiers.
1 code implementation • 30 Nov 2021 • Julius Kunze, James Townsend, David Barber
We propose a new, more general approach to the design of stochastic gradient-based optimization methods for machine learning.
2 code implementations • 13 Jan 2022 • Mingtian Zhang, James Townsend, Ning Kang, David Barber
The recently proposed Neural Local Lossless Compression (NeLLoC), which is based on a local autoregressive model, has achieved state-of-the-art (SOTA) out-of-distribution (OOD) generalization performance in the image compression task.
2 code implementations • 21 Mar 2022 • Ahmed H. Shahin, Joseph Jacob, Daniel C. Alexander, David Barber
To this end, we propose a probabilistic model that captures the dependencies between the observed clinical variables and imputes missing ones.
1 code implementation • 23 May 2022 • Mingtian Zhang, Peter Hayes, David Barber
The ability of likelihood-based probabilistic models to generalize to unseen data is central to many machine learning applications such as lossless compression.
no code implementations • 28 May 2022 • Mingtian Zhang, Tim Z. Xiao, Brooks Paige, David Barber
Latent variable models like the Variational Auto-Encoder (VAE) are commonly used to learn representations of images.
no code implementations • 19 Jun 2022 • Peter Hayes, Mingtian Zhang, Raza Habib, Jordan Burgess, Emine Yilmaz, David Barber
We introduce a label model that can learn to aggregate weak supervision sources differently for different datapoints and takes into consideration the performance of the end-model during training.
no code implementations • 15 Sep 2022 • Mingtian Zhang, Oscar Key, Peter Hayes, David Barber, Brooks Paige, François-Xavier Briol
Score-based divergences have been widely used in machine learning and statistics applications.
no code implementations • 15 Mar 2023 • David Barber
In Reinforcement Learning the Q-learning algorithm provably converges to the optimal solution.
no code implementations • 19 Mar 2023 • Yaozhi Lu, Shahab Aslani, An Zhao, Ahmed Shahin, David Barber, Mark Emberton, Daniel C. Alexander, Joseph Jacob
The Cox neural network can achieve an IPCW C-index of 0.75 on the internal dataset and 0.69 on an external dataset.
no code implementations • 18 May 2023 • Harshil Shah, Arthur Wilcke, Marius Cobzarenco, Cristi Cobzarenco, Edward Challis, David Barber
Natural language understanding includes the tasks of intent detection (identifying a user's objectives) and slot filling (extracting the entities relevant to those objectives).
1 code implementation • NeurIPS 2023 • Mingtian Zhang, Alex Hawkins-Hooker, Brooks Paige, David Barber
Energy-Based Models (EBMs) offer a versatile framework for modeling complex data distributions.
2 code implementations • 7 Sep 2023 • Ahmed H. Shahin, An Zhao, Alexander C. Whitehead, Daniel C. Alexander, Joseph Jacob, David Barber
We demonstrate that our approach forms a consistent estimator for the event model parameters, even in the absence of uncensored data.
1 code implementation • 5 Feb 2024 • Wenlin Chen, Mingtian Zhang, Brooks Paige, José Miguel Hernández-Lobato, David Barber
The inadequate mixing of conventional Markov Chain Monte Carlo (MCMC) methods for multi-modal distributions presents a significant challenge in practical applications such as Bayesian inference and molecular dynamics.
no code implementations • 12 Feb 2024 • William Muldrew, Peter Hayes, Mingtian Zhang, David Barber
A key consideration for aligning these models is how to most effectively use human resources, or model resources in the case where LLMs themselves are used as oracles.
no code implementations • 19 Feb 2024 • Mingtian Zhang, Shawn Lan, Peter Hayes, David Barber
Our results demonstrate that Mafin significantly enhances the performance of the black-box embeddings by only requiring the training of a small augmented model.
no code implementations • 27 Feb 2024 • Rares Dolga, Marius Cobzarenco, David Barber
The time complexity of the standard attention mechanism in a transformer scales quadratically with the length of the sequence.