2 code implementations • 17 Feb 2025 • Aliaksandra Shysheya, John Bronskill, James Requeima, Shoaib Ahmed Siddiqui, Javier Gonzalez, David Duvenaud, Richard E. Turner
We introduce a simple method for probabilistic predictions on tabular data based on Large Language Models (LLMs) called JoLT (Joint LLM Process for Tabular data).
no code implementations • 10 Feb 2025 • Wu Lin, Felix Dangel, Runa Eschenhagen, Juhan Bae, Richard E. Turner, Roger B. Grosse
Many training methods, such as Adam(W) and Shampoo, learn a positive-definite curvature matrix and apply an inverse root before preconditioning.
no code implementations • 7 Feb 2025 • Thang D. Bui, Matthew Ashman, Richard E. Turner
Extensive experiments on regression benchmarks, classification, and latent variable models demonstrate that the proposed approximation consistently matches or outperforms standard sparse variational GPs while maintaining the same computational cost.
no code implementations • 6 Feb 2025 • Aristeidis Panos, Rahaf Aljundi, Daniel Olmeda Reino, Richard E. Turner
Experimental results on VQA tasks in the few-shot continual learning setting, validate LoRSU's scalability, efficiency, and effectiveness, making it a compelling solution for image encoder adaptation in resource-constrained environments.
1 code implementation • 26 Jan 2025 • Kenza Tazi, Sun Woo P. Kim, Marc Girona-Mata, Richard E. Turner
While multi-model ensembles are known to improve the predictive accuracy and analysis of future climate projections, consensus regarding how models are aggregated is lacking.
1 code implementation • 21 Oct 2024 • Aliaksandra Shysheya, Cristiana Diaconu, Federico Bergamin, Paris Perdikaris, José Miguel Hernández-Lobato, Richard E. Turner, Emile Mathieu
Modelling partial differential equations (PDEs) is of crucial importance in science and engineering, and it includes tasks ranging from forecasting to inverse problems, such as data assimilation.
no code implementations • 9 Oct 2024 • Matthew Ashman, Cristiana Diaconu, Eric Langezaal, Adrian Weller, Richard E. Turner
Recently, transformer-based approaches have shown great promise in a range of weather forecasting problems.
no code implementations • 4 Oct 2024 • Isaac Reid, Kumar Avinava Dubey, Deepali Jain, Will Whitney, Amr Ahmed, Joshua Ainslie, Alex Bewley, Mithun Jacob, Aranyak Mehta, David Rendleman, Connor Schenck, Richard E. Turner, René Wagner, Adrian Weller, Krzysztof Choromanski
When training transformers on graph-structured data, incorporating information about the underlying topology is crucial for good performance.
no code implementations • 8 Aug 2024 • Anna Vaughan, Gonzalo Mateo-Garcia, Itziar Irakulis-Loitxate, Marc Watine, Pablo Fernandez-Poblaciones, Richard E. Turner, James Requeima, Javier Gorroño, Cynthia Randles, Manfredi Caltagirone, Claudio Cifarelli
Mitigating methane emissions is the fastest way to stop global warming in the short-term and buy humanity time to decarbonise.
no code implementations • 19 Jun 2024 • Matthew Ashman, Cristiana Diaconu, Adrian Weller, Richard E. Turner
Standard NP architectures, such as the convolutional conditional NP (ConvCNP) or the family of transformer neural processes (TNPs), are not capable of in-context in-context learning, as they are only able to condition on a single dataset.
no code implementations • 19 Jun 2024 • Yarden Cohen, Alexandre Khae Wu Navarro, Jes Frellsen, Richard E. Turner, Raziel Riemer, Ari Pakman
The need for regression models to predict circular values arises in many scientific fields.
1 code implementation • 19 Jun 2024 • Matthew Ashman, Cristiana Diaconu, Adrian Weller, Wessel Bruinsma, Richard E. Turner
Our approach is agnostic to both the choice of symmetry group and model architecture, making it widely applicable.
1 code implementation • 18 Jun 2024 • Matthew Ashman, Cristiana Diaconu, Junhyuck Kim, Lakee Sivaraya, Stratis Markou, James Requeima, Wessel P. Bruinsma, Richard E. Turner
Notably, the posterior prediction maps for data that are stationary -- a common assumption in spatio-temporal modelling -- exhibit translation equivariance.
no code implementations • 12 Jun 2024 • Ossi Räisä, Stratis Markou, Matthew Ashman, Wessel P. Bruinsma, Marlon Tobaben, Antti Honkela, Richard E. Turner
One approach to mitigating this issue is pre-training models on simulated data before DP learning on the private data.
1 code implementation • 3 Jun 2024 • Jonathan So, Richard E. Turner
We use this insight to motivate two new EP variants, with updates that are particularly well-suited to MC estimation.
no code implementations • 26 May 2024 • Isaac Reid, Stratis Markou, Krzysztof Choromanski, Richard E. Turner, Adrian Weller
We reach surprising conclusions about the benefits and limitations of variance reduction as a paradigm, showing that other properties of the coupling should be optimised for attention estimation in efficient transformers.
1 code implementation • 21 May 2024 • James Requeima, John Bronskill, Dami Choi, Richard E. Turner, David Duvenaud
Machine learning practitioners often face significant challenges in formally integrating their prior knowledge and beliefs into predictive models, limiting the potential for nuanced and context-aware analyses.
no code implementations • 20 May 2024 • Cristian Bodnar, Wessel P. Bruinsma, Ana Lucic, Megan Stanley, Anna Vaughan, Johannes Brandstetter, Patrick Garvan, Maik Riechert, Jonathan A. Weyn, Haiyu Dong, Jayesh K. Gupta, Kit Thambiratnam, Alexander T. Archibald, Chun-Chieh Wu, Elizabeth Heider, Max Welling, Richard E. Turner, Paris Perdikaris
Reliable forecasts of the Earth system are crucial for human progress and safety from natural disasters.
no code implementations • 30 Mar 2024 • Anna Vaughan, Stratis Markou, Will Tebbutt, James Requeima, Wessel P. Bruinsma, Tom R. Andersson, Michael Herzog, Nicholas D. Lane, Matthew Chantry, J. Scott Hosking, Richard E. Turner
Local station forecasts are skillful up to ten days lead time and achieve comparable and often lower errors than a post-processed global NWP baseline and a state-of-the-art end-to-end forecasting system with input from human forecasters.
1 code implementation • 4 Mar 2024 • James Urquhart Allingham, Bruno Kacper Mlodozeniec, Shreyas Padhy, Javier Antorán, David Krueger, Richard E. Turner, Eric Nalisnick, José Miguel Hernández-Lobato
Correctly capturing the symmetry transformations of data can lead to efficient models with strong generalization capabilities, though methods incorporating symmetries often require prior knowledge.
no code implementations • 10 Feb 2024 • Lachlan Thorpe, Lewis Bawden, Karanjot Vendal, John Bronskill, Richard E. Turner
By training on a large database of professional tennis tracking data, we demonstrate that simulations produced by SportsNGEN can be used to predict the outcomes of rallies, determine the best shot choices at any point, and evaluate counterfactual or what if scenarios to inform coaching decisions and elevate broadcast coverage.
no code implementations • 6 Feb 2024 • Richard E. Turner, Cristiana-Diana Diaconu, Stratis Markou, Aliaksandra Shysheya, Andrew Y. K. Foong, Bruno Mlodozeniec
Denoising Diffusion Probabilistic Models (DDPMs) are a very popular class of deep generative model that have been successfully applied to a diverse range of problems including image and video generation, protein and material synthesis, weather forecasting, and neural surrogates of partial differential equations.
2 code implementations • 5 Feb 2024 • Wu Lin, Felix Dangel, Runa Eschenhagen, Juhan Bae, Richard E. Turner, Alireza Makhzani
Adaptive gradient optimizers like Adam(W) are the default training algorithms for many deep learning architectures, such as transformers.
no code implementations • 3 Jan 2024 • Massimiliano Patacchiola, Aliaksandra Shysheya, Katja Hofmann, Richard E. Turner
In this paper, we propose a novel solution to these challenges by exploiting transformers to define a new class of neural flows called Transformer Neural Autoregressive Flows (T-NAFs).
2 code implementations • 9 Dec 2023 • Wu Lin, Felix Dangel, Runa Eschenhagen, Kirill Neklyudov, Agustinus Kristiadi, Richard E. Turner, Alireza Makhzani
Second-order methods such as KFAC can be useful for neural net training.
no code implementations • 28 Nov 2023 • Hermanni Hälvä, Jonathan So, Richard E. Turner, Aapo Hyvärinen
In this paper, we introduce a new nonlinear ICA framework that employs $t$-process (TP) latent components which apply naturally to data with higher-dimensional dependency structures, such as spatial and spatio-temporal data.
no code implementations • 16 Nov 2023 • Lorenzo Bonito, James Requeima, Aliaksandra Shysheya, Richard E. Turner
Over the last few years, Neural Processes have become a useful modelling tool in many application areas, such as healthcare and climate sciences, in which data are scarce and prediction uncertainty estimates are indispensable.
no code implementations • NeurIPS 2023 • Runa Eschenhagen, Alexander Immer, Richard E. Turner, Frank Schneider, Philipp Hennig
In this work, we identify two different settings of linear weight-sharing layers which motivate two flavours of K-FAC -- $\textit{expand}$ and $\textit{reduce}$.
1 code implementation • 30 Oct 2023 • Jonas Scholz, Tom R. Andersson, Anna Vaughan, James Requeima, Richard E. Turner
On held-out weather stations, Sim2Real training substantially outperforms the same model architecture trained only with reanalysis data or only with station data, showing that reanalysis data can serve as a stepping stone for learning from real observations.
1 code implementation • 18 Oct 2023 • Jonathan So, Richard E. Turner
In this work we propose a novel technique for tackling such issues, which involves reframing the optimisation as one with respect to the parameters of a surrogate distribution, for which computing the natural gradient is easy.
1 code implementation • NeurIPS 2023 • Emile Mathieu, Vincent Dutordoir, Michael J. Hutchinson, Valentin De Bortoli, Yee Whye Teh, Richard E. Turner
In this work, we extend the framework of diffusion models to incorporate a series of geometric priors in infinite-dimension modelling.
1 code implementation • 6 Jul 2023 • Kenza Tazi, Jihao Andreas Lin, Ross Viljoen, Alex Gardner, ST John, Hong Ge, Richard E. Turner
Gaussian Processes (GPs) offer an attractive method for regression over small, structured and correlated datasets.
4 code implementations • 23 Jun 2023 • Massimiliano Patacchiola, Mingfei Sun, Katja Hofmann, Richard E. Turner
Despite its simplicity this baseline is competitive with meta-learning methods on a variety of conditions and is able to imitate target policies trained on unseen variations of the original environment.
Few-Shot Image Classification
Few-Shot Imitation Learning
+4
no code implementations • 20 Apr 2023 • Richard E. Turner
The transformer is a neural network component that can be used to learn useful representations of sequences or sets of data-points.
1 code implementation • 25 Mar 2023 • Wessel P. Bruinsma, Stratis Markou, James Requiema, Andrew Y. K. Foong, Tom R. Andersson, Anna Vaughan, Anthony Buonomo, J. Scott Hosking, Richard E. Turner
Our work provides an example of how ideas from neural distribution estimation can benefit neural processes, and motivates research into the AR deployment of other neural process models.
no code implementations • ICCV 2023 • Aristeidis Panos, Yuriko Kobe, Daniel Olmeda Reino, Rahaf Aljundi, Richard E. Turner
In this work, we develop a baseline method, First Session Adaptation (FSA), that sheds light on the efficacy of existing CIL approaches and allows us to assess the relative performance contributions from head and body adaption.
no code implementations • 23 Nov 2022 • Elre T. Oldewage, John Bronskill, Richard E. Turner
This paper examines the robustness of deployed few-shot meta-learning systems when they are fed an imperceptibly perturbed few-shot dataset.
1 code implementation • 18 Nov 2022 • Tom R. Andersson, Wessel P. Bruinsma, Stratis Markou, James Requeima, Alejandro Coca-Castro, Anna Vaughan, Anna-Louise Ellis, Matthew A. Lazzara, Dani Jones, J. Scott Hosking, Richard E. Turner
This paper proposes using a convolutional Gaussian neural process (ConvGNP) to address these issues.
1 code implementation • 29 Oct 2022 • Aditya Ravuri, Tom R. Andersson, Ieva Kazlauskaite, Will Tebbutt, Richard E. Turner, J. Scott Hosking, Neil D. Lawrence, Markus Kaiser
Ice cores record crucial information about past climate.
1 code implementation • 23 Sep 2022 • Mikko A. Heikkilä, Matthew Ashman, Siddharth Swaroop, Richard E. Turner, Antti Honkela
In this paper, we present differentially private partitioned variational inference, the first general framework for learning a variational approximation to a Bayesian posterior distribution in the federated learning setting while minimising the number of communication rounds and providing differential privacy guarantees for data subjects.
1 code implementation • 11 Sep 2022 • Vidhi Lalchand, Kenza Tazi, Talay M. Cheema, Richard E. Turner, Scott Hosking
We account for the spatial variation in precipitation with a non-stationary Gibbs kernel parameterised with an input dependent lengthscale.
1 code implementation • 1 Sep 2022 • Saurav Jha, Dong Gong, Xuesong Wang, Richard E. Turner, Lina Yao
We shed light on their potential to bring several recent advances in other deep learning domains under one umbrella.
1 code implementation • 20 Jun 2022 • Massimiliano Patacchiola, John Bronskill, Aliaksandra Shysheya, Katja Hofmann, Sebastian Nowozin, Richard E. Turner
In this paper we push this Pareto frontier in the few-shot image classification setting with a key contribution: a new adaptive block called Contextual Squeeze-and-Excitation (CaSE) that adjusts a pretrained neural network on a new task to significantly improve performance with a single forward pass of the user data (context).
Ranked #5 on
Few-Shot Image Classification
on Meta-Dataset
no code implementations • 18 May 2022 • Isabel Chien, Nina Deliu, Richard E. Turner, Adrian Weller, Sofia S. Villar, Niki Kilbertus
While interest in the application of machine learning to improve healthcare has grown tremendously in recent years, a number of barriers prevent deployment in medical practice.
no code implementations • 16 Mar 2022 • Stratis Markou, James Requeima, Wessel P. Bruinsma, Anna Vaughan, Richard E. Turner
Existing approaches which model output dependencies, such as Neural Processes (NPs; Garnelo et al., 2018b) or the FullConvGNP (Bruinsma et al., 2021), are either complicated to train or prohibitively expensive.
3 code implementations • 14 Mar 2022 • Wessel P. Bruinsma, Martin Tegnér, Richard E. Turner
The Gaussian Process Convolution Model (GPCM; Tobar et al., 2015a) is a model for signals with complex spectral structure.
1 code implementation • 24 Feb 2022 • Matthew Ashman, Thang D. Bui, Cuong V. Nguyen, Stratis Markou, Adrian Weller, Siddharth Swaroop, Richard E. Turner
Variational inference (VI) has become the method of choice for fitting many modern probabilistic models.
2 code implementations • NeurIPS 2021 • John Bronskill, Daniela Massiceti, Massimiliano Patacchiola, Katja Hofmann, Sebastian Nowozin, Richard E. Turner
This limitation arises because a task's entire support set, which can contain up to 1000 images, must be processed before an optimization step can be taken.
1 code implementation • 24 Jun 2021 • Rahaf Aljundi, Daniel Olmeda Reino, Nikolay Chumerin, Richard E. Turner
This work identifies the crucial link between the two problems and investigates the Novelty Detection problem under the Continual Learning setting.
1 code implementation • pproximateinference AABI Symposium 2021 • Will Tebbutt, Arno Solin, Richard E. Turner
Pseudo-point approximations, one of the gold-standard methods for scaling GPs to large data sets, are well suited for handling off-the-grid spatial data.
1 code implementation • NeurIPS 2021 • Andrew Y. K. Foong, Wessel P. Bruinsma, David R. Burt, Richard E. Turner
Interestingly, this lower bound recovers the Chernoff test set bound if the posterior is equal to the prior.
no code implementations • 12 Apr 2021 • Angus Lamb, Evgeny Saveliev, Yingzhen Li, Sebastian Tschiatschek, Camilla Longden, Simon Woodhead, José Miguel Hernández-Lobato, Richard E. Turner, Pashmina Cameron, Cheng Zhang
While deep learning has obtained state-of-the-art results in many applications, the adaptation of neural network architectures to incorporate new output features remains a challenge, as neural networks are commonly trained to produce a fixed output dimension.
1 code implementation • NeurIPS Workshop ICBINB 2020 • Vincent Fortuin, Adrià Garriga-Alonso, Sebastian W. Ober, Florian Wenzel, Gunnar Rätsch, Richard E. Turner, Mark van der Wilk, Laurence Aitchison
Isotropic Gaussian priors are the de facto standard for modern Bayesian neural network inference.
1 code implementation • 20 Jan 2021 • Anna Vaughan, Will Tebbutt, J. Scott Hosking, Richard E. Turner
A new model is presented for multisite statistical downscaling of temperature and precipitation using convolutional conditional neural processes (convCNPs).
1 code implementation • pproximateinference AABI Symposium 2021 • Wessel P. Bruinsma, James Requeima, Andrew Y. K. Foong, Jonathan Gordon, Richard E. Turner
Neural Processes (NPs; Garnelo et al., 2018a, b) are a rich class of models for meta-learning that map data sets directly to predictive stochastic processes.
no code implementations • ICLR 2021 • Noel Loo, Siddharth Swaroop, Richard E. Turner
One strand of research has used probabilistic regularization for continual learning, with two of the main approaches in this vein being Online Elastic Weight Consolidation (Online EWC) and Variational Continual Learning (VCL).
1 code implementation • 20 Oct 2020 • Matthew Ashman, Jonathan So, Will Tebbutt, Vincent Fortuin, Michael Pearce, Richard E. Turner
Large, multi-dimensional spatio-temporal datasets are omnipresent in modern science and engineering.
no code implementations • 24 Jul 2020 • Chaochao Lu, Richard E. Turner, Yingzhen Li, Nate Kushman
In this paper we provide a firm theoretical interpretation for infinite spatial generation, by drawing connections to spatial stochastic processes.
no code implementations • 23 Jul 2020 • Zichao Wang, Angus Lamb, Evgeny Saveliev, Pashmina Cameron, Yordan Zaykov, José Miguel Hernández-Lobato, Richard E. Turner, Richard G. Baraniuk, Craig Barton, Simon Peyton Jones, Simon Woodhead, Cheng Zhang
In this competition, participants will focus on the students' answer records to these multiple-choice diagnostic questions, with the aim of 1) accurately predicting which answers the students provide; 2) accurately predicting which questions have high quality; and 3) determining a personalized sequence of questions for each student that best predicts the student's answers.
2 code implementations • NeurIPS 2020 • Andrew Y. K. Foong, Wessel P. Bruinsma, Jonathan Gordon, Yann Dubois, James Requeima, Richard E. Turner
Stationary stochastic processes (SPs) are a key component of many probabilistic models, such as those for off-the-grid spatio-temporal data.
1 code implementation • NeurIPS 2020 • Pingbo Pan, Siddharth Swaroop, Alexander Immer, Runa Eschenhagen, Richard E. Turner, Mohammad Emtiyaz Khan
Continually learning new skills is important for intelligent systems, yet standard deep learning methods suffer from catastrophic forgetting of the past.
2 code implementations • ICML 2020 • John Bronskill, Jonathan Gordon, James Requeima, Sebastian Nowozin, Richard E. Turner
Modern meta-learning approaches for image classification rely on increasingly deep networks to achieve state-of-the-art performance, making batch normalization an essential component of meta-learning pipelines.
1 code implementation • NeurIPS 2019 • Wenbo Gong, Sebastian Tschiatschek, Sebastian Nowozin, Richard E. Turner, José Miguel Hernández-Lobato, Cheng Zhang
In this paper, we address the ice-start problem, i. e., the challenge of deploying machine learning models when only a little or no training data is initially available, and acquiring each feature element of data is associated with costs.
no code implementations • 26 Nov 2019 • Bo-Hsiang Tseng, Marek Rei, Paweł Budzianowski, Richard E. Turner, Bill Byrne, Anna Korhonen
Dialogue systems benefit greatly from optimizing on detailed annotations, such as transcribed utterances, internal dialogue state representations and dialogue act labels.
1 code implementation • 24 Nov 2019 • Mrinank Sharma, Michael Hutchinson, Siddharth Swaroop, Antti Honkela, Richard E. Turner
This setting is known as federated learning, in which privacy is a key concern.
no code implementations • ICLR 2020 • Tameem Adel, Han Zhao, Richard E. Turner
Approaches to continual learning aim to successfully learn a set of related tasks that arrive in an online manner.
3 code implementations • ICML 2020 • Wessel P. Bruinsma, Eric Perim, Will Tebbutt, J. Scott Hosking, Arno Solin, Richard E. Turner
Multi-output Gaussian processes (MOGPs) leverage the flexibility and interpretability of GPs while capturing structure across outputs, which is desirable, for example, in spatio-temporal modelling.
3 code implementations • ICLR 2020 • Jonathan Gordon, Wessel P. Bruinsma, Andrew Y. K. Foong, James Requeima, Yann Dubois, Richard E. Turner
We introduce the Convolutional Conditional Neural Process (ConvCNP), a new member of the Neural Process family that models translation equivariance in the data.
no code implementations • 5 Sep 2019 • Jan Stühmer, Richard E. Turner, Sebastian Nowozin
Second, we demonstrate that the proposed prior encourages a disentangled latent representation which facilitates learning of disentangled representations.
2 code implementations • NeurIPS 2020 • Andrew Y. K. Foong, David R. Burt, Yingzhen Li, Richard E. Turner
While Bayesian neural networks (BNNs) hold the promise of being flexible, well-calibrated statistical models, inference often requires approximations whose consequences are poorly understood.
no code implementations • 27 Jun 2019 • Andrew Y. K. Foong, Yingzhen Li, José Miguel Hernández-Lobato, Richard E. Turner
We describe a limitation in the expressiveness of the predictive uncertainty estimate given by mean-field variational inference (MFVI), a popular approximate inference method for Bayesian neural networks.
1 code implementation • NeurIPS 2019 • James Requeima, Jonathan Gordon, John Bronskill, Sebastian Nowozin, Richard E. Turner
We introduce a conditional neural process based approach to the multi-task classification setting for this purpose, and establish connections to the meta-learning and few-shot learning literature.
Ranked #6 on
Few-Shot Image Classification
on Meta-Dataset Rank
1 code implementation • NeurIPS 2019 • Kazuki Osawa, Siddharth Swaroop, Anirudh Jain, Runa Eschenhagen, Richard E. Turner, Rio Yokota, Mohammad Emtiyaz Khan
Importantly, the benefits of Bayesian principles are preserved: predictive probabilities are well-calibrated, uncertainties on out-of-distribution data are improved, and continual-learning performance is boosted.
no code implementations • 24 May 2019 • Josef Schlittenlacher, Richard E. Turner, Brian C. J. Moore
The DNN was trained using the output of a more complex model, called the Cambridge loudness model.
1 code implementation • 6 May 2019 • Siddharth Swaroop, Cuong V. Nguyen, Thang D. Bui, Richard E. Turner
In the continual learning setting, tasks are encountered sequentially.
no code implementations • ICLR 2019 • Tameem Adel, Cuong V. Nguyen, Richard E. Turner, Zoubin Ghahramani, Adrian Weller
We present a framework for interpretable continual learning (ICL).
no code implementations • NeurIPS 2018 • Mark Rowland, Krzysztof M. Choromanski, François Chalus, Aldo Pacchiano, Tamas Sarlos, Richard E. Turner, Adrian Weller
Monte Carlo sampling in high-dimensional, low-sample settings is important in many machine learning tasks.
no code implementations • 27 Nov 2018 • Thang D. Bui, Cuong V. Nguyen, Siddharth Swaroop, Richard E. Turner
Second, the granularity of the updates e. g. whether the updates are local to each data point and employ message passing or global.
1 code implementation • NeurIPS 2018 • Arno Solin, James Hensman, Richard E. Turner
The complexity is still cubic in the state dimension $m$ which is an impediment to practical application.
3 code implementations • ICLR 2019 • Anqi Wu, Sebastian Nowozin, Edward Meeds, Richard E. Turner, José Miguel Hernández-Lobato, Alexander L. Gaunt
We provide two innovations that aim to turn VB into a robust inference tool for Bayesian neural networks: first, we introduce a novel deterministic method to approximate moments in neural networks, eliminating gradient variance; second, we introduce a hierarchical prior for parameters and a novel Empirical Bayes procedure for automatically selecting prior variances.
1 code implementation • ICLR 2019 • Jonathan Gordon, John Bronskill, Matthias Bauer, Sebastian Nowozin, Richard E. Turner
2) We introduce VERSA, an instance of the framework employing a flexible and versatile amortization network that takes few-shot learning datasets as inputs, with arbitrary numbers of shots, and outputs a distribution over task-specific parameters in a single forward pass.
1 code implementation • 22 May 2018 • Aapo Hyvarinen, Hiroaki Sasaki, Richard E. Turner
Here, we propose a general framework for nonlinear ICA, which, as a special case, can make use of temporal structure.
2 code implementations • ICLR 2018 • Alexander G. de G. Matthews, Mark Rowland, Jiri Hron, Richard E. Turner, Zoubin Ghahramani
Whilst deep neural networks have shown great empirical success, there is still much work to be done to understand their theoretical properties.
no code implementations • ICML 2018 • Krzysztof Choromanski, Mark Rowland, Vikas Sindhwani, Richard E. Turner, Adrian Weller
We present a new method of blackbox optimization via gradient approximation with the use of structured random orthogonal matrices, providing more accurate estimators than baselines and with provable theoretical guarantees.
1 code implementation • ICML 2018 • George Tucker, Surya Bhupatiraju, Shixiang Gu, Richard E. Turner, Zoubin Ghahramani, Sergey Levine
Policy gradient methods are a widely used class of model-free reinforcement learning algorithms where a state-dependent baseline is used to reduce gradient estimator variance.
no code implementations • 22 Feb 2018 • Wessel Bruinsma, Richard E. Turner
We present the Causal Gaussian Process Convolution Model (CGPCM), a doubly nonparametric model for causal, spectrally complex dynamical phenomena.
3 code implementations • 20 Feb 2018 • James Requeima, Will Tebbutt, Wessel Bruinsma, Richard E. Turner
Multi-output regression models must exploit dependencies between outputs to maximise predictive performance.
no code implementations • 14 Feb 2018 • Brian L. Trippe, Richard E. Turner
Modeling complex conditional distributions is critical in a variety of settings.
8 code implementations • ICLR 2018 • Cuong V. Nguyen, Yingzhen Li, Thang D. Bui, Richard E. Turner
This paper develops variational continual learning (VCL), a simple but general framework for continual learning that fuses online variational inference (VI) and recent advances in Monte Carlo VI for neural networks.
no code implementations • ICLR 2018 • Matthias Bauer, Mateo Rojas-Carulla, Jakub Bartłomiej Świątkowski, Bernhard Schölkopf, Richard E. Turner
The goal is to generalise from an initial large-scale classification task to a separate task comprising new classes and small numbers of examples.
no code implementations • NeurIPS 2017 • Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Bernhard Schölkopf, Sergey Levine
Off-policy model-free deep reinforcement learning methods using previously collected data can improve sample efficiency over on-policy policy gradient techniques.
1 code implementation • ICLR 2018 • Yingzhen Li, Richard E. Turner
Implicit models, which allow for the generation of samples but not for point-wise evaluation of probabilities, are omnipresent in real-world problems tackled by machine learning and a hot topic of current research.
3 code implementations • NeurIPS 2017 • Thang D. Bui, Cuong V. Nguyen, Richard E. Turner
Sparse pseudo-point approximations for Gaussian process (GP) models provide a suite of methods that support deployment of GPs in the large data regime and enable analytic intractabilities to be sidestepped.
no code implementations • 27 Feb 2017 • Yingzhen Li, Richard E. Turner, Qiang Liu
We propose a novel approximate inference algorithm that approximates a target distribution by amortising the dynamics of a user-selected MCMC sampler.
no code implementations • ICML 2017 • Natasha Jaques, Shixiang Gu, Dzmitry Bahdanau, José Miguel Hernández-Lobato, Richard E. Turner, Douglas Eck
This paper proposes a general method for improving the structure and quality of sequences generated by a recurrent neural network (RNN), while maintaining information originally learned from data, as well as sample diversity.
2 code implementations • 7 Nov 2016 • Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Sergey Levine
We analyze the connection between Q-Prop and existing model-free algorithms, and use control variate theory to derive two variants of Q-Prop with conservative and aggressive adaptation.
1 code implementation • 23 May 2016 • Thang D. Bui, Josiah Yan, Richard E. Turner
Unlike much of the previous venerable work in this area, the new framework is built on standard methods for approximate inference (variational free-energy, EP and Power EP methods) rather than employing approximations to the probabilistic generative model itself.
no code implementations • 16 Feb 2016 • Alexandre K. W. Navarro, Jes Frellsen, Richard E. Turner
First we introduce a new multivariate distribution over circular variables, called the multivariate Generalised von Mises (mGvM) distribution.
no code implementations • 12 Feb 2016 • Thang D. Bui, Daniel Hernández-Lobato, Yingzhen Li, José Miguel Hernández-Lobato, Richard E. Turner
Deep Gaussian processes (DGPs) are multi-layer hierarchical generalisations of Gaussian processes (GPs) and are formally equivalent to neural networks with multiple, infinitely wide hidden layers.
2 code implementations • NeurIPS 2016 • Yingzhen Li, Richard E. Turner
This paper introduces the variational R\'enyi bound (VR) that extends traditional variational inference to R\'enyi's alpha-divergences.
no code implementations • NeurIPS 2015 • Felipe Tobar, Thang D. Bui, Richard E. Turner
We introduce the Gaussian Process Convolution Model (GPCM), a two-stage nonparametric generative procedure to model stationary signals as the convolution between a continuous-time white-noise process and a continuous-time linear filter drawn from Gaussian process.
no code implementations • 11 Nov 2015 • Thang D. Bui, José Miguel Hernández-Lobato, Yingzhen Li, Daniel Hernández-Lobato, Richard E. Turner
Deep Gaussian processes (DGPs) are multi-layer hierarchical generalisations of Gaussian processes (GPs) and are formally equivalent to neural networks with multiple, infinitely wide hidden layers.
no code implementations • 10 Nov 2015 • Daniel Hernández-Lobato, José Miguel Hernández-Lobato, Yingzhen Li, Thang Bui, Richard E. Turner
A method for large scale Gaussian process classification has been recently proposed based on expectation propagation (EP).
3 code implementations • 10 Nov 2015 • José Miguel Hernández-Lobato, Yingzhen Li, Mark Rowland, Daniel Hernández-Lobato, Thang Bui, Richard E. Turner
Black-box alpha (BB-$\alpha$) is a new approximate inference method based on the minimization of $\alpha$-divergences.
no code implementations • 20 Sep 2015 • Dan Stowell, Richard E. Turner
Training a denoising autoencoder neural network requires access to truly clean data, a requirement which is often impractical.
no code implementations • NeurIPS 2015 • Yingzhen Li, Jose Miguel Hernandez-Lobato, Richard E. Turner
Expectation propagation (EP) is a deterministic approximation algorithm that is often used to perform approximate Bayesian parameter learning.
no code implementations • NeurIPS 2015 • Shixiang Gu, Zoubin Ghahramani, Richard E. Turner
Experiments indicate that NASMC significantly improves inference in a non-linear state space model outperforming adaptive proposal methods including the Extended Kalman and Unscented Particle Filters.
no code implementations • 27 Apr 2015 • Alexander G. de G. Matthews, James Hensman, Richard E. Turner, Zoubin Ghahramani
We then discuss augmented index sets and show that, contrary to previous works, marginal consistency of augmentation is not enough to guarantee consistency of variational inference with the original model.
no code implementations • NeurIPS 2014 • Thang D. Bui, Richard E. Turner
Gaussian process regression can be accelerated by constructing a small pseudo-dataset to summarise the observed data.
no code implementations • 23 Nov 2014 • Avid M. Afzal, Hamse Y. Mussa, Richard E. Turner, Andreas Bender, Robert C. Glen
According to Cobanoglu et al and Murphy, it is now widely acknowledged that the single target paradigm (one protein or target, one disease, one drug) that has been the dominant premise in drug development in the recent past is untenable.