The need for matrix decompositions (inverses) is often cited as a major impediment to scaling Gaussian process (GP) models, even in efficient approximations.
This results in models that can be seen either as neural networks with improved uncertainty prediction or as deep Gaussian processes with increased prediction accuracy.
GPflux is compatible with and built on top of the Keras deep learning ecosystem.
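To illustrate the Keras-style workflow, a minimal sketch of a single-layer deep GP; the helper and layer names (gpflux.helpers.construct_basic_kernel, gpflux.layers.GPLayer, as_training_model) follow early GPflux releases and may differ across versions:

```python
import numpy as np
import tensorflow as tf
import gpflow
import gpflux

X = np.linspace(0, 1, 100)[:, None]
Y = np.sin(10 * X) + 0.1 * np.random.default_rng(0).normal(size=(100, 1))
Z = np.linspace(0, 1, 20)[:, None]                  # initial inducing inputs

kernel = gpflux.helpers.construct_basic_kernel(
    gpflow.kernels.SquaredExponential(), output_dim=1)
inducing = gpflux.helpers.construct_basic_inducing_variables(
    num_inducing=20, input_dim=1, output_dim=1, z_init=Z)

gp_layer = gpflux.layers.GPLayer(kernel, inducing, num_data=len(X), num_latent_gps=1)
likelihood_layer = gpflux.layers.LikelihoodLayer(gpflow.likelihoods.Gaussian())

dgp = gpflux.models.DeepGP([gp_layer], likelihood_layer)
model = dgp.as_training_model()           # a plain tf.keras.Model,
model.compile(tf.optimizers.Adam(0.01))   # so the usual Keras workflow applies
model.fit({"inputs": X, "targets": Y}, epochs=100, verbose=0)
```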
We introduce a new class of inter-domain variational Gaussian processes (GPs) in which data is mapped onto the unit hypersphere in order to use spherical harmonic representations.
Approximate inference in complex probabilistic models such as deep Gaussian processes requires the optimisation of doubly stochastic objective functions.
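A minimal self-contained sketch of what "doubly stochastic" means in practice: the objective is estimated with two sources of randomness, minibatch subsampling of the data and Monte Carlo sampling of the variational posterior (illustrated here with a hypothetical Gaussian posterior over linear-model weights, not a deep GP):

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, B, S = 1000, 5, 64, 8              # dataset size, input dim, minibatch, MC samples
X = rng.normal(size=(N, D))
y = X @ rng.normal(size=D) + 0.1 * rng.normal(size=N)

mu, log_sigma = np.zeros(D), np.zeros(D)      # variational parameters of q(w)

idx = rng.choice(N, size=B, replace=False)    # stochasticity 1: data subsampling
eps = rng.normal(size=(S, D))                 # stochasticity 2: Monte Carlo samples
w = mu + np.exp(log_sigma) * eps              # reparameterisation trick
pred = X[idx] @ w.T                           # (B, S) predictions
log_lik = -0.5 * (y[idx, None] - pred) ** 2   # Gaussian log-likelihood, up to constants
data_term = N * log_lik.mean()                # unbiased estimate of sum_n E_q[log p(y_n | w)]
```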
One obstacle to the use of Gaussian processes (GPs) in large-scale problems, and as components in deep learning systems, is the need for bespoke derivations and implementations for small variations in the model or inference.
The use of Gaussian process models is typically limited to datasets with a few tens of thousands of observations, due to the cubic computational cost and quadratic memory footprint of exact inference.
We present a variational approximation for a wide range of GP models that does not require a matrix inverse to be computed at each optimisation step.
As we demonstrate in our experiments, the factorisation between latent system states and transition function can lead to a miscalibrated posterior and to learning unnecessarily large noise terms.
Deep Gaussian processes (DGPs) can model complex marginal densities as well as complex mappings.
Banded matrices can be used as precision matrices in several models including linear state-space models, some Gaussian processes, and Gaussian Markov random fields.
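For intuition, a sketch (ours, not the paper's implementation) of why banded structure pays off, using SciPy's banded Cholesky factorisation and solver, which cost O(N) for a fixed bandwidth rather than O(N^3) for dense factorisation:

```python
import numpy as np
from scipy.linalg import cholesky_banded, solveh_banded

n = 1000
# Tridiagonal precision matrix of a Gaussian random-walk prior, stored in
# SciPy's upper banded form: row 0 is the superdiagonal (first entry unused),
# row 1 is the main diagonal. This matrix is symmetric positive definite.
ab = np.zeros((2, n))
ab[1, :] = 2.0       # main diagonal
ab[0, 1:] = -1.0     # superdiagonal

b = np.random.default_rng(0).normal(size=n)
x = solveh_banded(ab, b)        # solves Q x = b in O(n), not O(n^3)
L = cholesky_banded(ab)         # banded Cholesky factor, also O(n)
```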
We also demonstrate that our fully Bayesian approach improves on dropout-based Bayesian deep learning methods in terms of uncertainty and marginal likelihood estimates.
We focus on variational inference in dynamical systems where the discrete-time transition function (or evolution rule) is modelled by a Gaussian process.
Generalising well in supervised learning tasks relies on correctly extrapolating the training data to a large region of the input space.
The natural gradient method has been used effectively in conjugate Gaussian process models, but the non-conjugate case has been largely unexplored.
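One way to realise this in code (our example, not the paper's): GPflow 2's NaturalGradient optimizer on a non-conjugate Bernoulli model, with natural-gradient steps on the variational parameters interleaved with Adam steps on the hyperparameters:

```python
import numpy as np
import tensorflow as tf
import gpflow

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 1))
Y = (np.sin(12 * X) > 0).astype(float)       # binary targets in {0, 1}
Z = np.linspace(0, 1, 20)[:, None]

model = gpflow.models.SVGP(
    kernel=gpflow.kernels.SquaredExponential(),
    likelihood=gpflow.likelihoods.Bernoulli(),
    inducing_variable=Z,
    num_data=len(X),
)
# Hand the variational parameters to the natural-gradient step only.
gpflow.set_trainable(model.q_mu, False)
gpflow.set_trainable(model.q_sqrt, False)

natgrad = gpflow.optimizers.NaturalGradient(gamma=0.1)
adam = tf.optimizers.Adam(0.01)
loss = model.training_loss_closure((X, Y))

for _ in range(200):
    natgrad.minimize(loss, var_list=[(model.q_mu, model.q_sqrt)])  # variational step
    adam.minimize(loss, var_list=model.trainable_variables)        # hyperparameter step
```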
We present a practical way of introducing convolutional structure into Gaussian processes, making them more suited to high-dimensional inputs like images.
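As an illustrative sketch, GPflow ships a convolutional kernel in this spirit; assuming its (base_kernel, image_shape, patch_shape) constructor, a patch-response kernel over small images looks like:

```python
import numpy as np
import gpflow

rng = np.random.default_rng(0)
images = rng.uniform(size=(5, 14 * 14))   # five flattened 14x14 "images"

conv_kernel = gpflow.kernels.Convolutional(
    gpflow.kernels.SquaredExponential(),  # base kernel acting on 3x3 patches
    [14, 14],                             # image shape
    [3, 3],                               # patch shape
)
K = conv_kernel(images)                   # 5 x 5 covariance between whole images
```

The covariance between two images is the sum of base-kernel responses over all pairs of patches, which is what makes the construction translation-aware.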
In this paper, we introduce the pseudo-extended MCMC method, a simple approach for improving the mixing of MCMC samplers on multi-modal posterior distributions.
Alternatively, state-of-the-art joint modeling techniques can be used to jointly model the longitudinal and event data and to compute event probabilities conditioned on the longitudinal observations.
To address this challenge, we impose a structured Gaussian variational posterior distribution over the latent states, which is parameterised by a recognition model in the form of a bi-directional recurrent neural network.
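A minimal sketch of such a recognition network (hypothetical layer sizes; the full approach parameterises a structured Gaussian over the whole latent state sequence, here reduced to per-time-step means and log-variances):

```python
import tensorflow as tf

latent_dim, obs_dim = 3, 10
inputs = tf.keras.Input(shape=(None, obs_dim))                  # observed sequence
h = tf.keras.layers.Bidirectional(
    tf.keras.layers.GRU(32, return_sequences=True))(inputs)     # context from both directions
mean = tf.keras.layers.Dense(latent_dim)(h)                     # posterior means per time step
log_var = tf.keras.layers.Dense(latent_dim)(h)                  # posterior log-variances
recognition_model = tf.keras.Model(inputs, [mean, log_var])
```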
This work brings together two powerful concepts in Gaussian processes: the variational approach to sparse approximation and the spectral representation of Gaussian processes.
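The key identity, sketched here for a one-dimensional GP on an interval: the inducing variables are defined as RKHS projections of the process onto a truncated Fourier basis, so the cross-covariance with the function is the basis function itself,

```latex
u_m = \langle f, \phi_m \rangle_{\mathcal{H}},
\qquad \phi_m \in \{\, 1,\ \cos(\omega_1 x),\ \sin(\omega_1 x),\ \dots \,\}
\quad\Longrightarrow\quad
\operatorname{cov}\big(u_m, f(x)\big) = \phi_m(x).
```

For Matérn kernels the resulting K_uu has diagonal-plus-low-rank structure, which is what makes these features cheap to use.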
GPflow is a Gaussian process library that uses TensorFlow for its core computations and Python for its front end.
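A minimal GPflow usage sketch: exact GP regression, with hyperparameters optimised through the TensorFlow-backed computations,

```python
import numpy as np
import gpflow

X = np.linspace(0, 1, 50)[:, None]
Y = np.sin(10 * X) + 0.1 * np.random.default_rng(0).normal(size=(50, 1))

model = gpflow.models.GPR((X, Y), kernel=gpflow.kernels.SquaredExponential())
gpflow.optimizers.Scipy().minimize(model.training_loss, model.trainable_variables)

mean, var = model.predict_y(np.array([[0.5]]))  # predictive mean and variance
```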
This paper simultaneously addresses these, using a variational approximation to the posterior which is sparse in support of the function but otherwise free-form.
The Gaussian process latent variable model (GP-LVM) is a popular approach to non-linear probabilistic dimensionality reduction.
We then discuss augmented index sets and show that, contrary to previous works, marginal consistency of augmentation is not enough to guarantee consistency of variational inference with the original model.
Deep Gaussian processes provide a flexible approach to probabilistic modelling of data using either supervised or unsupervised learning.
In this work, we present an extension of Gaussian process (GP) models with sophisticated parallelization and GPU acceleration.
In this publication, we combine two Bayesian non-parametric models: the Gaussian process (GP) and the Dirichlet process (DP).
We present a general method for deriving collapsed variational inference algorithms for probabilistic models in the conjugate exponential family.
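The core idea in one formula (our notation): fix the variational distribution over the latent variables Z and analytically marginalise the optimal distribution over the parameters θ, which conjugacy makes tractable, leaving a tighter "collapsed" bound in q(Z) alone,

```latex
\mathcal{L}\big(q(Z)\big)
= \log \int p(\theta)\,
  \exp\!\Big( \mathbb{E}_{q(Z)}\big[\log p(Y, Z \mid \theta)\big] \Big)\, \mathrm{d}\theta
\;+\; \mathrm{H}\big[q(Z)\big].
```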