no code implementations • 20 Feb 2024 • Adam X. Yang, Maxime Robeyns, Thomas Coste, Jun Wang, Haitham Bou-Ammar, Laurence Aitchison
To ensure that large language model (LLM) responses are helpful and non-toxic, we usually fine-tune a reward model on human preference data.
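Reward models of this kind are typically trained on pairs of responses where annotators marked one as preferred. A common choice is a Bradley–Terry-style pairwise loss; the snippet below is an illustrative sketch of that loss, not the paper's implementation, and the function name is hypothetical.

```python
import math

def bradley_terry_loss(r_chosen: float, r_rejected: float) -> float:
    """Pairwise preference loss: -log sigmoid(r_chosen - r_rejected).

    Minimised when the reward model scores the human-preferred
    response higher than the rejected one. (Illustrative sketch only.)
    """
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# The loss shrinks as the margin between chosen and rejected grows,
# and is large when the model prefers the rejected response.
low = bradley_terry_loss(2.0, 0.0)
high = bradley_terry_loss(0.0, 2.0)
assert low < high
```

In practice the scalar rewards would come from an LLM with a reward head, and the loss is averaged over a batch of preference pairs.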
2 code implementations • 24 Aug 2023 • Adam X. Yang, Maxime Robeyns, Xi Wang, Laurence Aitchison
Low-rank adaptation (LoRA) has emerged as a new paradigm for cost-efficient fine-tuning of large language models (LLMs).
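LoRA freezes the pretrained weight matrix W and learns a low-rank update (alpha/r)·BA, where only the small factors A and B are trained. The sketch below illustrates the idea under assumed shapes and a standard zero initialisation of B; it is a minimal illustration, not the paper's code.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16.0, r=4):
    """Forward pass through a LoRA-adapted linear layer (sketch).

    W: frozen pretrained weights, shape (d_out, d_in).
    A: trainable factor, shape (r, d_in).
    B: trainable factor, shape (d_out, r); initialised to zero so the
       adapted layer starts out identical to the frozen one.
    """
    return x @ (W + (alpha / r) * (B @ A)).T

rng = np.random.default_rng(0)
d_in, d_out, r = 8, 6, 4
W = rng.normal(size=(d_out, d_in))        # frozen base weights
A = rng.normal(size=(r, d_in)) * 0.01     # small random init
B = np.zeros((d_out, r))                  # zero init: no change at step 0
x = rng.normal(size=(2, d_in))

# With B = 0 the low-rank update vanishes, so the adapted layer
# reproduces the frozen layer exactly.
assert np.allclose(lora_forward(x, W, A, B), x @ W.T)
```

The cost saving comes from training only A and B: (d_out + d_in)·r parameters per layer instead of d_out·d_in.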
no code implementations • 30 Aug 2021 • Adam X. Yang, Maxime Robeyns, Edward Milsom, Ben Anson, Nandi Schoots, Laurence Aitchison
In particular, we show that deep Gaussian processes (DGPs) in the Bayesian representation learning limit have exactly multivariate Gaussian posteriors. The posterior covariances can be obtained by optimizing an interpretable objective that combines a log-likelihood term, which improves performance, with a series of KL divergences that keep the posteriors close to the prior.
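An objective of the shape the abstract describes, a log-likelihood term plus layerwise KL regularisers, can be written schematically as follows; the exact symbols and variational family are assumptions here, not taken from the paper.

```latex
% Schematic sketch: expected log-likelihood at the final layer L,
% minus KL terms that keep each layer's Gaussian posterior q(f_\ell)
% close to its conditional prior.
\mathcal{L}
  = \mathbb{E}_{q}\!\left[\log p(y \mid f_L)\right]
  - \sum_{\ell=1}^{L}
    \mathrm{KL}\!\left( q(f_\ell) \,\middle\|\, p(f_\ell \mid f_{\ell-1}) \right)
```

The first term rewards fit to the data, while each KL term penalises a layer's posterior for drifting from the prior, matching the trade-off described in the abstract.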