Search Results for author: Nandi Schoots

Found 8 papers, 3 papers with code

Extending Activation Steering to Broad Skills and Multiple Behaviours

1 code implementation • 9 Mar 2024 • Teun van der Weij, Massimo Poesio, Nandi Schoots

In this paper, we investigate the efficacy of activation steering for broad skills and multiple behaviours.
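As an illustration of activation steering in general (not necessarily this paper's exact setup), the sketch below adds a steering vector to the residual stream of GPT-2 at one layer via a forward hook. The model, layer index, steering strength, and the contrast prompts used to build the vector are all illustrative assumptions.

    # Minimal activation-steering sketch on GPT-2; all specifics are assumptions, not the paper's setup.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "gpt2"   # assumption: any decoder-only model would do
    layer_idx = 6         # assumption: layer at which to intervene
    alpha = 4.0           # assumption: steering strength

    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name).eval()

    def mean_residual(prompts):
        """Mean residual-stream activation at layer_idx over the given prompts."""
        acts = []
        def grab(_, __, output):
            acts.append(output[0].mean(dim=(0, 1)))  # GPT-2 blocks return a tuple; [0] is hidden states
        handle = model.transformer.h[layer_idx].register_forward_hook(grab)
        with torch.no_grad():
            for p in prompts:
                model(**tok(p, return_tensors="pt"))
        handle.remove()
        return torch.stack(acts).mean(dim=0)

    # Assumed way of obtaining a steering vector: difference of means over two contrast prompts.
    steer = mean_residual(["I love this."]) - mean_residual(["I hate this."])

    def add_steering(_, __, output):
        return (output[0] + alpha * steer,) + output[1:]

    hook = model.transformer.h[layer_idx].register_forward_hook(add_steering)
    with torch.no_grad():
        out = model.generate(**tok("The movie was", return_tensors="pt"), max_new_tokens=20)
    print(tok.decode(out[0]))
    hook.remove()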

Dissecting Language Models: Machine Unlearning via Selective Pruning

no code implementations • 2 Mar 2024 • Nicholas Pochinkov, Nandi Schoots

Understanding and shaping the behaviour of Large Language Models (LLMs) is increasingly important as applications become more powerful and more frequently adopted.

Machine Unlearning
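A generic sketch of unlearning by selective pruning: score each hidden unit by how much it is used on data to be forgotten relative to data to be retained, then zero out the highest-scoring units. The usage measure and scoring rule here are assumptions for illustration, not necessarily the paper's criterion.

    # Generic selective-pruning sketch (scoring rule is an assumption, not the paper's criterion).
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 4))
    forget_x = torch.randn(128, 16)   # data whose influence we want to remove
    retain_x = torch.randn(512, 16)   # data whose behaviour we want to keep

    def neuron_usage(x):
        """Mean post-ReLU activation per hidden neuron."""
        with torch.no_grad():
            return torch.relu(model[0](x)).mean(dim=0)

    # Assumed score: relative usage on the forget set versus the retain set.
    score = neuron_usage(forget_x) / (neuron_usage(retain_x) + 1e-6)

    # Prune (zero out) the top-k highest-scoring neurons.
    k = 8
    to_prune = score.topk(k).indices
    with torch.no_grad():
        model[0].weight[to_prune] = 0.0
        model[0].bias[to_prune] = 0.0
        model[2].weight[:, to_prune] = 0.0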

Improving Activation Steering in Language Models with Mean-Centring

no code implementations • 6 Dec 2023 • Ole Jorgensen, Dylan Cope, Nandi Schoots, Murray Shanahan

Recent work in activation steering has demonstrated the potential to better control the outputs of Large Language Models (LLMs), but it involves finding steering vectors.
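Following the title, a mean-centred steering vector can be sketched as the mean activation over target-behaviour text minus the mean activation over a broad reference corpus; which layer is used and how activations are collected are assumptions here.

    # Mean-centring sketch with stand-in activation matrices (layer choice and collection are assumptions).
    import torch

    target_acts = torch.randn(200, 768)      # stand-in: layer activations on target-behaviour text
    reference_acts = torch.randn(5000, 768)  # stand-in: layer activations on generic text

    naive_vector = target_acts.mean(dim=0)                       # raw mean of target activations
    steering_vector = naive_vector - reference_acts.mean(dim=0)  # mean-centred version

    # The mean-centred vector can then be added to the residual stream at the same layer,
    # e.g. with a forward hook as in the activation-steering sketch above.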

Comparing Optimization Targets for Contrast-Consistent Search

1 code implementation • 1 Nov 2023 • Hugo Fry, Seamus Fallows, Ian Fan, Jamie Wright, Nandi Schoots

We investigate the optimization target of Contrast-Consistent Search (CCS), which aims to recover the internal representations of truth of a large language model.

Language Modelling • Large Language Model
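For context, the standard CCS objective (due to Burns et al.) that such alternative optimization targets are compared against trains a probe on contrast pairs with a consistency term and a confidence term; the probe architecture and data below are stand-ins.

    # Standard CCS loss sketch; probe architecture and activations are stand-ins.
    import torch
    import torch.nn as nn

    hidden_dim = 768
    probe = nn.Sequential(nn.Linear(hidden_dim, 1), nn.Sigmoid())

    def ccs_loss(h_pos, h_neg):
        """h_pos, h_neg: hidden states for the two halves of each contrast pair."""
        p_pos, p_neg = probe(h_pos).squeeze(-1), probe(h_neg).squeeze(-1)
        consistency = (p_pos - (1.0 - p_neg)) ** 2     # the two answers should be complementary
        confidence = torch.minimum(p_pos, p_neg) ** 2  # discourage the trivial p_pos = p_neg = 0.5
        return (consistency + confidence).mean()

    # Example with random stand-in activations for a batch of contrast pairs.
    h_pos, h_neg = torch.randn(32, hidden_dim), torch.randn(32, hidden_dim)
    loss = ccs_loss(h_pos, h_neg)
    loss.backward()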

Any Deep ReLU Network is Shallow

no code implementations • 20 Jun 2023 • Mattia Jacopo Villani, Nandi Schoots

We constructively prove that every deep ReLU network can be rewritten as a functionally identical three-layer network with weights valued in the extended reals.

Low-Entropy Latent Variables Hurt Out-of-Distribution Performance

no code implementations • 20 May 2023 • Nandi Schoots, Dylan Cope

We study the relationship between the entropy of intermediate representations and a model's robustness to distributional shift.

Contrastive Learning

A theory of representation learning gives a deep generalisation of kernel methods

no code implementations • 30 Aug 2021 • Adam X. Yang, Maxime Robeyns, Edward Milsom, Ben Anson, Nandi Schoots, Laurence Aitchison

In particular, we show that Deep Gaussian processes (DGPs) in the Bayesian representation learning limit have exactly multivariate Gaussian posteriors, and the posterior covariances can be obtained by optimizing an interpretable objective combining a log-likelihood to improve performance with a series of KL-divergences which keep the posteriors close to the prior.

Bayesian Inference • Gaussian Processes • +1
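Schematically, an objective of the kind described above pairs a log-likelihood term with per-layer KL terms; in the sketch below, F_ℓ stands for the layer-ℓ representation and q for its posterior. The symbols and weighting are placeholders, and the paper's exact parameterisation may differ.

    % Schematic only: exact variables and weighting in the paper may differ.
    \mathcal{L} \;=\; \log p\left(Y \mid F_L\right)
      \;-\; \sum_{\ell=1}^{L} \mathrm{KL}\!\left( q(F_\ell) \,\middle\|\, p\left(F_\ell \mid F_{\ell-1}\right) \right)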

Learning to Communicate with Strangers via Channel Randomisation Methods

1 code implementation • 19 Apr 2021 • Dylan Cope, Nandi Schoots

We introduce two methods for improving the performance of agents meeting for the first time to accomplish a communicative task.
