Search Results for author: Ole Jorgensen

Found 2 papers, 1 paper with code

Improving Activation Steering in Language Models with Mean-Centring

no code implementations • 6 Dec 2023 • Ole Jorgensen, Dylan Cope, Nandi Schoots, Murray Shanahan

Recent work in activation steering has demonstrated the potential to better control the outputs of Large Language Models (LLMs), but it involves finding steering vectors.

Self-Consistency of Large Language Models under Ambiguity

1 code implementation • 20 Oct 2023 • Henning Bartsch, Ole Jorgensen, Domenic Rosati, Jason Hoelscher-Obermaier, Jacob Pfau

Using this test, we find that despite increases in self-consistency, models usually place significant weight on alternative, inconsistent answers.

Question Answering
