Search Results for author: Oliver Daniels-Koch

Found 3 papers, 1 paper with code

CoinRun: Solving Goal Misgeneralisation

no code implementations • 28 Sep 2023 • Stuart Armstrong, Alexandre Maranhão, Oliver Daniels-Koch, Patrick Leask, Rebecca Gorman

Goal misgeneralisation is a key challenge in AI alignment -- the task of getting powerful Artificial Intelligences to align their goals with human intentions and human morality.

The Expertise Problem: Learning from Specialized Feedback

no code implementations • 12 Nov 2022 • Oliver Daniels-Koch, Rachel Freedman

RLHF algorithms that learn from multiple teachers therefore face an expertise problem: the reliability of a given piece of feedback depends both on the teacher that it comes from and how specialized that teacher is on relevant components of the task.

CULT: Continual Unsupervised Learning with Typicality-Based Environment Detection

1 code implementation • 17 Jul 2022 • Oliver Daniels-Koch

We introduce CULT (Continual Unsupervised Representation Learning with Typicality-Based Environment Detection), a new algorithm for continual unsupervised learning with variational auto-encoders.

Representation Learning
