Search Results for author: Oliver Daniels-Koch

Found 3 papers, 1 paper with code

CoinRun: Solving Goal Misgeneralisation

no code implementations • 28 Sep 2023 • Stuart Armstrong, Alexandre Maranhão, Oliver Daniels-Koch, Patrick Leask, Rebecca Gorman

Goal misgeneralisation is a key challenge in AI alignment -- the task of getting powerful Artificial Intelligences to align their goals with human intentions and human morality.

The Expertise Problem: Learning from Specialized Feedback

no code implementations • 12 Nov 2022 • Oliver Daniels-Koch, Rachel Freedman

RLHF algorithms that learn from multiple teachers therefore face an expertise problem: the reliability of a given piece of feedback depends both on the teacher that it comes from and how specialized that teacher is on relevant components of the task.

CULT: Continual Unsupervised Learning with Typicality-Based Environment Detection

1 code implementation • 17 Jul 2022 • Oliver Daniels-Koch

We introduce CULT (Continual Unsupervised Representation Learning with Typicality-Based Environment Detection), a new algorithm for continual unsupervised learning with variational auto-encoders.

Representation Learning
