Search Results for author: Miljan Martic

Found 9 papers, 4 papers with code

Causal Analysis of Agent Behavior for AI Safety

no code implementations5 Mar 2021 Grégoire Déletang, Jordi Grau-Moya, Miljan Martic, Tim Genewein, Tom McGrath, Vladimir Mikulik, Markus Kunesch, Shane Legg, Pedro A. Ortega

As machine learning systems become more powerful they also become increasingly unpredictable and opaque.

Meta-trained agents implement Bayes-optimal agents

no code implementations NeurIPS 2020 Vladimir Mikulik, Grégoire Delétang, Tom McGrath, Tim Genewein, Miljan Martic, Shane Legg, Pedro A. Ortega

Memory-based meta-learning is a powerful technique to build agents that adapt fast to any task within a target distribution.


Avoiding Side Effects By Considering Future Tasks

no code implementations NeurIPS 2020 Victoria Krakovna, Laurent Orseau, Richard Ngo, Miljan Martic, Shane Legg

To avoid this interference incentive, we introduce a baseline policy that represents a default course of action (such as doing nothing), and use it to filter out future tasks that are not achievable by default.

Scaling shared model governance via model splitting

no code implementations ICLR 2019 Miljan Martic, Jan Leike, Andrew Trask, Matteo Hessel, Shane Legg, Pushmeet Kohli

Currently the only techniques for sharing governance of a deep learning model are homomorphic encryption and secure multiparty computation.

Scalable agent alignment via reward modeling: a research direction

3 code implementations19 Nov 2018 Jan Leike, David Krueger, Tom Everitt, Miljan Martic, Vishal Maini, Shane Legg

One obstacle to applying reinforcement learning algorithms to real-world problems is the lack of suitable reward functions.

Atari Games

Penalizing side effects using stepwise relative reachability

no code implementations4 Jun 2018 Victoria Krakovna, Laurent Orseau, Ramana Kumar, Miljan Martic, Shane Legg

How can we design safe reinforcement learning agents that avoid unnecessary disruptions to their environment?

Safe Reinforcement Learning

AI Safety Gridworlds

2 code implementations27 Nov 2017 Jan Leike, Miljan Martic, Victoria Krakovna, Pedro A. Ortega, Tom Everitt, Andrew Lefrancq, Laurent Orseau, Shane Legg

We present a suite of reinforcement learning environments illustrating various safety properties of intelligent agents.

Safe Exploration

Deep reinforcement learning from human preferences

6 code implementations NeurIPS 2017 Paul Christiano, Jan Leike, Tom B. Brown, Miljan Martic, Shane Legg, Dario Amodei

For sophisticated reinforcement learning (RL) systems to interact usefully with real-world environments, we need to communicate complex goals to these systems.

Atari Games

Cannot find the paper you are looking for? You can Submit a new open access paper.