Search Results for author: Victoria Krakovna

Found 9 papers, 2 papers with code

REALab: An Embedded Perspective on Tampering

no code implementations 17 Nov 2020 Ramana Kumar, Jonathan Uesato, Richard Ngo, Tom Everitt, Victoria Krakovna, Shane Legg

Standard Markov Decision Process (MDP) formulations of RL and simulated environments mirroring the MDP structure assume secure access to feedback (e.g., rewards).

Avoiding Tampering Incentives in Deep RL via Decoupled Approval

no code implementations 17 Nov 2020 Jonathan Uesato, Ramana Kumar, Victoria Krakovna, Tom Everitt, Richard Ngo, Shane Legg

How can we design agents that pursue a given objective when all feedback mechanisms are influenceable by the agent?

Avoiding Side Effects By Considering Future Tasks

no code implementations NeurIPS 2020 Victoria Krakovna, Laurent Orseau, Richard Ngo, Miljan Martic, Shane Legg

To avoid this interference incentive, we introduce a baseline policy that represents a default course of action (such as doing nothing), and use it to filter out future tasks that are not achievable by default.

Modeling AGI Safety Frameworks with Causal Influence Diagrams

no code implementations 20 Jun 2019 Tom Everitt, Ramana Kumar, Victoria Krakovna, Shane Legg

Proposals for safe AGI systems are typically made at the level of frameworks, specifying how the components of the proposed system should be trained and interact with each other.

Penalizing side effects using stepwise relative reachability

no code implementations 4 Jun 2018 Victoria Krakovna, Laurent Orseau, Ramana Kumar, Miljan Martic, Shane Legg

How can we design safe reinforcement learning agents that avoid unnecessary disruptions to their environment?

Safe Reinforcement Learning

AI Safety Gridworlds

2 code implementations 27 Nov 2017 Jan Leike, Miljan Martic, Victoria Krakovna, Pedro A. Ortega, Tom Everitt, Andrew Lefrancq, Laurent Orseau, Shane Legg

We present a suite of reinforcement learning environments illustrating various safety properties of intelligent agents.

Safe Exploration

Reinforcement Learning with a Corrupted Reward Channel

1 code implementation 23 May 2017 Tom Everitt, Victoria Krakovna, Laurent Orseau, Marcus Hutter, Shane Legg

Traditional RL methods fare poorly in CRMDPs, even under strong simplifying assumptions and when trying to compensate for the possibly corrupt rewards.

Memory-Bounded Left-Corner Unsupervised Grammar Induction on Child-Directed Input

no code implementations COLING 2016 Cory Shain, William Bryce, Lifeng Jin, Victoria Krakovna, Finale Doshi-Velez, Timothy Miller, William Schuler, Lane Schwartz

This paper presents a new memory-bounded left-corner parsing model for unsupervised raw-text syntax induction, using unsupervised hierarchical hidden Markov models (UHHMM).

Language Acquisition
