Search Results for author: Scott Garrabrant

Found 8 papers, 0 papers with code

Temporal Inference with Finite Factored Sets

no code implementations • 23 Sep 2021 • Scott Garrabrant

We propose a new approach to temporal inference, inspired by the Pearlian causal inference paradigm - though quite different from Pearl's approach formally.

Causal Inference

Risks from Learned Optimization in Advanced Machine Learning Systems

no code implementations • 5 Jun 2019 • Evan Hubinger, Chris van Merwijk, Vladimir Mikulik, Joar Skalse, Scott Garrabrant

We analyze the type of learned optimization that occurs when a learned model (such as a neural network) is itself an optimizer - a situation we refer to as mesa-optimization, a neologism we introduce in this paper.

BIG-bench Machine Learning

Embedded Agency

no code implementations • 25 Feb 2019 • Abram Demski, Scott Garrabrant

Traditional models of rational action treat the agent as though it is cleanly separated from its environment, and can act on that environment from the outside.

Categorizing Variants of Goodhart's Law

no code implementations • 13 Mar 2018 • David Manheim, Scott Garrabrant

There are several distinct failure modes for overoptimization of systems on the basis of metrics.

Logical Induction

no code implementations • 12 Sep 2016 • Scott Garrabrant, Tsvi Benson-Tilsen, Andrew Critch, Nate Soares, Jessica Taylor

For instance, if the language is Peano arithmetic, the algorithm assigns probabilities to all arithmetical statements, including claims about the twin prime conjecture, the outputs of long-running computations, and its own probabilities.

Sentence

Asymptotic Convergence in Online Learning with Unbounded Delays

no code implementations • 18 Apr 2016 • Scott Garrabrant, Nate Soares, Jessica Taylor

We study the problem of predicting the results of computations that are too expensive to run, via the observation of the results of smaller computations.

Inductive Coherence

no code implementations • 18 Apr 2016 • Scott Garrabrant, Benya Fallenstein, Abram Demski, Nate Soares

While probability theory is normally applied to external environments, there has been some recent interest in probabilistic modeling of the outputs of computations that are too expensive to run.

Negation Sentence
