no code implementations • 23 Sep 2021 • Scott Garrabrant
We propose a new approach to temporal inference, inspired by the Pearlian causal inference paradigm, though quite different from Pearl's approach formally.
no code implementations • 5 Jun 2019 • Evan Hubinger, Chris van Merwijk, Vladimir Mikulik, Joar Skalse, Scott Garrabrant
We analyze the type of learned optimization that occurs when a learned model (such as a neural network) is itself an optimizer, a situation we refer to as mesa-optimization (a neologism we introduce in this paper).
no code implementations • 25 Feb 2019 • Abram Demski, Scott Garrabrant
Traditional models of rational action treat the agent as though it is cleanly separated from its environment, and can act on that environment from the outside.
no code implementations • 13 Mar 2018 • David Manheim, Scott Garrabrant
There are several distinct failure modes for overoptimization of systems on the basis of metrics.
no code implementations • 12 Sep 2016 • Scott Garrabrant, Tsvi Benson-Tilsen, Andrew Critch, Nate Soares, Jessica Taylor
For instance, if the language is Peano arithmetic, the algorithm assigns probabilities to all arithmetical statements, including claims about the twin prime conjecture, the outputs of long-running computations, and its own probabilities.
no code implementations • 18 Apr 2016 • Scott Garrabrant, Nate Soares, Jessica Taylor
We study the problem of predicting the results of computations that are too expensive to run, via the observation of the results of smaller computations.
no code implementations • 18 Apr 2016 • Scott Garrabrant, Benya Fallenstein, Abram Demski, Nate Soares
While probability theory is normally applied to external environments, there has been some recent interest in probabilistic modeling of the outputs of computations that are too expensive to run.
no code implementations • 12 Oct 2015 • Scott Garrabrant, Siddharth Bhaskar, Abram Demski, Joanna Garrabrant, George Koleszarik, Evan Lloyd
We give an algorithm A which assigns probabilities to logical sentences.