Search Results for author: Jan Leike

Found 32 papers, 11 papers with code

Safe Deep RL in 3D Environments using Human Feedback

no code implementations · 20 Jan 2022 · Matthew Rahtz, Vikrant Varma, Ramana Kumar, Zachary Kenton, Shane Legg, Jan Leike

In this paper we answer this question in the affirmative, using ReQueST to train an agent to perform a 3D first-person object collection task using data entirely from human contractors.

Revealing the Incentive to Cause Distributional Shift

no code implementations · 29 Sep 2021 · David Krueger, Tegan Maharaj, Jan Leike

We use these unit tests to demonstrate that changes to the learning algorithm (e.g. introducing meta-learning) can cause previously hidden incentives to be revealed, resulting in qualitatively different behaviour despite no change in the performance metric.


Recursively Summarizing Books with Human Feedback

no code implementations · 22 Sep 2021 · Jeff Wu, Long Ouyang, Daniel M. Ziegler, Nisan Stiennon, Ryan Lowe, Jan Leike, Paul Christiano

Our human labelers are able to supervise and evaluate the models quickly, despite not having read the entire books themselves.

Abstractive Text Summarization · Question Answering

Institutionalising Ethics in AI through Broader Impact Requirements

no code implementations · 30 May 2021 · Carina Prunkl, Carolyn Ashurst, Markus Anderljung, Helena Webb, Jan Leike, Allan Dafoe

In 2020, the Conference on Neural Information Processing Systems (NeurIPS) introduced a requirement for submitting authors to include a statement on the broader societal impacts of their research.

Active Reinforcement Learning: Observing Rewards at a Cost

no code implementations · 13 Nov 2020 · David Krueger, Jan Leike, Owain Evans, John Salvatier

Active reinforcement learning (ARL) is a variant on reinforcement learning where the agent does not observe the reward unless it chooses to pay a query cost c > 0.

Multi-Armed Bandits · reinforcement-learning
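The query-cost mechanism from the abstract can be sketched as a toy Bernoulli bandit in which the reward is hidden unless the agent pays c to observe it. The epsilon-greedy policy, the fixed query probability, and all names below are illustrative assumptions for the ARL setting, not the paper's algorithm.

```python
import random

def active_bandit(arms, steps=1000, c=0.1, query_prob=0.2, seed=0):
    """Toy active RL bandit: rewards accrue every step, but the agent
    only *observes* a reward when it pays the query cost c.
    `arms` maps arm index -> true mean of a Bernoulli arm."""
    rng = random.Random(seed)
    counts = {a: 0 for a in arms}
    means = {a: 0.0 for a in arms}  # estimates from queried rewards only
    total = 0.0
    for _ in range(steps):
        # epsilon-greedy over the observed estimates
        if rng.random() < 0.1:
            a = rng.choice(list(arms))
        else:
            a = max(arms, key=lambda x: means[x])
        r = 1.0 if rng.random() < arms[a] else 0.0
        total += r
        # the agent decides (here: at random) whether to pay c to see r
        if rng.random() < query_prob:
            total -= c
            counts[a] += 1
            means[a] += (r - means[a]) / counts[a]
    return total / steps

print(active_bandit({0: 0.3, 1: 0.7}))
```

The interesting design question the paper raises is precisely the part stubbed out above: *when* to query, since each observation trades c of return for information.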

Hidden Incentives for Auto-Induced Distributional Shift

no code implementations · 19 Sep 2020 · David Krueger, Tegan Maharaj, Jan Leike

We introduce the term auto-induced distributional shift (ADS) to describe the phenomenon of an algorithm causing a change in the distribution of its own inputs.

Meta-Learning · Q-Learning

Quantifying Differences in Reward Functions

1 code implementation · ICLR 2021 · Adam Gleave, Michael Dennis, Shane Legg, Stuart Russell, Jan Leike

However, this method cannot distinguish between the learned reward function failing to reflect user preferences and the policy optimization process failing to optimize the learned reward.

Pitfalls of learning a reward function online

no code implementations · 28 Apr 2020 · Stuart Armstrong, Jan Leike, Laurent Orseau, Shane Legg

We formally introduce two desirable properties: the first is 'unriggability', which prevents the agent from steering the learning process in the direction of a reward function that is easier to optimise.

Learning Human Objectives by Evaluating Hypothetical Behavior

1 code implementation · ICML 2020 · Siddharth Reddy, Anca D. Dragan, Sergey Levine, Shane Legg, Jan Leike

To address this challenge, we propose an algorithm that safely and interactively learns a model of the user's reward function.

Car Racing

Scaling shared model governance via model splitting

no code implementations · ICLR 2019 · Miljan Martic, Jan Leike, Andrew Trask, Matteo Hessel, Shane Legg, Pushmeet Kohli

Currently the only techniques for sharing governance of a deep learning model are homomorphic encryption and secure multiparty computation.


Scalable agent alignment via reward modeling: a research direction

3 code implementations · 19 Nov 2018 · Jan Leike, David Krueger, Tom Everitt, Miljan Martic, Vishal Maini, Shane Legg

One obstacle to applying reinforcement learning algorithms to real-world problems is the lack of suitable reward functions.

Atari Games · reinforcement-learning

Learning to Understand Goal Specifications by Modelling Reward

1 code implementation · ICLR 2019 · Dzmitry Bahdanau, Felix Hill, Jan Leike, Edward Hughes, Arian Hosseini, Pushmeet Kohli, Edward Grefenstette

Recent work has shown that deep reinforcement-learning agents can learn to follow language-like instructions from infrequent environment rewards.

AI Safety Gridworlds

2 code implementations · 27 Nov 2017 · Jan Leike, Miljan Martic, Victoria Krakovna, Pedro A. Ortega, Tom Everitt, Andrew Lefrancq, Laurent Orseau, Shane Legg

We present a suite of reinforcement learning environments illustrating various safety properties of intelligent agents.

reinforcement-learning · Safe Exploration

Deep reinforcement learning from human preferences

6 code implementations · NeurIPS 2017 · Paul Christiano, Jan Leike, Tom B. Brown, Miljan Martic, Shane Legg, Dario Amodei

For sophisticated reinforcement learning (RL) systems to interact usefully with real-world environments, we need to communicate complex goals to these systems.

Atari Games · reinforcement-learning
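The core idea of this paper, learning a reward model from pairwise human comparisons of trajectory segments, uses a Bradley-Terry model: the probability that segment 1 is preferred is a softmax over the summed predicted rewards. A minimal sketch, assuming a linear reward and plain gradient ascent in place of the paper's neural network:

```python
import math

def p_prefer(r1, r2):
    """Bradley-Terry probability that segment 1 is preferred,
    given summed predicted rewards r1 and r2."""
    return 1.0 / (1.0 + math.exp(r2 - r1))

def train_reward(comparisons, dim, lr=0.5, epochs=200):
    """Fit a linear reward w.x from pairwise comparisons by gradient
    ascent on the log-likelihood. The linear form and optimizer are
    simplifications; the paper trains a deep reward model."""
    w = [0.0] * dim
    for _ in range(epochs):
        for x1, x2, pref in comparisons:  # pref = 1 if segment 1 preferred
            r1 = sum(wi * xi for wi, xi in zip(w, x1))
            r2 = sum(wi * xi for wi, xi in zip(w, x2))
            g = pref - p_prefer(r1, r2)  # d log-likelihood / d(r1 - r2)
            for i in range(dim):
                w[i] += lr * g * (x1[i] - x2[i])
    return w

# Synthetic check: the human consistently prefers the first feature.
data = [((1.0, 0.0), (0.0, 1.0), 1), ((0.0, 1.0), (1.0, 0.0), 0)]
w = train_reward(data, dim=2)
print(w[0] > w[1])  # → True: learned reward favours the first feature
```

The learned reward then replaces the environment reward for a standard RL algorithm, so a few hundred comparisons can stand in for millions of reward observations.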

Universal Reinforcement Learning Algorithms: Survey and Experiments

1 code implementation · 30 May 2017 · John Aslanides, Jan Leike, Marcus Hutter

Many state-of-the-art reinforcement learning (RL) algorithms typically assume that the environment is an ergodic Markov Decision Process (MDP).


Generalised Discount Functions applied to a Monte-Carlo AImu Implementation

1 code implementation · 3 Mar 2017 · Sean Lamont, John Aslanides, Jan Leike, Marcus Hutter

We have added to the GRL simulation platform AIXIjs the functionality to assign an agent arbitrary discount functions, and an environment which can be used to determine the effect of discounting on an agent's policy.

General Reinforcement Learning · reinforcement-learning

Nonparametric General Reinforcement Learning

no code implementations · 28 Nov 2016 · Jan Leike

However, there are Bayesian approaches to general RL that satisfy objective optimality guarantees: We prove that Thompson sampling is asymptotically optimal in stochastic environments in the sense that its value converges to the value of the optimal policy.

General Reinforcement Learning · reinforcement-learning

Exploration Potential

no code implementations · 16 Sep 2016 · Jan Leike

We introduce exploration potential, a quantity that measures how much a reinforcement learning agent has explored its environment class.

Multi-Armed Bandits · reinforcement-learning

A Formal Solution to the Grain of Truth Problem

no code implementations · 16 Sep 2016 · Jan Leike, Jessica Taylor, Benya Fallenstein

In this paper we present a formal and general solution to the full grain of truth problem: we construct a class of policies that contains all computable policies as well as Bayes-optimal policies for every lower semicomputable prior over the class.

Loss Bounds and Time Complexity for Speed Priors

no code implementations · 12 Apr 2016 · Daniel Filan, Marcus Hutter, Jan Leike

On a polynomial time computable sequence our speed prior is computable in exponential time.

Thompson Sampling is Asymptotically Optimal in General Environments

no code implementations · 25 Feb 2016 · Jan Leike, Tor Lattimore, Laurent Orseau, Marcus Hutter

We discuss a variant of Thompson sampling for nonparametric reinforcement learning in countable classes of general stochastic environments.
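Thompson sampling itself is easy to illustrate in the simplest special case, a finite-armed Bernoulli bandit with Beta(1,1) priors: sample an environment (here, an arm mean) from the posterior, then act greedily on the sample. This is a toy stand-in for the paper's countable classes of general stochastic environments, where the sampled object is an entire environment and the policy is followed for an effective horizon.

```python
import random

def thompson_bandit(true_means, steps=2000, seed=0):
    """Thompson sampling for Bernoulli bandits with Beta(1,1) priors.
    Returns how often each arm was pulled."""
    rng = random.Random(seed)
    k = len(true_means)
    alpha = [1] * k  # posterior Beta parameters: 1 + successes
    beta = [1] * k   # 1 + failures
    pulls = [0] * k
    for _ in range(steps):
        # sample a mean from each arm's posterior, act greedily on the sample
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(k)]
        a = max(range(k), key=lambda i: samples[i])
        r = 1 if rng.random() < true_means[a] else 0
        alpha[a] += r
        beta[a] += 1 - r
        pulls[a] += 1
    return pulls

pulls = thompson_bandit([0.2, 0.5, 0.8])
print(pulls)  # with enough steps, the best arm (index 2) dominates the pulls
```

Asymptotic optimality in the paper's sense means the value of this sampling policy converges to the optimal value; in the bandit toy that shows up as the posterior concentrating and suboptimal arms being pulled ever more rarely.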


On the Computability of AIXI

no code implementations · 19 Oct 2015 · Jan Leike, Marcus Hutter

Solomonoff induction and the reinforcement learning agent AIXI are proposed answers to this question.


Bad Universal Priors and Notions of Optimality

no code implementations · 16 Oct 2015 · Jan Leike, Marcus Hutter

A big open question of algorithmic information theory is the choice of the universal Turing machine (UTM).

Solomonoff Induction Violates Nicod's Criterion

no code implementations · 15 Jul 2015 · Jan Leike, Marcus Hutter

Nicod's criterion states that observing a black raven is evidence for the hypothesis H that all ravens are black.

On the Computability of Solomonoff Induction and Knowledge-Seeking

no code implementations · 15 Jul 2015 · Jan Leike, Marcus Hutter

Solomonoff induction is held as a gold standard for learning, but it is known to be incomputable.


Sequential Extensions of Causal and Evidential Decision Theory

no code implementations · 24 Jun 2015 · Tom Everitt, Jan Leike, Marcus Hutter

Moving beyond the dualistic view in AI where agent and environment are separated incurs new challenges for decision making, as calculation of expected utility is no longer straightforward.

Decision Making

Indefinitely Oscillating Martingales

no code implementations · 14 Aug 2014 · Jan Leike, Marcus Hutter

We construct a class of nonnegative martingale processes that oscillate indefinitely with high probability.
