1 code implementation • 18 Jul 2024 • Jan Hendrik Kirchner, Yining Chen, Harri Edwards, Jan Leike, Nat McAleese, Yuri Burda
One way to increase confidence in the outputs of Large Language Models (LLMs) is to support them with reasoning that is clear and easy to check -- a property we call legibility.
6 code implementations • 6 Jan 2022 • Alethea Power, Yuri Burda, Harri Edwards, Igor Babuschkin, Vedant Misra
In this paper we propose to study generalization of neural networks on small algorithmically generated datasets.
13 code implementations • 7 Jul 2021 • Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira Pinto, Jared Kaplan, Harri Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger, Michael Petrov, Heidy Khlaaf, Girish Sastry, Pamela Mishkin, Brooke Chan, Scott Gray, Nick Ryder, Mikhail Pavlov, Alethea Power, Lukasz Kaiser, Mohammad Bavarian, Clemens Winter, Philippe Tillet, Felipe Petroski Such, Dave Cummings, Matthias Plappert, Fotios Chantzis, Elizabeth Barnes, Ariel Herbert-Voss, William Hebgen Guss, Alex Nichol, Alex Paino, Nikolas Tezak, Jie Tang, Igor Babuschkin, Suchir Balaji, Shantanu Jain, William Saunders, Christopher Hesse, Andrew N. Carr, Jan Leike, Josh Achiam, Vedant Misra, Evan Morikawa, Alec Radford, Matthew Knight, Miles Brundage, Mira Murati, Katie Mayer, Peter Welinder, Bob McGrew, Dario Amodei, Sam McCandlish, Ilya Sutskever, Wojciech Zaremba
We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities.
Ranked #1 on Multi-task Language Understanding on BBH-alg
21 code implementations • ICLR 2019 • Yuri Burda, Harrison Edwards, Amos Storkey, Oleg Klimov
In particular we establish state of the art performance on Montezuma's Revenge, a game famously difficult for deep reinforcement learning methods.
5 code implementations • ICLR 2019 • Yuri Burda, Harri Edwards, Deepak Pathak, Amos Storkey, Trevor Darrell, Alexei A. Efros
However, annotating each environment with hand-designed, dense rewards is not scalable, motivating the need for developing reward functions that are intrinsic to the agent.
Ranked #14 on Atari Games on Atari 2600 Montezuma's Revenge
1 code implementation • ICLR 2018 • Maruan Al-Shedivat, Trapit Bansal, Yuri Burda, Ilya Sutskever, Igor Mordatch, Pieter Abbeel
Ability to continuously learn and adapt from limited experience in nonstationary environments is an important milestone on the path towards general intelligence.
2 code implementations • 14 Nov 2016 • Yuhuai Wu, Yuri Burda, Ruslan Salakhutdinov, Roger Grosse
The past several years have seen remarkable progress in generative models which produce convincing samples of images and other modalities.
24 code implementations • 1 Sep 2015 • Yuri Burda, Roger Grosse, Ruslan Salakhutdinov
The variational autoencoder (VAE; Kingma, Welling (2014)) is a recently proposed generative model pairing a top-down generative network with a bottom-up recognition network which approximates posterior inference.
no code implementations • 30 Dec 2014 • Yuri Burda, Roger B. Grosse, Ruslan Salakhutdinov
Markov random fields (MRFs) are difficult to evaluate as generative models because computing the test log-probabilities requires the intractable partition function.