no code implementations • 27 Feb 2025 • Karolina Stańczak, Nicholas Meade, Mehar Bhatia, Hattie Zhou, Konstantin Böttinger, Jeremy Barnes, Jason Stanley, Jessica Montgomery, Richard Zemel, Nicolas Papernot, Nicolas Chapados, Denis Therien, Timothy P. Lillicrap, Ana Marasović, Sylvie Delacroix, Gillian K. Hadfield, Siva Reddy
Recent progress in large language models (LLMs) has focused on producing responses that meet human expectations and align with shared values, a process known as alignment.
2 code implementations • 1 May 2024 • Yuxi Xie, Anirudh Goyal, Wenyue Zheng, Min-Yen Kan, Timothy P. Lillicrap, Kenji Kawaguchi, Michael Shieh
We introduce an approach that enhances the reasoning capabilities of Large Language Models (LLMs) through an iterative preference learning process inspired by the strategy employed by AlphaZero.
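The snippet above leaves the training objective implicit; as a rough illustration, here is a minimal DPO-style preference loss in PyTorch, assuming preference pairs are obtained by ranking sampled reasoning traces with a tree-search value estimate (function names and numbers below are hypothetical):

```python
import torch
import torch.nn.functional as F

def preference_loss(logp_chosen, logp_rejected,
                    ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO-style loss on (chosen, rejected) reasoning traces.

    In an AlphaZero-inspired loop, each pair would come from ranking
    candidate traces by a tree-search value estimate, then re-fitting
    the policy on the induced preferences, iteratively.
    """
    # Log-ratios of the policy against a frozen reference model.
    chosen = logp_chosen - ref_logp_chosen
    rejected = logp_rejected - ref_logp_rejected
    # Push the policy toward traces that search scored higher.
    return -F.logsigmoid(beta * (chosen - rejected)).mean()

# Toy usage with made-up log-probabilities for two preference pairs.
loss = preference_loss(torch.tensor([-5.0, -6.0]), torch.tensor([-7.0, -6.5]),
                       torch.tensor([-5.5, -6.2]), torch.tensor([-6.8, -6.4]))
```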
1 code implementation • NeurIPS 2021 • Roman Pogodin, Yash Mehta, Timothy P. Lillicrap, Peter E. Latham
This requires the network to pause occasionally for a sleep-like phase of "weight sharing".
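A minimal sketch of what such a phase could look like for a locally connected layer that stores one filter per spatial position; the shapes and the plain-averaging rule below are illustrative, not the paper's exact procedure:

```python
import numpy as np

def sleep_phase_weight_sharing(local_filters):
    """Sleep-like "weight sharing" pass for a locally connected layer.

    local_filters: (num_positions, k) array, one filter per spatial
    position. The sleep phase replaces every position's filter with the
    average filter, mimicking the weight tying a convolution gets for free.
    """
    shared = local_filters.mean(axis=0, keepdims=True)
    return np.broadcast_to(shared, local_filters.shape).copy()

filters = np.random.randn(8, 9)                 # 8 positions, 3x3 filters flattened
filters = sleep_phase_weight_sharing(filters)   # all positions now identical
```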
1 code implementation • NeurIPS 2020 • Chongli Qin, Yan Wu, Jost Tobias Springenberg, Andrew Brock, Jeff Donahue, Timothy P. Lillicrap, Pushmeet Kohli
From this perspective, we hypothesise that instabilities in training GANs arise from the integration error in discretising the continuous dynamics.
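Under this hypothesis, simultaneous gradient descent/ascent is just the explicit Euler discretisation of the underlying ODE, so a higher-order integrator should reduce the error. A minimal sketch using Heun's method on a toy rotational vector field (the classic failure case for Euler steps); `grad_field` stands in for the joint generator/discriminator update direction:

```python
import numpy as np

def heun_step(params, grad_field, lr):
    """One second-order (Heun) integration step of the training dynamics."""
    k1 = grad_field(params)
    k2 = grad_field(params + lr * k1)
    return params + 0.5 * lr * (k1 + k2)

# Pure rotation: Euler steps spiral outward here, Heun stays far closer.
field = lambda p: np.array([-p[1], p[0]])
p = np.array([1.0, 0.0])
for _ in range(100):
    p = heun_step(p, field, lr=0.1)
```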
6 code implementations • ICLR 2020 • Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Timothy P. Lillicrap
We present the Compressive Transformer, an attentive sequence model which compresses past memories for long-range sequence learning.
Ranked #2 on Language Modelling on Hutter Prize
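A minimal sketch of the core memory mechanism, using strided mean-pooling as the compression function (one of several the paper considers); shapes and names are illustrative:

```python
import torch

def compress_oldest(memory, comp_memory, n_evict, rate=3):
    """Move the oldest activations from the short-term memory into a
    smaller compressed memory instead of discarding them.

    memory:      (seq, d) FIFO of past hidden states
    comp_memory: (comp_seq, d) compressed memory
    rate:        compression rate c; non-overlapping mean-pooling here
                 (the paper also studies convolutional and learned,
                 loss-driven compression functions).
    """
    old, memory = memory[:n_evict], memory[n_evict:]
    compressed = old.reshape(-1, rate, old.shape[-1]).mean(dim=1)
    comp_memory = torch.cat([comp_memory, compressed], dim=0)
    return memory, comp_memory

mem, cmem = torch.randn(12, 16), torch.zeros(0, 16)
mem, cmem = compress_oldest(mem, cmem, n_evict=6)   # cmem is now (2, 16)
```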
no code implementations • ICLR 2020 • Sergey Bartunov, Jack W. Rae, Simon Osindero, Timothy P. Lillicrap
We study the problem of learning associative memory -- a system that can retrieve a remembered pattern from a distorted or incomplete version of it.
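For orientation, a classical Hopfield network is one simple hand-designed baseline for this task (the paper itself studies learned, meta-optimised memories rather than this fixed rule):

```python
import numpy as np

def hopfield_retrieve(patterns, probe, steps=10):
    """Retrieve a stored +/-1 pattern from a distorted probe.

    patterns: (n, d) matrix of stored patterns
    probe:    (d,) corrupted query vector
    """
    W = patterns.T @ patterns / patterns.shape[1]   # Hebbian outer products
    np.fill_diagonal(W, 0.0)                        # no self-connections
    x = probe.copy()
    for _ in range(steps):
        x = np.sign(W @ x)                          # step toward a stored pattern
        x[x == 0] = 1
    return x

stored = np.sign(np.random.randn(3, 64))
noisy = stored[0] * np.sign(np.random.rand(64) - 0.1)  # flip ~10% of bits
recalled = hopfield_retrieve(stored, noisy)
```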
no code implementations • 27 Sep 2019 • Sebastien Racaniere, Andrew K. Lampinen, Adam Santoro, David P. Reichert, Vlad Firoiu, Timothy P. Lillicrap
We demonstrate the success of our approach in rich but sparsely rewarding 2D and 3D environments, where an agent must achieve a single goal drawn from a set of possible goals that varies between episodes, and we identify challenges for future work.
no code implementations • 15 Jul 2019 • Timothy P. Lillicrap, Konrad P. Kording
By analogy, we conjecture that the rules for development and learning in brains may be far easier to understand than their resulting properties.
no code implementations • ICLR 2019 • Jack W. Rae, Sergey Bartunov, Timothy P. Lillicrap
There has been a recent trend of training neural networks to replace hand-crafted data structures, with the aim of faster execution, better accuracy, or greater compression.
no code implementations • 5 Dec 2018 • Jonathan J. Hunt, Andre Barreto, Timothy P. Lillicrap, Nicolas Heess
Composing previously mastered skills to solve novel tasks promises dramatic improvements in the data efficiency of reinforcement learning.
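One naive composition recipe, for contrast, is generalised policy improvement over tabular Q-functions: act with whichever previously mastered skill promises the most value in the current state. The paper studies corrections that go beyond this kind of max-composition; the Q-tables below are random placeholders:

```python
import numpy as np

def gpi_action(q_tables, state):
    """Pick the best action across skills (generalised policy improvement).

    q_tables: list of (n_states, n_actions) Q-tables, one per skill.
    """
    q = np.stack([qt[state] for qt in q_tables])   # (n_skills, n_actions)
    return int(q.max(axis=0).argmax())             # best action under the best skill

skills = [np.random.rand(5, 3) for _ in range(4)]  # 4 skills, 5 states, 3 actions
action = gpi_action(skills, state=2)
```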
no code implementations • ICLR 2019 • David Rolnick, Arun Ahuja, Jonathan Schwarz, Timothy P. Lillicrap, Greg Wayne
We examine this issue in the context of reinforcement learning, in a setting where an agent is exposed to tasks in a sequence.
no code implementations • ICML 2018 • Jack W. Rae, Chris Dyer, Peter Dayan, Timothy P. Lillicrap
Neural networks trained with backpropagation often struggle to identify classes that have been observed a small number of times.
Ranked #68 on Language Modelling on WikiText-103
no code implementations • ICML 2017 • Yutian Chen, Matthew W. Hoffman, Sergio Gomez Colmenarejo, Misha Denil, Timothy P. Lillicrap, Matt Botvinick, Nando de Freitas
We learn recurrent neural network optimizers trained on simple synthetic functions by gradient descent.
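A minimal PyTorch sketch of the setup: a coordinate-wise LSTM maps each parameter's gradient to an update, and the LSTM's own weights are meta-trained by unrolling it on synthetic objectives (sizes and names below are illustrative):

```python
import torch
import torch.nn as nn

class RNNOptimizer(nn.Module):
    """Tiny learned optimizer: gradient in, parameter update out, per coordinate."""
    def __init__(self, hidden=20):
        super().__init__()
        self.lstm = nn.LSTMCell(1, hidden)
        self.out = nn.Linear(hidden, 1)

    def step(self, grad, state):
        g = grad.reshape(-1, 1)              # one coordinate per row
        h, c = self.lstm(g, state)
        return self.out(h).reshape(grad.shape), (h, c)

# One inner step on a toy quadratic; meta-training would unroll many such
# steps and backpropagate the final loss into the LSTM's own parameters.
opt = RNNOptimizer()
theta = torch.randn(8, requires_grad=True)
state = (torch.zeros(8, 20), torch.zeros(8, 20))
loss = (theta ** 2).sum()
grad, = torch.autograd.grad(loss, theta, create_graph=True)
update, state = opt.step(grad, state)
theta = theta + update
```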
no code implementations • NeurIPS 2016 • Jack W. Rae, Jonathan J. Hunt, Tim Harley, Ivo Danihelka, Andrew Senior, Greg Wayne, Alex Graves, Timothy P. Lillicrap
SAM learns with comparable data efficiency to existing models on a range of synthetic tasks and one-shot Omniglot character recognition, and can scale to tasks requiring 100,000s of time steps and memories.
Ranked #6 on Question Answering on bAbI (Mean Error Rate metric)
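A toy sketch of the key access pattern, using a dense top-k search for brevity (SAM itself relies on approximate nearest-neighbour indexing so that reads stay sublinear in memory size):

```python
import torch

def sparse_read(memory, query, k=4):
    """Attend to only the top-k most similar memory slots, not all of them."""
    scores = memory @ query                   # (num_slots,) similarities
    top_val, top_idx = scores.topk(k)
    weights = torch.softmax(top_val, dim=0)   # attention over k slots only
    return weights @ memory[top_idx]          # (d,) read vector

memory = torch.randn(100_000, 32)             # large external memory
read = sparse_read(memory, torch.randn(32))
```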
1 code implementation • 1 Oct 2016 • Jordan Guergiuev, Timothy P. Lillicrap, Blake A. Richards
Deep learning has led to significant advances in artificial intelligence, in part by adopting strategies motivated by neurophysiology.
70 code implementations • 4 Feb 2016 • Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu
We propose a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers.
Ranked #9 on Atari Games on Atari 2600 Star Gunner
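A toy sketch of the asynchronous update pattern: several workers write lock-free to shared parameters, Hogwild-style, with a synthetic quadratic objective standing in for each actor-learner's rollout gradient (Python threads interleave rather than truly parallelise, which is enough to show the scheme):

```python
import threading
import numpy as np

theta = np.zeros(10)   # shared parameters, updated without locks

def worker(seed, steps=1000, lr=0.01):
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        target = rng.normal(size=theta.shape)   # stand-in for a rollout's signal
        grad = theta - target                   # grad of 0.5 * ||theta - target||^2
        theta[:] = theta - lr * grad            # in-place write to shared memory

threads = [threading.Thread(target=worker, args=(s,)) for s in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```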
3 code implementations • 14 Dec 2015 • Nicolas Heess, Jonathan J. Hunt, Timothy P. Lillicrap, David Silver
Partially observed control problems are a challenging aspect of reinforcement learning.
161 code implementations • 9 Sep 2015 • Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra
We adapt the ideas underlying the success of Deep Q-Learning to the continuous action domain.
Ranked #2 on OpenAI Gym on HalfCheetah-v4
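A minimal single-transition sketch of the resulting actor-critic updates (toy dimensions; a real agent adds a replay buffer, exploration noise, and soft target-network updates):

```python
import copy
import torch
import torch.nn as nn

obs_dim, act_dim, gamma = 3, 1, 0.99
actor = nn.Sequential(nn.Linear(obs_dim, 32), nn.Tanh(), nn.Linear(32, act_dim), nn.Tanh())
critic = nn.Sequential(nn.Linear(obs_dim + act_dim, 32), nn.Tanh(), nn.Linear(32, 1))
actor_t, critic_t = copy.deepcopy(actor), copy.deepcopy(critic)   # target networks
a_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
c_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

# One fictitious transition (s, a, r, s'); in practice this is a replay batch.
s, a = torch.randn(1, obs_dim), torch.randn(1, act_dim)
r, s2 = torch.randn(1, 1), torch.randn(1, obs_dim)

# Critic: regress Q(s, a) toward the bootstrapped target from the target nets.
with torch.no_grad():
    y = r + gamma * critic_t(torch.cat([s2, actor_t(s2)], dim=1))
critic_loss = (critic(torch.cat([s, a], dim=1)) - y).pow(2).mean()
c_opt.zero_grad()
critic_loss.backward()
c_opt.step()

# Actor: deterministic policy gradient -- ascend the critic's value of pi(s).
actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
a_opt.zero_grad()
actor_loss.backward()
a_opt.step()
```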
1 code implementation • 2 Nov 2014 • Timothy P. Lillicrap, Daniel Cownden, Douglas B. Tweed, Colin J. Akerman
In machine learning, the backpropagation algorithm assigns blame to a neuron by computing exactly how it contributed to an error.
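The paper's finding is that this exact transport of error is unnecessary: propagating errors through a fixed random feedback matrix delivers useful teaching signals, because the forward weights learn to align with it. A minimal numpy sketch of such a feedback-alignment update for a two-layer network:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 4, 16, 2
W1 = rng.normal(size=(n_hid, n_in))
W2 = rng.normal(size=(n_out, n_hid))
B = rng.normal(size=(n_hid, n_out))     # fixed random feedback weights

def fa_step(x, y, lr=0.01):
    global W1, W2
    h = np.tanh(W1 @ x)                 # forward pass
    e = W2 @ h - y                      # output error
    dh = (B @ e) * (1 - h ** 2)         # assign blame via B, not W2.T
    W2 -= lr * np.outer(e, h)
    W1 -= lr * np.outer(dh, x)

x, y = rng.normal(size=n_in), rng.normal(size=n_out)
for _ in range(200):
    fa_step(x, y)                       # error shrinks despite random feedback
```

The only change from backpropagation is the single line computing `dh`; with `W2.T` in place of `B` it reduces to the exact gradient.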