Search Results for author: John Aslanides

Found 13 papers, 8 papers with code

Teaching language models to support answers with verified quotes

no code implementations21 Mar 2022 Jacob Menick, Maja Trebacz, Vladimir Mikulik, John Aslanides, Francis Song, Martin Chadwick, Mia Glaese, Susannah Young, Lucy Campbell-Gillingham, Geoffrey Irving, Nat McAleese

We measure the performance of GopherCite by conducting human evaluation of answers to questions in a subset of the NaturalQuestions and ELI5 datasets.

Fact Checking

Red Teaming Language Models with Language Models

no code implementations7 Feb 2022 Ethan Perez, Saffron Huang, Francis Song, Trevor Cai, Roman Ring, John Aslanides, Amelia Glaese, Nat McAleese, Geoffrey Irving

In this work, we automatically find cases where a target LM behaves in a harmful way, by generating test cases ("red teaming") using another LM.

Chatbot Language Modelling +1

Scaling Language Models: Methods, Analysis & Insights from Training Gopher

no code implementations NA 2021 Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor, Irina Higgins, Antonia Creswell, Nat McAleese, Amy Wu, Erich Elsen, Siddhant Jayakumar, Elena Buchatskaya, David Budden, Esme Sutherland, Karen Simonyan, Michela Paganini, Laurent SIfre, Lena Martens, Xiang Lorraine Li, Adhiguna Kuncoro, Aida Nematzadeh, Elena Gribovskaya, Domenic Donato, Angeliki Lazaridou, Arthur Mensch, Jean-Baptiste Lespiau, Maria Tsimpoukelli, Nikolai Grigorev, Doug Fritz, Thibault Sottiaux, Mantas Pajarskas, Toby Pohlen, Zhitao Gong, Daniel Toyama, Cyprien de Masson d'Autume, Yujia Li, Tayfun Terzi, Vladimir Mikulik, Igor Babuschkin, Aidan Clark, Diego de Las Casas, Aurelia Guy, Chris Jones, James Bradbury, Matthew Johnson, Blake Hechtman, Laura Weidinger, Iason Gabriel, William Isaac, Ed Lockhart, Simon Osindero, Laura Rimell, Chris Dyer, Oriol Vinyals, Kareem Ayoub, Jeff Stanway, Lorrayne Bennett, Demis Hassabis, Koray Kavukcuoglu, Geoffrey Irving

Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world.

Fact Checking Language Modelling +4

Divide-and-Conquer Monte Carlo Tree Search

no code implementations1 Jan 2021 Giambattista Parascandolo, Lars Holger Buesing, Josh Merel, Leonard Hasenclever, John Aslanides, Jessica B Hamrick, Nicolas Heess, Alexander Neitz, Theophane Weber

are constrained by an implicit sequential planning assumption: The order in which a plan is constructed is the same in which it is executed.

Continuous Control Decision Making

Acme: A Research Framework for Distributed Reinforcement Learning

3 code implementations1 Jun 2020 Matt Hoffman, Bobak Shahriari, John Aslanides, Gabriel Barth-Maron, Feryal Behbahani, Tamara Norman, Abbas Abdolmaleki, Albin Cassirer, Fan Yang, Kate Baumli, Sarah Henderson, Alex Novikov, Sergio Gómez Colmenarejo, Serkan Cabi, Caglar Gulcehre, Tom Le Paine, Andrew Cowie, Ziyu Wang, Bilal Piot, Nando de Freitas

Ultimately, we show that the design decisions behind Acme lead to agents that can be scaled both up and down and that, for the most part, greater levels of parallelization result in agents with equivalent performance, just faster.

DQN Replay Dataset reinforcement-learning

When to use parametric models in reinforcement learning?

2 code implementations NeurIPS 2019 Hado van Hasselt, Matteo Hessel, John Aslanides

We examine the question of when and how parametric models are most useful in reinforcement learning.

reinforcement-learning

TF-Replicator: Distributed Machine Learning for Researchers

1 code implementation1 Feb 2019 Peter Buchlovsky, David Budden, Dominik Grewe, Chris Jones, John Aslanides, Frederic Besse, Andy Brock, Aidan Clark, Sergio Gómez Colmenarejo, Aedan Pope, Fabio Viola, Dan Belov

We describe TF-Replicator, a framework for distributed machine learning designed for DeepMind researchers and implemented as an abstraction over TensorFlow.

Continuous Control Image Generation +1

Universal Reinforcement Learning Algorithms: Survey and Experiments

1 code implementation30 May 2017 John Aslanides, Jan Leike, Marcus Hutter

Many state-of-the-art reinforcement learning (RL) algorithms typically assume that the environment is an ergodic Markov Decision Process (MDP).

reinforcement-learning

AIXIjs: A Software Demo for General Reinforcement Learning

1 code implementation22 May 2017 John Aslanides

The universal Bayesian agent AIXI (Hutter, 2005) is a model of a maximally intelligent agent, and plays a central role in the sub-field of general reinforcement learning (GRL).

General Reinforcement Learning OpenAI Gym +1

Generalised Discount Functions applied to a Monte-Carlo AImu Implementation

1 code implementation3 Mar 2017 Sean Lamont, John Aslanides, Jan Leike, Marcus Hutter

We have added to the GRL simulation platform AIXIjs the functionality to assign an agent arbitrary discount functions, and an environment which can be used to determine the effect of discounting on an agent's policy.

General Reinforcement Learning reinforcement-learning

Cannot find the paper you are looking for? You can Submit a new open access paper.