no code implementations • • Lei Yu, Laurent Sartran, Po-Sen Huang, Wojciech Stokowiec, Domenic Donato, Srivatsan Srinivasan, Alek Andreev, Wang Ling, Sona Mokra, Agustin Dal Lago, Yotam Doron, Susannah Young, Phil Blunsom, Chris Dyer
This paper describes the DeepMind submission to the Chinese\rightarrowEnglish constrained data track of the WMT2020 Shared Task on News Translation.
The dm_control software package is a collection of Python libraries and task suites for reinforcement learning agents in an articulated-body simulation.
Recent breakthroughs in adversarial generative modeling have led to models capable of producing video samples of high quality, even on large and complex datasets of real-world video.
Ranked #6 on Video Prediction on Kinetics-600 12 frames, 64x64
2 code implementations • • Ian Osband, Yotam Doron, Matteo Hessel, John Aslanides, Eren Sezener, Andre Saraiva, Katrina McKinney, Tor Lattimore, Csaba Szepesvari, Satinder Singh, Benjamin Van Roy, Richard Sutton, David Silver, Hado van Hasselt
bsuite is a collection of carefully-designed experiments that investigate core capabilities of reinforcement learning (RL) agents with two objectives.
In this work, we investigate the impact of the deadly triad in practice, in the context of a family of popular deep reinforcement learning models - deep Q-networks trained with experience replay - analysing how the components of this system play a role in the emergence of the deadly triad, and in the agent's performance
In this work we aim to solve a large collection of tasks using a single reinforcement learning agent with a single set of parameters.
Ranked #3 on Atari Games on Atari 2600 Skiing (using extra training data)
7 code implementations • 2 Jan 2018 • Yuval Tassa, Yotam Doron, Alistair Muldal, Tom Erez, Yazhe Li, Diego de Las Casas, David Budden, Abbas Abdolmaleki, Josh Merel, Andrew Lefrancq, Timothy Lillicrap, Martin Riedmiller
The DeepMind Control Suite is a set of continuous control tasks with a standardised structure and interpretable rewards, intended to serve as performance benchmarks for reinforcement learning agents.