Search Results for author: Josh Merel

Found 29 papers, 11 papers with code

CoMic: Co-Training and Mimicry for Reusable Skills

no code implementations ICML 2020 Leonard Hasenclever, Fabio Pardo, Raia Hadsell, Nicolas Heess, Josh Merel

Finally, we show that it is possible to interleave motion capture tracking with training on complementary tasks, enriching the resulting skill space and enabling the reuse of skills not well covered by the motion capture data, such as getting up from the ground or catching a ball.

Continuous Control, reinforcement-learning

Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies

no code implementations ICLR 2022 Dushyant Rao, Fereshteh Sadeghi, Leonard Hasenclever, Markus Wulfmeier, Martina Zambelli, Giulia Vezzani, Dhruva Tirumala, Yusuf Aytar, Josh Merel, Nicolas Heess, Raia Hadsell

We demonstrate in manipulation domains that the method can effectively cluster offline data into distinct, executable behaviours, while retaining the flexibility of a continuous latent variable model.

Divergent representations of ethological visual inputs emerge from supervised, unsupervised, and reinforcement learning

no code implementations 3 Dec 2021 Grace W. Lindsay, Josh Merel, Tom Mrsic-Flogel, Maneesh Sahani

Artificial neural systems trained using reinforcement, supervised, and unsupervised learning all acquire internal representations of high dimensional input.

reinforcement-learning, Transfer Learning

Learning Dynamics Models for Model Predictive Agents

no code implementations 29 Sep 2021 Michael Lutter, Leonard Hasenclever, Arunkumar Byravan, Gabriel Dulac-Arnold, Piotr Trochim, Nicolas Heess, Josh Merel, Yuval Tassa

This paper sets out to disambiguate the role of different design choices for learning dynamics models, by comparing their performance to planning with a ground-truth model -- the simulator.

Model-based Reinforcement Learning

From Motor Control to Team Play in Simulated Humanoid Football

1 code implementation 25 May 2021 SiQi Liu, Guy Lever, Zhe Wang, Josh Merel, S. M. Ali Eslami, Daniel Hennes, Wojciech M. Czarnecki, Yuval Tassa, Shayegan Omidshafiei, Abbas Abdolmaleki, Noah Y. Siegel, Leonard Hasenclever, Luke Marris, Saran Tunyasuvunakool, H. Francis Song, Markus Wulfmeier, Paul Muller, Tuomas Haarnoja, Brendan D. Tracey, Karl Tuyls, Thore Graepel, Nicolas Heess

In a sequence of stages, players first learn to control a fully articulated body to perform realistic, human-like movements such as running and turning; they then acquire mid-level football skills such as dribbling and shooting; finally, they develop awareness of others and play as a team, bridging the gap between low-level motor control at a timescale of milliseconds, and coordinated goal-directed behaviour as a team at the timescale of tens of seconds.

Decision Making, Imitation Learning, +2

Data augmentation for efficient learning from parametric experts

no code implementations NeurIPS 2021 Alexandre Galashov, Josh Merel, Nicolas Heess

We present a simple, yet powerful data-augmentation technique to enable data-efficient learning from parametric experts.

Data Augmentation

Divide-and-Conquer Monte Carlo Tree Search

no code implementations 1 Jan 2021 Giambattista Parascandolo, Lars Holger Buesing, Josh Merel, Leonard Hasenclever, John Aslanides, Jessica B Hamrick, Nicolas Heess, Alexander Neitz, Theophane Weber

Standard planners are constrained by an implicit sequential planning assumption: the order in which a plan is constructed is the same in which it is executed.

Continuous Control, Decision Making

Local Search for Policy Iteration in Continuous Control

no code implementations 12 Oct 2020 Jost Tobias Springenberg, Nicolas Heess, Daniel Mankowitz, Josh Merel, Arunkumar Byravan, Abbas Abdolmaleki, Jackie Kay, Jonas Degrave, Julian Schrittwieser, Yuval Tassa, Jonas Buchli, Dan Belov, Martin Riedmiller

We demonstrate that additional computation spent on model-based policy improvement during learning can improve data efficiency, and confirm that model-based policy improvement during action selection can also be beneficial.

Continuous Control

Learning to swim in potential flow

1 code implementation 30 Sep 2020 Yusheng Jiao, Feng Ling, Sina Heydari, Nicolas Heess, Josh Merel, Eva Kanso

To address the problem of underwater motion planning, we propose a simple model of a three-link fish swimming in a potential flow environment and we use model-free reinforcement learning for shape control.

Motion Planning, reinforcement-learning

Critic Regularized Regression

3 code implementations NeurIPS 2020 Ziyu Wang, Alexander Novikov, Konrad Zolna, Jost Tobias Springenberg, Scott Reed, Bobak Shahriari, Noah Siegel, Josh Merel, Caglar Gulcehre, Nicolas Heess, Nando de Freitas

Offline reinforcement learning (RL), also known as batch RL, offers the prospect of policy optimization from large pre-recorded datasets without online environment interaction.

Offline RL, reinforcement-learning

RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning

2 code implementations 24 Jun 2020 Caglar Gulcehre, Ziyu Wang, Alexander Novikov, Tom Le Paine, Sergio Gomez Colmenarejo, Konrad Zolna, Rishabh Agarwal, Josh Merel, Daniel Mankowitz, Cosmin Paduraru, Gabriel Dulac-Arnold, Jerry Li, Mohammad Norouzi, Matt Hoffman, Ofir Nachum, George Tucker, Nicolas Heess, Nando de Freitas

We hope that our suite of benchmarks will increase the reproducibility of experiments and make it possible to study challenging tasks with a limited computational budget, thus making RL research both more systematic and more accessible across the community.

Atari Games, DQN Replay Dataset, +2

dm_control: Software and Tasks for Continuous Control

1 code implementation 22 Jun 2020 Yuval Tassa, Saran Tunyasuvunakool, Alistair Muldal, Yotam Doron, Piotr Trochim, Si-Qi Liu, Steven Bohez, Josh Merel, Tom Erez, Timothy Lillicrap, Nicolas Heess

The dm_control software package is a collection of Python libraries and task suites for reinforcement learning agents in an articulated-body simulation.

Continuous Control, reinforcement-learning
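dm_control environments follow a reset/step loop in which each call returns a TimeStep-like object. The sketch below mirrors that interface with a pure-Python stub so it runs without the package installed; `StubEnv` and `TimeStep` here are stand-ins, not the real library, which loads tasks via `dm_control.suite.load`.

```python
# Toy mirror of the dm_control environment loop: the environment returns
# TimeStep-like objects and a random agent acts until the episode ends.
import random
from dataclasses import dataclass

@dataclass
class TimeStep:
    reward: float
    observation: float
    step_type: str  # "FIRST", "MID", or "LAST"

    def last(self):
        return self.step_type == "LAST"

class StubEnv:
    """Stand-in for a loaded task; counts steps up to a fixed horizon."""
    def __init__(self, horizon=10):
        self.horizon, self.t = horizon, 0

    def reset(self):
        self.t = 0
        return TimeStep(0.0, 0.0, "FIRST")

    def step(self, action):
        self.t += 1
        kind = "LAST" if self.t >= self.horizon else "MID"
        return TimeStep(1.0, float(self.t), kind)

def run_episode(env):
    """Run one episode with a random policy and return the total reward."""
    total, ts = 0.0, env.reset()
    while not ts.last():
        ts = env.step(random.uniform(-1.0, 1.0))  # random action in [-1, 1]
        total += ts.reward
    return total
```

With the real package the loop is identical, except the environment comes from `suite.load(...)` and actions are sampled to match `env.action_spec()`.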

Deep neuroethology of a virtual rodent

no code implementations ICLR 2020 Josh Merel, Diego Aldarondo, Jesse Marshall, Yuval Tassa, Greg Wayne, Bence Ölveczky

In this work, we develop a virtual rodent as a platform for the grounded study of motor activity in artificial models of embodied control.

Catch & Carry: Reusable Neural Controllers for Vision-Guided Whole-Body Tasks

no code implementations 15 Nov 2019 Josh Merel, Saran Tunyasuvunakool, Arun Ahuja, Yuval Tassa, Leonard Hasenclever, Vu Pham, Tom Erez, Greg Wayne, Nicolas Heess

We address the longstanding challenge of producing flexible, realistic humanoid character controllers that can perform diverse whole-body tasks involving object interactions.

Emergent Coordination Through Competition

no code implementations ICLR 2019 Si-Qi Liu, Guy Lever, Josh Merel, Saran Tunyasuvunakool, Nicolas Heess, Thore Graepel

We study the emergence of cooperative behaviors in reinforcement learning agents by introducing a challenging competitive multi-agent soccer environment with continuous simulated physics.

Continuous Control, reinforcement-learning

Neural probabilistic motor primitives for humanoid control

no code implementations ICLR 2019 Josh Merel, Leonard Hasenclever, Alexandre Galashov, Arun Ahuja, Vu Pham, Greg Wayne, Yee Whye Teh, Nicolas Heess

We focus on the problem of learning a single motor module that can flexibly express a range of behaviors for the control of high-dimensional physically simulated humanoids.

Graph networks as learnable physics engines for inference and control

1 code implementation ICML 2018 Alvaro Sanchez-Gonzalez, Nicolas Heess, Jost Tobias Springenberg, Josh Merel, Martin Riedmiller, Raia Hadsell, Peter Battaglia

Understanding and interacting with everyday physical scenes requires rich knowledge about the structure of the world, represented either implicitly in a value or policy function, or explicitly in a transition model.

DeepMind Control Suite

4 code implementations 2 Jan 2018 Yuval Tassa, Yotam Doron, Alistair Muldal, Tom Erez, Yazhe Li, Diego de Las Casas, David Budden, Abbas Abdolmaleki, Josh Merel, Andrew Lefrancq, Timothy Lillicrap, Martin Riedmiller

The DeepMind Control Suite is a set of continuous control tasks with a standardised structure and interpretable rewards, intended to serve as performance benchmarks for reinforcement learning agents.

Continuous Control, reinforcement-learning

Robust Imitation of Diverse Behaviors

no code implementations NeurIPS 2017 Ziyu Wang, Josh Merel, Scott Reed, Greg Wayne, Nando de Freitas, Nicolas Heess

Compared to purely supervised methods, Generative Adversarial Imitation Learning (GAIL) can learn more robust controllers from fewer demonstrations, but is inherently mode-seeking and more difficult to train.

Imitation Learning

Learning human behaviors from motion capture by adversarial imitation

1 code implementation 7 Jul 2017 Josh Merel, Yuval Tassa, Dhruva TB, Sriram Srinivasan, Jay Lemmon, Ziyu Wang, Greg Wayne, Nicolas Heess

Rapid progress in deep reinforcement learning has made it increasingly feasible to train controllers for high-dimensional humanoid bodies.

Imitation Learning, reinforcement-learning

Emergence of Locomotion Behaviours in Rich Environments

6 code implementations 7 Jul 2017 Nicolas Heess, Dhruva TB, Srinivasan Sriram, Jay Lemmon, Josh Merel, Greg Wayne, Yuval Tassa, Tom Erez, Ziyu Wang, S. M. Ali Eslami, Martin Riedmiller, David Silver

The reinforcement learning paradigm allows, in principle, for complex behaviours to be learned directly from simple reward signals.


Neuroprosthetic decoder training as imitation learning

no code implementations 13 Nov 2015 Josh Merel, David Carlson, Liam Paninski, John P. Cunningham

We describe how training a decoder in this way is a novel variant of an imitation learning problem, where an oracle or expert is employed for supervised training in lieu of direct observations, which are not available.

Imitation Learning

Bayesian spike inference from calcium imaging data

5 code implementations 27 Nov 2013 Eftychios A. Pnevmatikakis, Josh Merel, Ari Pakman, Liam Paninski

We present efficient Bayesian methods for extracting neuronal spiking information from calcium imaging data.

Neurons and Cognition, Quantitative Methods, Applications
