Search Results for author: Tuomas Haarnoja

Found 15 papers, 6 papers with code

Replay across Experiments: A Natural Extension of Off-Policy RL

no code implementations • 27 Nov 2023 • Dhruva Tirumala, Thomas Lampe, Jose Enrique Chen, Tuomas Haarnoja, Sandy Huang, Guy Lever, Ben Moran, Tim Hertweck, Leonard Hasenclever, Martin Riedmiller, Nicolas Heess, Markus Wulfmeier

Replaying data is a principal mechanism underlying the stability and data efficiency of off-policy reinforcement learning (RL).

Reinforcement Learning (RL)

Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning

no code implementations • 26 Apr 2023 • Tuomas Haarnoja, Ben Moran, Guy Lever, Sandy H. Huang, Dhruva Tirumala, Jan Humplik, Markus Wulfmeier, Saran Tunyasuvunakool, Noah Y. Siegel, Roland Hafner, Michael Bloesch, Kristian Hartikainen, Arunkumar Byravan, Leonard Hasenclever, Yuval Tassa, Fereshteh Sadeghi, Nathan Batchelor, Federico Casarini, Stefano Saliceti, Charles Game, Neil Sreendra, Kushal Patel, Marlon Gwira, Andrea Huber, Nicole Hurley, Francesco Nori, Raia Hadsell, Nicolas Heess

We investigate whether Deep Reinforcement Learning (Deep RL) is able to synthesize sophisticated and safe movement skills for a low-cost, miniature humanoid robot that can be composed into complex behavioral strategies in dynamic environments.

reinforcement-learning

SkillS: Adaptive Skill Sequencing for Efficient Temporally-Extended Exploration

no code implementations • 24 Nov 2022 • Giulia Vezzani, Dhruva Tirumala, Markus Wulfmeier, Dushyant Rao, Abbas Abdolmaleki, Ben Moran, Tuomas Haarnoja, Jan Humplik, Roland Hafner, Michael Neunert, Claudio Fantacci, Tim Hertweck, Thomas Lampe, Fereshteh Sadeghi, Nicolas Heess, Martin Riedmiller

The ability to effectively reuse prior knowledge is a key requirement when building general and flexible Reinforcement Learning (RL) agents.

Reinforcement Learning (RL)

NeRF2Real: Sim2real Transfer of Vision-guided Bipedal Motion Skills using Neural Radiance Fields

no code implementations • 10 Oct 2022 • Arunkumar Byravan, Jan Humplik, Leonard Hasenclever, Arthur Brussee, Francesco Nori, Tuomas Haarnoja, Ben Moran, Steven Bohez, Fereshteh Sadeghi, Bojan Vujatovic, Nicolas Heess

A simulation is then created using the rendering engine in a physics simulator which computes contact dynamics from the static scene geometry (estimated from the NeRF volume density) and the dynamic objects' geometry and physical properties (assumed known).

Novel View Synthesis

Forgetting and Imbalance in Robot Lifelong Learning with Off-policy Data

no code implementations • 12 Apr 2022 • Wenxuan Zhou, Steven Bohez, Jan Humplik, Abbas Abdolmaleki, Dushyant Rao, Markus Wulfmeier, Tuomas Haarnoja, Nicolas Heess

We propose the Offline Distillation Pipeline to break this trade-off by separating the training procedure into an online interaction phase and an offline distillation phase. Second, we find that training with the imbalanced off-policy data from multiple environments across the lifetime creates a significant performance drop.

Reinforcement Learning (RL)

Imitate and Repurpose: Learning Reusable Robot Movement Skills From Human and Animal Behaviors

no code implementations • 31 Mar 2022 • Steven Bohez, Saran Tunyasuvunakool, Philemon Brakel, Fereshteh Sadeghi, Leonard Hasenclever, Yuval Tassa, Emilio Parisotto, Jan Humplik, Tuomas Haarnoja, Roland Hafner, Markus Wulfmeier, Michael Neunert, Ben Moran, Noah Siegel, Andrea Huber, Francesco Romano, Nathan Batchelor, Federico Casarini, Josh Merel, Raia Hadsell, Nicolas Heess

We investigate the use of prior knowledge of human and animal movement to learn reusable locomotion skills for real legged robots.

From Motor Control to Team Play in Simulated Humanoid Football

1 code implementation • 25 May 2021 • SiQi Liu, Guy Lever, Zhe Wang, Josh Merel, S. M. Ali Eslami, Daniel Hennes, Wojciech M. Czarnecki, Yuval Tassa, Shayegan Omidshafiei, Abbas Abdolmaleki, Noah Y. Siegel, Leonard Hasenclever, Luke Marris, Saran Tunyasuvunakool, H. Francis Song, Markus Wulfmeier, Paul Muller, Tuomas Haarnoja, Brendan D. Tracey, Karl Tuyls, Thore Graepel, Nicolas Heess

In a sequence of stages, players first learn to control a fully articulated body to perform realistic, human-like movements such as running and turning; they then acquire mid-level football skills such as dribbling and shooting; finally, they develop awareness of others and play as a team, bridging the gap between low-level motor control at a timescale of milliseconds, and coordinated goal-directed behaviour as a team at the timescale of tens of seconds.

Imitation Learning, Multi-agent Reinforcement Learning, +1

3,545 stars

Dynamical Distance Learning for Semi-Supervised and Unsupervised Skill Discovery

no code implementations • ICLR 2020 • Kristian Hartikainen, Xinyang Geng, Tuomas Haarnoja, Sergey Levine

We show that dynamical distances can be used in a semi-supervised regime, where unsupervised interaction with the environment is used to learn the dynamical distances, while a small amount of preference supervision is used to determine the task goal, without any manually engineered reward function or goal examples.

reinforcement-learning, Reinforcement Learning (RL)

Learning to Walk via Deep Reinforcement Learning

no code implementations • 26 Dec 2018 • Tuomas Haarnoja, Sehoon Ha, Aurick Zhou, Jie Tan, George Tucker, Sergey Levine

In this paper, we propose a sample-efficient deep RL algorithm based on maximum entropy RL that requires minimal per-task tuning and only a modest number of trials to learn neural network policies.

reinforcement-learning, Reinforcement Learning (RL)

Soft Actor-Critic Algorithms and Applications

50 code implementations • 13 Dec 2018 • Tuomas Haarnoja, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar, Henry Zhu, Abhishek Gupta, Pieter Abbeel, Sergey Levine

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms

Decision Making, reinforcement-learning, +1

10,371 stars

Latent Space Policies for Hierarchical Reinforcement Learning

no code implementations • ICML 2018 • Tuomas Haarnoja, Kristian Hartikainen, Pieter Abbeel, Sergey Levine

In contrast to methods that explicitly restrict or cripple lower layers of a hierarchy to force them to use higher-level modulating signals, each layer in our framework is trained to directly solve the task, but acquires a range of diverse strategies via a maximum entropy reinforcement learning objective.

Hierarchical Reinforcement Learning, reinforcement-learning, +1

Composable Deep Reinforcement Learning for Robotic Manipulation

1 code implementation • 19 Mar 2018 • Tuomas Haarnoja, Vitchyr Pong, Aurick Zhou, Murtaza Dalal, Pieter Abbeel, Sergey Levine

Second, we show that policies learned with soft Q-learning can be composed to create new policies, and that the optimality of the resulting policy can be bounded in terms of the divergence between the composed policies.

Q-Learning, reinforcement-learning, +1

409 stars

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

76 code implementations • ICML 2018 • Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, Sergey Levine

A platform for Applied Reinforcement Learning (Applied RL)

Ranked #1 on Continuous Control on Lunar Lander (OpenAI Gym)

Continuous Control, Decision Making, +3

31,107 stars

Reinforcement Learning with Deep Energy-Based Policies

3 code implementations • ICML 2017 • Tuomas Haarnoja, Haoran Tang, Pieter Abbeel, Sergey Levine

We propose a method for learning expressive energy-based policies for continuous states and actions, which has been feasible only in tabular domains before.

Q-Learning, reinforcement-learning, +1

2,551 stars

Backprop KF: Learning Discriminative Deterministic State Estimators

1 code implementation • NeurIPS 2016 • Tuomas Haarnoja, Anurag Ajay, Sergey Levine, Pieter Abbeel

We show that this procedure can be used to train state estimators that use complex input, such as raw camera images, which must be processed using expressive nonlinear function approximators such as convolutional neural networks.

Autonomous Vehicles, Visual Odometry
