Search Results for author: Tuomas Haarnoja

Found 15 papers, 6 papers with code

NeRF2Real: Sim2real Transfer of Vision-guided Bipedal Motion Skills using Neural Radiance Fields

no code implementations 10 Oct 2022 Arunkumar Byravan, Jan Humplik, Leonard Hasenclever, Arthur Brussee, Francesco Nori, Tuomas Haarnoja, Ben Moran, Steven Bohez, Fereshteh Sadeghi, Bojan Vujatovic, Nicolas Heess

A simulation is then created using the rendering engine in a physics simulator, which computes contact dynamics from the static scene geometry (estimated from the NeRF volume density) and the dynamic objects' geometry and physical properties (assumed known).
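As a rough illustration of the geometry step described above, here is a minimal sketch of turning a sampled NeRF density grid into static collision geometry via marching cubes. The regular pre-sampled grid, the occupancy threshold, and the voxel size are illustrative assumptions, not details from the paper.

```python
# Minimal sketch: NeRF density grid -> triangle mesh for a physics simulator.
# Assumptions (not from the paper): densities are pre-sampled on a regular
# grid, and a fixed threshold separates occupied from free space.
import numpy as np
from skimage import measure  # pip install scikit-image

def density_grid_to_mesh(density, threshold=50.0, voxel_size=0.05):
    """Extract a triangle mesh from a sampled volume density grid."""
    verts, faces, _, _ = measure.marching_cubes(density, level=threshold)
    return verts * voxel_size, faces  # scale vertices to world units

# Toy usage: a synthetic "density" sphere standing in for a real NeRF volume.
radius = np.linalg.norm(np.mgrid[-16:16, -16:16, -16:16], axis=0)
density = np.where(radius < 10, 100.0, 0.0)
verts, faces = density_grid_to_mesh(density)
print(verts.shape, faces.shape)  # mesh usable as static scene geometry
```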

Novel View Synthesis

Forgetting and Imbalance in Robot Lifelong Learning with Off-policy Data

no code implementations 12 Apr 2022 Wenxuan Zhou, Steven Bohez, Jan Humplik, Abbas Abdolmaleki, Dushyant Rao, Markus Wulfmeier, Tuomas Haarnoja, Nicolas Heess

We propose the Offline Distillation Pipeline to break this trade-off by separating the training procedure into an online interaction phase and an offline distillation phase. We also find that training with the imbalanced off-policy data from multiple environments across the lifetime creates a significant performance drop.
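A minimal sketch of the offline distillation phase, assuming (hypothetically) that observations and teacher actions have already been pooled across the lifetime, and using a plain behavioural-cloning loss as a stand-in for the paper's distillation objective:

```python
import torch
import torch.nn as nn

# Hypothetical pooled lifetime data: observations and matching teacher actions.
obs = torch.randn(256, 8)
teacher_actions = torch.randn(256, 2)

student = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

for epoch in range(20):
    # Distillation as supervised learning: the student matches the teachers'
    # actions on pooled data instead of training off-policy on it directly.
    loss = nn.functional.mse_loss(student(obs), teacher_actions)
    opt.zero_grad()
    loss.backward()
    opt.step()
```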

Reinforcement Learning (RL)

From Motor Control to Team Play in Simulated Humanoid Football

1 code implementation 25 May 2021 SiQi Liu, Guy Lever, Zhe Wang, Josh Merel, S. M. Ali Eslami, Daniel Hennes, Wojciech M. Czarnecki, Yuval Tassa, Shayegan Omidshafiei, Abbas Abdolmaleki, Noah Y. Siegel, Leonard Hasenclever, Luke Marris, Saran Tunyasuvunakool, H. Francis Song, Markus Wulfmeier, Paul Muller, Tuomas Haarnoja, Brendan D. Tracey, Karl Tuyls, Thore Graepel, Nicolas Heess

In a sequence of stages, players first learn to control a fully articulated body to perform realistic, human-like movements such as running and turning; they then acquire mid-level football skills such as dribbling and shooting; finally, they develop awareness of others and play as a team, bridging the gap between low-level motor control at a timescale of milliseconds, and coordinated goal-directed behaviour as a team at the timescale of tens of seconds.

Imitation Learning · Multi-agent Reinforcement Learning +1

Dynamical Distance Learning for Semi-Supervised and Unsupervised Skill Discovery

no code implementations ICLR 2020 Kristian Hartikainen, Xinyang Geng, Tuomas Haarnoja, Sergey Levine

We show that dynamical distances can be used in a semi-supervised regime, where unsupervised interaction with the environment is used to learn the dynamical distances, while a small amount of preference supervision is used to determine the task goal, without any manually engineered reward function or goal examples.
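A minimal sketch of learning such a dynamical distance by regressing the number of environment steps between two states drawn from the same rollout; the network, the uniform pair sampling, and the squared-error loss are illustrative assumptions:

```python
import torch
import torch.nn as nn

# Hypothetical rollout: T states of dimension D from unsupervised interaction.
T, D = 100, 4
traj = torch.randn(T, D)

dist_net = nn.Sequential(nn.Linear(2 * D, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(dist_net.parameters(), lr=1e-3)

for step in range(200):
    idx = torch.randint(0, T, (64, 2))            # random state pairs
    i, j = idx.min(dim=1).values, idx.max(dim=1).values
    pairs = torch.cat([traj[i], traj[j]], dim=-1)
    target = (j - i).float().unsqueeze(-1)        # steps elapsed between states
    loss = nn.functional.mse_loss(dist_net(pairs), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
```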

reinforcement-learning · Reinforcement Learning (RL)

Learning to Walk via Deep Reinforcement Learning

no code implementations 26 Dec 2018 Tuomas Haarnoja, Sehoon Ha, Aurick Zhou, Jie Tan, George Tucker, Sergey Levine

In this paper, we propose a sample-efficient deep RL algorithm based on maximum entropy RL that requires minimal per-task tuning and only a modest number of trials to learn neural network policies.
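A hedged sketch of the maximum-entropy actor objective this line of work (soft actor-critic) builds on: maximize the Q-value plus policy entropy weighted by a temperature alpha. The networks and shapes are illustrative, and the tanh squashing correction used for bounded actions is omitted for brevity:

```python
import torch
import torch.nn as nn

alpha = 0.2                              # entropy temperature (assumed fixed)
obs = torch.randn(64, 8)

policy = nn.Linear(8, 2 * 2)             # mean and log-std for 2-D actions
q_net = nn.Sequential(nn.Linear(8 + 2, 64), nn.ReLU(), nn.Linear(64, 1))

mean, log_std = policy(obs).chunk(2, dim=-1)
dist = torch.distributions.Normal(mean, log_std.exp())
action = dist.rsample()                  # reparameterized sample keeps gradients
log_prob = dist.log_prob(action).sum(-1, keepdim=True)

# Minimizing alpha * log_prob - Q is maximizing E[Q] + alpha * entropy.
actor_loss = (alpha * log_prob - q_net(torch.cat([obs, action], -1))).mean()
actor_loss.backward()
```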

reinforcement-learning · Reinforcement Learning (RL)

Latent Space Policies for Hierarchical Reinforcement Learning

no code implementations ICML 2018 Tuomas Haarnoja, Kristian Hartikainen, Pieter Abbeel, Sergey Levine

In contrast to methods that explicitly restrict or cripple lower layers of a hierarchy to force them to use higher-level modulating signals, each layer in our framework is trained to directly solve the task, but acquires a range of diverse strategies via a maximum entropy reinforcement learning objective.
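A much-simplified sketch of the layering: the lower layer maps (state, latent) to an action, and a higher layer acts simply by choosing that latent. In the paper the layers are invertible flows trained with the maximum entropy objective; the plain feed-forward networks here are purely illustrative:

```python
import torch
import torch.nn as nn

low_layer = nn.Sequential(nn.Linear(8 + 2, 64), nn.ReLU(), nn.Linear(64, 2))
high_layer = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 2))

def act(state):
    z = high_layer(state)                        # higher level's "action" is a latent
    return low_layer(torch.cat([state, z], -1))  # lower level resolves it to motor output

print(act(torch.randn(1, 8)).shape)
```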

Hierarchical Reinforcement Learning · reinforcement-learning +1

Composable Deep Reinforcement Learning for Robotic Manipulation

1 code implementation 19 Mar 2018 Tuomas Haarnoja, Vitchyr Pong, Aurick Zhou, Murtaza Dalal, Pieter Abbeel, Sergey Levine

Second, we show that policies learned with soft Q-learning can be composed to create new policies, and that the optimality of the resulting policy can be bounded in terms of the divergence between the composed policies.
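A toy sketch of that composition: given soft Q-functions Q1 and Q2 for two tasks, a policy for the combined task can be formed from their average, pi_C(a|s) ∝ exp((Q1 + Q2) / (2 * alpha)). The discretized 1-D action space and the quadratic Q-values below are stand-ins for learned critics:

```python
import numpy as np

alpha = 1.0
actions = np.linspace(-1.0, 1.0, 101)

q1 = -(actions - 0.5) ** 2        # toy Q-values favoring a = 0.5
q2 = -(actions + 0.2) ** 2        # toy Q-values favoring a = -0.2

logits = 0.5 * (q1 + q2) / alpha  # averaged soft Q defines the composed policy
pi = np.exp(logits - logits.max())
pi /= pi.sum()
print(actions[pi.argmax()])       # the composed policy prefers a compromise action
```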

Q-Learning · reinforcement-learning +1

Reinforcement Learning with Deep Energy-Based Policies

3 code implementations ICML 2017 Tuomas Haarnoja, Haoran Tang, Pieter Abbeel, Sergey Levine

We propose a method for learning expressive energy-based policies for continuous states and actions, which has previously been feasible only in tabular domains.
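A small sketch of what an energy-based policy means operationally, pi(a|s) ∝ exp(Q(s, a) / alpha), with the soft value V(s) = alpha * log E_a[exp(Q(s, a) / alpha)] estimated by sampling; the quadratic Q below is a stand-in for a learned critic:

```python
import numpy as np

alpha = 0.5

def q_fn(a):                        # toy stand-in for a learned Q(s, a)
    return -(a ** 2)

a_samples = np.random.uniform(-1.0, 1.0, size=1000)
scaled_q = q_fn(a_samples) / alpha

soft_value = alpha * np.log(np.mean(np.exp(scaled_q)))

weights = np.exp(scaled_q - scaled_q.max())
probs = weights / weights.sum()
action = np.random.choice(a_samples, p=probs)   # approximate draw from pi(a|s)
print(soft_value, action)
```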

Q-Learning · reinforcement-learning +1

Backprop KF: Learning Discriminative Deterministic State Estimators

1 code implementation NeurIPS 2016 Tuomas Haarnoja, Anurag Ajay, Sergey Levine, Pieter Abbeel

We show that this procedure can be used to train state estimators that use complex input, such as raw camera images, which must be processed using expressive nonlinear function approximators such as convolutional neural networks.
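A hedged sketch of the core mechanism: a Kalman filter update written in an autodiff framework, so gradients flow from the state estimate back into whatever network (e.g. a CNN on raw images) produced the observation. All matrices below are toy placeholders, not the paper's model:

```python
import torch

def kf_update(mu, sigma, z, A, C, Q, R):
    # Predict: propagate mean and covariance through the linear dynamics.
    mu_pred = A @ mu
    sigma_pred = A @ sigma @ A.T + Q
    # Correct: fold in the (learned, differentiable) observation z.
    S = C @ sigma_pred @ C.T + R
    K = sigma_pred @ C.T @ torch.linalg.inv(S)    # Kalman gain
    mu_new = mu_pred + K @ (z - C @ mu_pred)
    sigma_new = (torch.eye(mu.shape[0]) - K @ C) @ sigma_pred
    return mu_new, sigma_new

# z would come from a CNN in the paper; a differentiable placeholder here.
z = torch.randn(2, requires_grad=True)
mu, sigma = torch.zeros(4), torch.eye(4)
A, C = torch.eye(4), torch.randn(2, 4)
Q, R = 0.01 * torch.eye(4), 0.1 * torch.eye(2)

mu_new, _ = kf_update(mu, sigma, z, A, C, Q, R)
mu_new.sum().backward()            # gradients reach z, hence the upstream network
print(z.grad)
```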

Autonomous Vehicles · Visual Odometry
