Search Results for author: Markus Wulfmeier

Found 25 papers, 3 papers with code

Wish you were here: Hindsight Goal Selection for long-horizon dexterous manipulation

no code implementations • 1 Dec 2021 • Todor Davchev, Oleg Sushkov, Jean-Baptiste Regli, Stefan Schaal, Yusuf Aytar, Markus Wulfmeier, Jon Scholz

In this work, we extend hindsight relabelling mechanisms to guide exploration along task-specific distributions implied by a small set of successful demonstrations.

Continuous Control
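The relabelling idea above can be sketched in a few lines. This is a minimal toy, not the paper's method: the "task-specific distribution" is approximated by pooling future achieved states with demonstration goal states, and the state-matching reward and helper names are illustrative assumptions.

```python
import random

random.seed(0)

def relabel_trajectory(transitions, demo_goals, k=1):
    """Hindsight relabelling sketch: replace each transition's original goal
    with one drawn from states achieved later in the episode or from a pool
    of demonstration goals (a stand-in for the task-specific goal
    distribution derived from successful demonstrations)."""
    relabelled = []
    for i, (state, action, _goal) in enumerate(transitions):
        future_states = [s for s, _, _ in transitions[i:]]
        pool = future_states + demo_goals
        for _ in range(k):
            new_goal = random.choice(pool)
            # Sparse goal-conditioned reward: did we reach the new goal?
            reward = 1.0 if state == new_goal else 0.0
            relabelled.append((state, action, new_goal, reward))
    return relabelled

# Toy episode over discrete states with one demonstrated goal state.
episode = [("s0", "a0", "g"), ("s1", "a1", "g"), ("s2", "a2", "g")]
out = relabel_trajectory(episode, demo_goals=["s2"], k=1)
```

Relabelled transitions with reward 1 give the agent learning signal even when the original goal was never reached.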

Is Curiosity All You Need? On the Utility of Emergent Behaviours from Curious Exploration

no code implementations • 17 Sep 2021 • Oliver Groth, Markus Wulfmeier, Giulia Vezzani, Vibhavari Dasagi, Tim Hertweck, Roland Hafner, Nicolas Heess, Martin Riedmiller

Curiosity-based reward schemes can present powerful exploration mechanisms which facilitate the discovery of solutions for complex, sparse or long-horizon tasks.
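A common curiosity-based scheme pays out a forward model's prediction error as intrinsic reward. The linear model, toy transition, and learning rate below are illustrative assumptions, not this paper's setup:

```python
import numpy as np

# Curiosity as prediction error: a small forward model f(s, a) -> s' is
# trained online, and its squared error is paid out as an intrinsic reward.
W = np.zeros((2, 2))                  # model weights: next state from [s, a]

def curiosity_bonus(s, a, s_next, lr=0.1):
    global W
    x = np.array([s, a])
    err = s_next - W @ x
    bonus = float(err @ err)          # intrinsic reward = prediction error
    W += lr * np.outer(err, x)        # learning shrinks future bonuses
    return bonus

# Revisiting the same transition makes it progressively less "curious".
s, a, s_next = 1.0, 0.5, np.array([1.2, 0.3])
bonuses = [curiosity_bonus(s, a, s_next) for _ in range(20)]
```

The decaying bonus is what drives the agent toward transitions it cannot yet predict, which is the exploration mechanism the paper studies.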

From Motor Control to Team Play in Simulated Humanoid Football

1 code implementation • 25 May 2021 • SiQi Liu, Guy Lever, Zhe Wang, Josh Merel, S. M. Ali Eslami, Daniel Hennes, Wojciech M. Czarnecki, Yuval Tassa, Shayegan Omidshafiei, Abbas Abdolmaleki, Noah Y. Siegel, Leonard Hasenclever, Luke Marris, Saran Tunyasuvunakool, H. Francis Song, Markus Wulfmeier, Paul Muller, Tuomas Haarnoja, Brendan D. Tracey, Karl Tuyls, Thore Graepel, Nicolas Heess

In a sequence of stages, players first learn to control a fully articulated body to perform realistic, human-like movements such as running and turning; they then acquire mid-level football skills such as dribbling and shooting; finally, they develop awareness of others and play as a team, bridging the gap between low-level motor control at a timescale of milliseconds, and coordinated goal-directed behaviour as a team at the timescale of tens of seconds.

Decision Making • Imitation Learning +2

Representation Matters: Improving Perception and Exploration for Robotics

no code implementations • 3 Nov 2020 • Markus Wulfmeier, Arunkumar Byravan, Tim Hertweck, Irina Higgins, Ankush Gupta, Tejas Kulkarni, Malcolm Reynolds, Denis Teplyashin, Roland Hafner, Thomas Lampe, Martin Riedmiller

Furthermore, the value of each representation is evaluated in terms of three properties: dimensionality, observability and disentanglement.

Simple Sensor Intentions for Exploration

no code implementations • 15 May 2020 • Tim Hertweck, Martin Riedmiller, Michael Bloesch, Jost Tobias Springenberg, Noah Siegel, Markus Wulfmeier, Roland Hafner, Nicolas Heess

In particular, we show that a real robotic arm can learn to grasp and lift and solve a Ball-in-a-Cup task from scratch, when only raw sensor streams are used for both controller input and in the auxiliary reward definition.

Continuous-Discrete Reinforcement Learning for Hybrid Control in Robotics

no code implementations • 2 Jan 2020 • Michael Neunert, Abbas Abdolmaleki, Markus Wulfmeier, Thomas Lampe, Jost Tobias Springenberg, Roland Hafner, Francesco Romano, Jonas Buchli, Nicolas Heess, Martin Riedmiller

In contrast, we propose to treat hybrid problems in their 'native' form by solving them with hybrid reinforcement learning, which optimizes for discrete and continuous actions simultaneously.
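The core of a hybrid policy is that one joint action carries both a discrete choice and continuous parameters. A minimal sketch of such a factored policy follows; the categorical-times-Gaussian factorization, shapes, and names are illustrative assumptions rather than the paper's exact parameterization:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_hybrid_action(logits, mean, std):
    """Sample one joint action from a factored hybrid policy: a categorical
    over discrete modes times a Gaussian over continuous parameters."""
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                         # softmax over discrete modes
    mode = int(rng.choice(len(probs), p=probs))  # e.g. which gripper mode
    params = rng.normal(mean, std)               # e.g. joint torques
    return mode, params

mode, params = sample_hybrid_action(
    logits=np.array([0.1, 2.0, -1.0]),          # 3 discrete modes
    mean=np.zeros(2), std=0.3 * np.ones(2),     # 2 continuous dimensions
)
```

Optimizing both heads jointly avoids the lossy discretization or relaxation that treating the problem in only one action space would require.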

Disentangled Cumulants Help Successor Representations Transfer to New Tasks

no code implementations • 25 Nov 2019 • Christopher Grimm, Irina Higgins, Andre Barreto, Denis Teplyashin, Markus Wulfmeier, Tim Hertweck, Raia Hadsell, Satinder Singh

This is in contrast to the state-of-the-art reinforcement learning agents, which typically start learning each new task from scratch and struggle with knowledge transfer.

Transfer Learning
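The transfer mechanism behind successor representations can be shown in a toy form: successor features summarize expected discounted future cumulants, so a new task reduces to a new weight vector and old policies can be re-evaluated without new learning. The numbers and the generalized-policy-improvement helper below are illustrative assumptions:

```python
import numpy as np

def task_values(psi, w):
    """psi: (num_policies, num_cumulants) successor features per policy;
    w: (num_cumulants,) reward weights defining the new task.
    Task value of each old policy is simply psi @ w."""
    return psi @ w

def generalized_policy_improvement(psi, w):
    # Pick the old policy whose successor features score best on the new task.
    return int(np.argmax(task_values(psi, w)))

psi = np.array([[1.0, 0.0],    # policy 0 mostly collects cumulant 0
                [0.0, 1.0]])   # policy 1 mostly collects cumulant 1
w_new = np.array([0.2, 0.9])   # new task mainly rewards cumulant 1
best = generalized_policy_improvement(psi, w_new)
```

This is why disentangled cumulants help: when each cumulant tracks one independent factor, new task rewards are well approximated by a linear combination of them.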

Attention-Privileged Reinforcement Learning

no code implementations • 19 Nov 2019 • Sasha Salter, Dushyant Rao, Markus Wulfmeier, Raia Hadsell, Ingmar Posner

Image-based Reinforcement Learning is known to suffer from poor sample efficiency and generalisation to unseen visuals such as distractors (task-independent aspects of the observation space).

Attention Privileged Reinforcement Learning for Domain Transfer

no code implementations • 25 Sep 2019 • Sasha Salter, Dushyant Rao, Markus Wulfmeier, Raia Hadsell, Ingmar Posner

Applying reinforcement learning (RL) to physical systems presents notable challenges, given requirements regarding sample efficiency, safety, and physical constraints compared to simulated environments.

Guiding Physical Intuition with Neural Stethoscopes

no code implementations • ICLR 2019 • Fabian Fuchs, Oliver Groth, Adam Kosiorek, Alex Bewley, Markus Wulfmeier, Andrea Vedaldi, Ingmar Posner

Using an adversarial stethoscope, the network is successfully de-biased, leading to a performance increase from 66% to 88%.

Efficient Supervision for Robot Learning via Imitation, Simulation, and Adaptation

no code implementations • 15 Apr 2019 • Markus Wulfmeier

Recent successes in machine learning have led to a shift in the design of autonomous systems, improving performance on existing tasks and rendering new applications possible.

Domain Adaptation • Imitation Learning

On Machine Learning and Structure for Mobile Robots

no code implementations • 15 Jun 2018 • Markus Wulfmeier

Due to recent advances - compute, data, models - the role of learning in autonomous systems has expanded significantly, rendering new applications possible for the first time.

Scrutinizing and De-Biasing Intuitive Physics with Neural Stethoscopes

no code implementations • 14 Jun 2018 • Fabian B. Fuchs, Oliver Groth, Adam R. Kosiorek, Alex Bewley, Markus Wulfmeier, Andrea Vedaldi, Ingmar Posner

Conversely, training on an easy dataset where visual cues are positively correlated with stability, the baseline model learns a bias leading to poor performance on a harder dataset.

TACO: Learning Task Decomposition via Temporal Alignment for Control

1 code implementation • ICML 2018 • Kyriacos Shiarlis, Markus Wulfmeier, Sasha Salter, Shimon Whiteson, Ingmar Posner

Many advanced Learning from Demonstration (LfD) methods consider the decomposition of complex, real-world tasks into simpler sub-tasks.

Incremental Adversarial Domain Adaptation for Continually Changing Environments

no code implementations • 20 Dec 2017 • Markus Wulfmeier, Alex Bewley, Ingmar Posner

Continuous appearance shifts such as changes in weather and lighting conditions can impact the performance of deployed machine learning models.

Unsupervised Domain Adaptation

Mutual Alignment Transfer Learning

no code implementations • 25 Jul 2017 • Markus Wulfmeier, Ingmar Posner, Pieter Abbeel

Training robots for operation in the real world is a complex, time-consuming and potentially expensive task.

Fine-tuning • Transfer Learning

Reverse Curriculum Generation for Reinforcement Learning

no code implementations • 17 Jul 2017 • Carlos Florensa, David Held, Markus Wulfmeier, Michael Zhang, Pieter Abbeel

The robot is trained in reverse, gradually learning to reach the goal from a set of start states increasingly far from the goal.
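The backwards-expansion loop described above can be sketched on a 1-D toy chain. This is a minimal illustration, not the paper's algorithm: a simple radius check stands in for actually rolling out the current policy, and all names and parameters are assumptions:

```python
import random

random.seed(0)

def reverse_curriculum(goal, num_stages=5, step=1, samples_per_start=3):
    """Grow the start-state set backwards from the goal on a 1-D toy chain.
    Each stage perturbs known-good starts away from the goal and keeps the
    candidates from which the current policy still succeeds; here the
    success test is a fixed radius around the goal."""
    starts = {goal}
    for _ in range(num_stages):
        candidates = {s + random.choice([-step, step])
                      for s in starts for _ in range(samples_per_start)}
        reachable = {s for s in candidates if abs(s - goal) <= 10}
        starts |= reachable          # the curriculum expands outward
    return sorted(starts)

starts = reverse_curriculum(goal=0)
```

Because every new start is adjacent to one the policy already solves, each stage poses a problem only slightly harder than the last.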

Addressing Appearance Change in Outdoor Robotics with Adversarial Domain Adaptation

no code implementations • 4 Mar 2017 • Markus Wulfmeier, Alex Bewley, Ingmar Posner

Appearance changes due to weather and seasonal conditions represent a strong impediment to the robust implementation of machine learning systems in outdoor robotics.

Autonomous Driving • Motion Planning +1

Incorporating Human Domain Knowledge into Large Scale Cost Function Learning

no code implementations • 13 Dec 2016 • Markus Wulfmeier, Dushyant Rao, Ingmar Posner

Recent advances have shown the capability of Fully Convolutional Neural Networks (FCN) to model cost functions for motion planning in the context of learning driving preferences purely based on demonstration data from human drivers.

Motion Planning

Watch This: Scalable Cost-Function Learning for Path Planning in Urban Environments

no code implementations • 8 Jul 2016 • Markus Wulfmeier, Dominic Zeng Wang, Ingmar Posner

In this work, we present an approach to learn cost maps for driving in complex urban environments from a very large number of demonstrations of driving behaviour by human experts.

Maximum Entropy Deep Inverse Reinforcement Learning

1 code implementation • 17 Jul 2015 • Markus Wulfmeier, Peter Ondruska, Ingmar Posner

This paper presents a general framework for exploiting the representational capacity of neural networks to approximate complex, nonlinear reward functions in the context of solving the inverse reinforcement learning (IRL) problem.
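The classic maximum-entropy IRL gradient, the difference between demonstrated and expected state-visitation counts, is the quantity such a framework backpropagates into the reward network. A tabular toy on a 3-state chain, with a linear reward standing in for the network and an assumed soft policy model, can illustrate it:

```python
import numpy as np

def state_visitation(reward, n_states=3, horizon=5):
    """Expected state-visitation counts on a toy 3-state chain under a
    soft (maximum-entropy style) policy induced by the reward vector."""
    d = np.ones(n_states) / n_states          # uniform start distribution
    total = np.zeros(n_states)
    for _ in range(horizon):
        total += d
        new_d = np.zeros(n_states)
        for s in range(n_states):
            nbrs = [max(s - 1, 0), s, min(s + 1, n_states - 1)]
            p = np.exp(reward[nbrs] - np.max(reward[nbrs]))
            p /= p.sum()                      # softmax move preference
            for nb, pi in zip(nbrs, p):
                new_d[nb] += d[s] * pi
        d = new_d
    return total

def maxent_irl(demo_counts, lr=0.1, iters=300):
    theta = np.zeros(3)                       # linear stand-in for a reward net
    for _ in range(iters):
        expected = state_visitation(theta)
        theta += lr * (demo_counts - expected)  # maxent gradient: demo - expected
    return theta

# 'Expert' demonstrations generated from a reward that favours state 2.
demo_counts = state_visitation(np.array([0.0, 0.0, 2.0]))
theta = maxent_irl(demo_counts)
```

Replacing the linear `theta` with a deep network turns this gradient into the per-state error signal that trains the reward model, which is the framework's central idea.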
