Search Results for author: Markus Wulfmeier

Found 43 papers, 5 papers with code

Imitating Language via Scalable Inverse Reinforcement Learning

no code implementations2 Sep 2024 Markus Wulfmeier, Michael Bloesch, Nino Vieillard, Arun Ahuja, Jorg Bornschein, Sandy Huang, Artem Sokolov, Matt Barnes, Guillaume Desjardins, Alex Bewley, Sarah Maria Elisabeth Bechtle, Jost Tobias Springenberg, Nikola Momchev, Olivier Bachem, Matthieu Geist, Martin Riedmiller

We focus on investigating the inverse reinforcement learning (IRL) perspective to imitation, extracting rewards and directly optimizing sequences instead of individual token likelihoods and evaluate its benefits for fine-tuning large language models.

Diversity Imitation Learning +3

Growing Q-Networks: Solving Continuous Control Tasks with Adaptive Control Resolution

no code implementations5 Apr 2024 Tim Seyde, Peter Werner, Wilko Schwarting, Markus Wulfmeier, Daniela Rus

Recent reinforcement learning approaches have shown surprisingly strong capabilities of bang-bang policies for solving continuous control benchmarks.

Continuous Control Q-Learning

Foundations for Transfer in Reinforcement Learning: A Taxonomy of Knowledge Modalities

no code implementations4 Dec 2023 Markus Wulfmeier, Arunkumar Byravan, Sarah Bechtle, Karol Hausman, Nicolas Heess

Contemporary artificial intelligence systems exhibit rapidly growing abilities accompanied by the growth of required resources, expansive datasets and corresponding investments into computing infrastructure.

Computational Efficiency reinforcement-learning +1

Equivariant Data Augmentation for Generalization in Offline Reinforcement Learning

no code implementations14 Sep 2023 Cristina Pinneri, Sarah Bechtle, Markus Wulfmeier, Arunkumar Byravan, Jingwei Zhang, William F. Whitney, Martin Riedmiller

We present a novel approach to address the challenge of generalization in offline reinforcement learning (RL), where the agent learns from a fixed dataset without any additional interaction with the environment.

Data Augmentation Offline RL +2

Towards A Unified Agent with Foundation Models

no code implementations18 Jul 2023 Norman Di Palo, Arunkumar Byravan, Leonard Hasenclever, Markus Wulfmeier, Nicolas Heess, Martin Riedmiller

Language Models and Vision Language Models have recently demonstrated unprecedented capabilities in terms of understanding human intentions, reasoning, scene understanding, and planning-like behaviour, in text form, among many others.

Efficient Exploration Reinforcement Learning (RL) +2

Massively Scalable Inverse Reinforcement Learning in Google Maps

1 code implementation18 May 2023 Matt Barnes, Matthew Abueg, Oliver F. Lange, Matt Deeds, Jason Trader, Denali Molitor, Markus Wulfmeier, Shawn O'Banion

Inverse reinforcement learning (IRL) offers a powerful and general framework for learning humans' latent preferences in route recommendation, yet no approach has successfully addressed planetary-scale problems with hundreds of millions of states and demonstration trajectories.

reinforcement-learning

Solving Continuous Control via Q-learning

1 code implementation22 Oct 2022 Tim Seyde, Peter Werner, Wilko Schwarting, Igor Gilitschenski, Martin Riedmiller, Daniela Rus, Markus Wulfmeier

While there has been substantial success for solving continuous control with actor-critic methods, simpler critic-only methods such as Q-learning find limited application in the associated high-dimensional action spaces.

Continuous Control Multi-agent Reinforcement Learning +1

MO2: Model-Based Offline Options

no code implementations5 Sep 2022 Sasha Salter, Markus Wulfmeier, Dhruva Tirumala, Nicolas Heess, Martin Riedmiller, Raia Hadsell, Dushyant Rao

The ability to discover useful behaviours from past experience and transfer them to new tasks is considered a core component of natural embodied intelligence.

Continuous Control

Forgetting and Imbalance in Robot Lifelong Learning with Off-policy Data

no code implementations12 Apr 2022 Wenxuan Zhou, Steven Bohez, Jan Humplik, Abbas Abdolmaleki, Dushyant Rao, Markus Wulfmeier, Tuomas Haarnoja, Nicolas Heess

We propose the Offline Distillation Pipeline to break this trade-off by separating the training procedure into an online interaction phase and an offline distillation phase. Second, we find that training with the imbalanced off-policy data from multiple environments across the lifetime creates a significant performance drop.

Reinforcement Learning (RL)

The Challenges of Exploration for Offline Reinforcement Learning

no code implementations27 Jan 2022 Nathan Lambert, Markus Wulfmeier, William Whitney, Arunkumar Byravan, Michael Bloesch, Vibhavari Dasagi, Tim Hertweck, Martin Riedmiller

Offline Reinforcement Learning (ORL) enablesus to separately study the two interlinked processes of reinforcement learning: collecting informative experience and inferring optimal behaviour.

Model Predictive Control Offline RL +2

Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies

no code implementations ICLR 2022 Dushyant Rao, Fereshteh Sadeghi, Leonard Hasenclever, Markus Wulfmeier, Martina Zambelli, Giulia Vezzani, Dhruva Tirumala, Yusuf Aytar, Josh Merel, Nicolas Heess, Raia Hadsell

We demonstrate in manipulation domains that the method can effectively cluster offline data into distinct, executable behaviours, while retaining the flexibility of a continuous latent variable model.

Wish you were here: Hindsight Goal Selection for long-horizon dexterous manipulation

no code implementations ICLR 2022 Todor Davchev, Oleg Sushkov, Jean-Baptiste Regli, Stefan Schaal, Yusuf Aytar, Markus Wulfmeier, Jon Scholz

In this work, we extend hindsight relabelling mechanisms to guide exploration along task-specific distributions implied by a small set of successful demonstrations.

Continuous Control Reinforcement Learning (RL)

Is Curiosity All You Need? On the Utility of Emergent Behaviours from Curious Exploration

no code implementations17 Sep 2021 Oliver Groth, Markus Wulfmeier, Giulia Vezzani, Vibhavari Dasagi, Tim Hertweck, Roland Hafner, Nicolas Heess, Martin Riedmiller

Curiosity-based reward schemes can present powerful exploration mechanisms which facilitate the discovery of solutions for complex, sparse or long-horizon tasks.

From Motor Control to Team Play in Simulated Humanoid Football

1 code implementation25 May 2021 SiQi Liu, Guy Lever, Zhe Wang, Josh Merel, S. M. Ali Eslami, Daniel Hennes, Wojciech M. Czarnecki, Yuval Tassa, Shayegan Omidshafiei, Abbas Abdolmaleki, Noah Y. Siegel, Leonard Hasenclever, Luke Marris, Saran Tunyasuvunakool, H. Francis Song, Markus Wulfmeier, Paul Muller, Tuomas Haarnoja, Brendan D. Tracey, Karl Tuyls, Thore Graepel, Nicolas Heess

In a sequence of stages, players first learn to control a fully articulated body to perform realistic, human-like movements such as running and turning; they then acquire mid-level football skills such as dribbling and shooting; finally, they develop awareness of others and play as a team, bridging the gap between low-level motor control at a timescale of milliseconds, and coordinated goal-directed behaviour as a team at the timescale of tens of seconds.

Imitation Learning Multi-agent Reinforcement Learning +1

Simple Sensor Intentions for Exploration

no code implementations15 May 2020 Tim Hertweck, Martin Riedmiller, Michael Bloesch, Jost Tobias Springenberg, Noah Siegel, Markus Wulfmeier, Roland Hafner, Nicolas Heess

In particular, we show that a real robotic arm can learn to grasp and lift and solve a Ball-in-a-Cup task from scratch, when only raw sensor streams are used for both controller input and in the auxiliary reward definition.

Continuous-Discrete Reinforcement Learning for Hybrid Control in Robotics

no code implementations2 Jan 2020 Michael Neunert, Abbas Abdolmaleki, Markus Wulfmeier, Thomas Lampe, Jost Tobias Springenberg, Roland Hafner, Francesco Romano, Jonas Buchli, Nicolas Heess, Martin Riedmiller

In contrast, we propose to treat hybrid problems in their 'native' form by solving them with hybrid reinforcement learning, which optimizes for discrete and continuous actions simultaneously.

reinforcement-learning Reinforcement Learning (RL)

Disentangled Cumulants Help Successor Representations Transfer to New Tasks

no code implementations25 Nov 2019 Christopher Grimm, Irina Higgins, Andre Barreto, Denis Teplyashin, Markus Wulfmeier, Tim Hertweck, Raia Hadsell, Satinder Singh

This is in contrast to the state-of-the-art reinforcement learning agents, which typically start learning each new task from scratch and struggle with knowledge transfer.

Transfer Learning

Attention-Privileged Reinforcement Learning

no code implementations19 Nov 2019 Sasha Salter, Dushyant Rao, Markus Wulfmeier, Raia Hadsell, Ingmar Posner

Image-based Reinforcement Learning is known to suffer from poor sample efficiency and generalisation to unseen visuals such as distractors (task-independent aspects of the observation space).

reinforcement-learning Reinforcement Learning (RL)

Attention Privileged Reinforcement Learning for Domain Transfer

no code implementations25 Sep 2019 Sasha Salter, Dushyant Rao, Markus Wulfmeier, Raia Hadsell, Ingmar Posner

Applying reinforcement learning (RL) to physical systems presents notable challenges, given requirements regarding sample efficiency, safety, and physical constraints compared to simulated environments.

reinforcement-learning Reinforcement Learning (RL)

Guiding Physical Intuition with Neural Stethoscopes

no code implementations ICLR 2019 Fabian Fuchs, Oliver Groth, Adam Kosiorek, Alex Bewley, Markus Wulfmeier, Andrea Vedaldi, Ingmar Posner

Using an adversarial stethoscope, the network is successfully de-biased, leading to a performance increase from 66% to 88%.

Physical Intuition

Efficient Supervision for Robot Learning via Imitation, Simulation, and Adaptation

no code implementations15 Apr 2019 Markus Wulfmeier

Recent successes in machine learning have led to a shift in the design of autonomous systems, improving performance on existing tasks and rendering new applications possible.

BIG-bench Machine Learning Domain Adaptation +1

On Machine Learning and Structure for Mobile Robots

no code implementations15 Jun 2018 Markus Wulfmeier

Due to recent advances - compute, data, models - the role of learning in autonomous systems has expanded significantly, rendering new applications possible for the first time.

BIG-bench Machine Learning

Scrutinizing and De-Biasing Intuitive Physics with Neural Stethoscopes

no code implementations14 Jun 2018 Fabian B. Fuchs, Oliver Groth, Adam R. Kosiorek, Alex Bewley, Markus Wulfmeier, Andrea Vedaldi, Ingmar Posner

Conversely, training on an easy dataset where visual cues are positively correlated with stability, the baseline model learns a bias leading to poor performance on a harder dataset.

TACO: Learning Task Decomposition via Temporal Alignment for Control

1 code implementation ICML 2018 Kyriacos Shiarlis, Markus Wulfmeier, Sasha Salter, Shimon Whiteson, Ingmar Posner

Many advanced Learning from Demonstration (LfD) methods consider the decomposition of complex, real-world tasks into simpler sub-tasks.

Incremental Adversarial Domain Adaptation for Continually Changing Environments

no code implementations20 Dec 2017 Markus Wulfmeier, Alex Bewley, Ingmar Posner

Continuous appearance shifts such as changes in weather and lighting conditions can impact the performance of deployed machine learning models.

Generative Adversarial Network Unsupervised Domain Adaptation

Mutual Alignment Transfer Learning

no code implementations25 Jul 2017 Markus Wulfmeier, Ingmar Posner, Pieter Abbeel

Training robots for operation in the real world is a complex, time consuming and potentially expensive task.

Transfer Learning

Reverse Curriculum Generation for Reinforcement Learning

no code implementations17 Jul 2017 Carlos Florensa, David Held, Markus Wulfmeier, Michael Zhang, Pieter Abbeel

The robot is trained in reverse, gradually learning to reach the goal from a set of start states increasingly far from the goal.

reinforcement-learning Reinforcement Learning (RL)

Addressing Appearance Change in Outdoor Robotics with Adversarial Domain Adaptation

no code implementations4 Mar 2017 Markus Wulfmeier, Alex Bewley, Ingmar Posner

Appearance changes due to weather and seasonal conditions represent a strong impediment to the robust implementation of machine learning systems in outdoor robotics.

Autonomous Driving Motion Planning +1

Incorporating Human Domain Knowledge into Large Scale Cost Function Learning

no code implementations13 Dec 2016 Markus Wulfmeier, Dushyant Rao, Ingmar Posner

Recent advances have shown the capability of Fully Convolutional Neural Networks (FCN) to model cost functions for motion planning in the context of learning driving preferences purely based on demonstration data from human drivers.

Motion Planning reinforcement-learning +1

Watch This: Scalable Cost-Function Learning for Path Planning in Urban Environments

no code implementations8 Jul 2016 Markus Wulfmeier, Dominic Zeng Wang, Ingmar Posner

In this work, we present an approach to learn cost maps for driving in complex urban environments from a very large number of demonstrations of driving behaviour by human experts.

Maximum Entropy Deep Inverse Reinforcement Learning

2 code implementations17 Jul 2015 Markus Wulfmeier, Peter Ondruska, Ingmar Posner

This paper presents a general framework for exploiting the representational capacity of neural networks to approximate complex, nonlinear reward functions in the context of solving the inverse reinforcement learning (IRL) problem.

reinforcement-learning Reinforcement Learning (RL)

Cannot find the paper you are looking for? You can Submit a new open access paper.