Search Results for author: Jost Tobias Springenberg

Found 38 papers, 17 papers with code

Collect & Infer -- a fresh look at data-efficient Reinforcement Learning

no code implementations 23 Aug 2021 Martin Riedmiller, Jost Tobias Springenberg, Roland Hafner, Nicolas Heess

This position paper proposes a fresh look at Reinforcement Learning (RL) from the perspective of data-efficiency.

Training Generative Adversarial Networks by Solving Ordinary Differential Equations

1 code implementation NeurIPS 2020 Chongli Qin, Yan Wu, Jost Tobias Springenberg, Andrew Brock, Jeff Donahue, Timothy P. Lillicrap, Pushmeet Kohli

From this perspective, we hypothesise that instabilities in training GANs arise from the integration error in discretising the continuous dynamics.

Learning Dexterous Manipulation from Suboptimal Experts

no code implementations 16 Oct 2020 Rae Jeong, Jost Tobias Springenberg, Jackie Kay, Daniel Zheng, Yuxiang Zhou, Alexandre Galashov, Nicolas Heess, Francesco Nori

Although in many cases the learning process could be guided by demonstrations or other suboptimal experts, current RL algorithms for continuous action spaces often fail to effectively utilize combinations of highly off-policy expert data and on-policy exploration data.

Offline RL Q-Learning

Local Search for Policy Iteration in Continuous Control

no code implementations 12 Oct 2020 Jost Tobias Springenberg, Nicolas Heess, Daniel Mankowitz, Josh Merel, Arunkumar Byravan, Abbas Abdolmaleki, Jackie Kay, Jonas Degrave, Julian Schrittwieser, Yuval Tassa, Jonas Buchli, Dan Belov, Martin Riedmiller

We demonstrate that additional computation spent on model-based policy improvement during learning can improve data efficiency, and confirm that model-based policy improvement during action selection can also be beneficial.

Continuous Control

Critic Regularized Regression

no code implementations NeurIPS 2020 Ziyu Wang, Alexander Novikov, Konrad Zolna, Jost Tobias Springenberg, Scott Reed, Bobak Shahriari, Noah Siegel, Josh Merel, Caglar Gulcehre, Nicolas Heess, Nando de Freitas

Offline reinforcement learning (RL), also known as batch RL, offers the prospect of policy optimization from large pre-recorded datasets without online environment interaction.

Offline RL

Simple Sensor Intentions for Exploration

no code implementations 15 May 2020 Tim Hertweck, Martin Riedmiller, Michael Bloesch, Jost Tobias Springenberg, Noah Siegel, Markus Wulfmeier, Roland Hafner, Nicolas Heess

In particular, we show that a real robotic arm can learn to grasp and lift and solve a Ball-in-a-Cup task from scratch, when only raw sensor streams are used for both controller input and in the auxiliary reward definition.

Continuous-Discrete Reinforcement Learning for Hybrid Control in Robotics

no code implementations 2 Jan 2020 Michael Neunert, Abbas Abdolmaleki, Markus Wulfmeier, Thomas Lampe, Jost Tobias Springenberg, Roland Hafner, Francesco Romano, Jonas Buchli, Nicolas Heess, Martin Riedmiller

In contrast, we propose to treat hybrid problems in their 'native' form by solving them with hybrid reinforcement learning, which optimizes for discrete and continuous actions simultaneously.

Quinoa: a Q-function You Infer Normalized Over Actions

no code implementations 5 Nov 2019 Jonas Degrave, Abbas Abdolmaleki, Jost Tobias Springenberg, Nicolas Heess, Martin Riedmiller

We present an algorithm for learning an approximate action-value soft Q-function in the relative entropy regularised reinforcement learning setting, for which an optimal improved policy can be recovered in closed form.

Normalising Flows

Robust Reinforcement Learning for Continuous Control with Model Misspecification

no code implementations ICLR 2020 Daniel J. Mankowitz, Nir Levine, Rae Jeong, Yuanyuan Shi, Jackie Kay, Abbas Abdolmaleki, Jost Tobias Springenberg, Timothy Mann, Todd Hester, Martin Riedmiller

We provide a framework for incorporating robustness -- to perturbations in the transition dynamics which we refer to as model misspecification -- into continuous control Reinforcement Learning (RL) algorithms.

Continuous Control

Towards Automatically-Tuned Deep Neural Networks

2 code implementations 18 May 2019 Hector Mendoza, Aaron Klein, Matthias Feurer, Jost Tobias Springenberg, Matthias Urban, Michael Burkart, Maximilian Dippel, Marius Lindauer, Frank Hutter

Recent advances in AutoML have led to automated tools that can compete with machine learning experts on supervised learning tasks.


Relative Entropy Regularized Policy Iteration

1 code implementation 5 Dec 2018 Abbas Abdolmaleki, Jost Tobias Springenberg, Jonas Degrave, Steven Bohez, Yuval Tassa, Dan Belov, Nicolas Heess, Martin Riedmiller

Our algorithm draws on connections to existing literature on black-box optimization and 'RL as an inference' and it can be seen either as an extension of the Maximum a Posteriori Policy Optimisation algorithm (MPO) [Abdolmaleki et al., 2018a], or as an extension of Trust Region Covariance Matrix Adaptation Evolutionary Strategy (CMA-ES) [Abdolmaleki et al., 2017b; Hansen et al., 1997] to a policy iteration scheme.

Continuous Control OpenAI Gym

Maximum a Posteriori Policy Optimisation

1 code implementation ICLR 2018 Abbas Abdolmaleki, Jost Tobias Springenberg, Yuval Tassa, Remi Munos, Nicolas Heess, Martin Riedmiller

We introduce a new algorithm for reinforcement learning called Maximum a Posteriori Policy Optimisation (MPO) based on coordinate ascent on a relative entropy objective.

Continuous Control

Graph networks as learnable physics engines for inference and control

1 code implementation ICML 2018 Alvaro Sanchez-Gonzalez, Nicolas Heess, Jost Tobias Springenberg, Josh Merel, Martin Riedmiller, Raia Hadsell, Peter Battaglia

Understanding and interacting with everyday physical scenes requires rich knowledge about the structure of the world, represented either implicitly in a value or policy function, or explicitly in a transition model.

Learning an Embedding Space for Transferable Robot Skills

no code implementations ICLR 2018 Karol Hausman, Jost Tobias Springenberg, Ziyu Wang, Nicolas Heess, Martin Riedmiller

We present a method for reinforcement learning of closely related skills that are parameterized via a skill embedding space.

Variational Inference

Deep learning with convolutional neural networks for EEG decoding and visualization

5 code implementations 15 Mar 2017 Robin Tibor Schirrmeister, Jost Tobias Springenberg, Lukas Dominique Josef Fiederer, Martin Glasstetter, Katharina Eggensperger, Michael Tangermann, Frank Hutter, Wolfram Burgard, Tonio Ball

PLEASE READ AND CITE THE REVISED VERSION at Human Brain Mapping: http://onlinelibrary.wiley.com/doi/10.1002/hbm.23730/full. Code available here: https://github.com/robintibor/braindecode

EEG EEG Decoding

Deep Reinforcement Learning with Successor Features for Navigation across Similar Environments

no code implementations 16 Dec 2016 Jingwei Zhang, Jost Tobias Springenberg, Joschka Boedecker, Wolfram Burgard

We propose a successor feature based deep reinforcement learning algorithm that can learn to transfer knowledge from previously mastered navigation tasks to new problem instances.

Robot Navigation

Asynchronous Stochastic Gradient MCMC with Elastic Coupling

no code implementations 2 Dec 2016 Jost Tobias Springenberg, Aaron Klein, Stefan Falkner, Frank Hutter

We consider parallel asynchronous Markov Chain Monte Carlo (MCMC) sampling for problems where we can leverage (stochastic) gradients to define continuous dynamics which explore the target distribution.

Bayesian Optimization with Robust Bayesian Neural Networks

1 code implementation NeurIPS 2016 Jost Tobias Springenberg, Aaron Klein, Stefan Falkner, Frank Hutter

Bayesian optimization is a prominent method for optimizing expensive-to-evaluate black-box functions and is widely applied to tuning the hyperparameters of machine learning algorithms.

Hyperparameter Optimization

Unsupervised and Semi-supervised Learning with Categorical Generative Adversarial Networks

4 code implementations 19 Nov 2015 Jost Tobias Springenberg

Our approach is based on an objective function that trades off mutual information between observed examples and their predicted categorical class distribution against robustness of the classifier to an adversarial generative model.

General Classification Robust Classification

Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images

1 code implementation NeurIPS 2015 Manuel Watter, Jost Tobias Springenberg, Joschka Boedecker, Martin Riedmiller

We introduce Embed to Control (E2C), a method for model learning and control of non-linear dynamical systems from raw pixel images.

Learning to Generate Chairs With Convolutional Neural Networks

1 code implementation CVPR 2015 Alexey Dosovitskiy, Jost Tobias Springenberg, Thomas Brox

We train a generative convolutional neural network which is able to generate images of objects given object type, viewpoint, and color.

Striving for Simplicity: The All Convolutional Net

34 code implementations 21 Dec 2014 Jost Tobias Springenberg, Alexey Dosovitskiy, Thomas Brox, Martin Riedmiller

Most modern convolutional neural networks (CNNs) used for object recognition are built using the same principles: alternating convolution and max-pooling layers followed by a small number of fully connected layers.

Image Classification Object Recognition

Learning to Generate Chairs, Tables and Cars with Convolutional Networks

1 code implementation 21 Nov 2014 Alexey Dosovitskiy, Jost Tobias Springenberg, Maxim Tatarchenko, Thomas Brox

We train generative 'up-convolutional' neural networks which are able to generate images of objects given object style, viewpoint, and color.

Discriminative Unsupervised Feature Learning with Exemplar Convolutional Neural Networks

1 code implementation 26 Jun 2014 Alexey Dosovitskiy, Philipp Fischer, Jost Tobias Springenberg, Martin Riedmiller, Thomas Brox

While such generic features cannot compete with class-specific features from supervised training on a classification task, we show that they are advantageous on geometric matching problems, where they also outperform the SIFT descriptor.

General Classification Geometric Matching

Unsupervised feature learning by augmenting single images

no code implementations 18 Dec 2013 Alexey Dosovitskiy, Jost Tobias Springenberg, Thomas Brox

We then extend these trivial one-element classes by applying a variety of transformations to the initial 'seed' patches.

Data Augmentation Object Recognition
