Search Results for author: Risto Vuorio

Found 16 papers, 6 papers with code

SplAgger: Split Aggregation for Meta-Reinforcement Learning

no code implementations • 5 Mar 2024 • Jacob Beck, Matthew Jackson, Risto Vuorio, Zheng Xiong, Shimon Whiteson

However, it remains unclear whether task inference sequence models are beneficial even when task inference objectives are not.

Continuous Control, Meta Reinforcement Learning, +2

Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control

no code implementations • 9 Feb 2024 • Zheng Xiong, Risto Vuorio, Jacob Beck, Matthieu Zimmer, Kun Shao, Shimon Whiteson

Learning a universal policy across different robot morphologies can significantly improve learning efficiency and enable zero-shot generalization to unseen morphologies.

Zero-shot Generalization

Recurrent Hypernetworks are Surprisingly Strong in Meta-RL

1 code implementation • NeurIPS 2023 • Jacob Beck, Risto Vuorio, Zheng Xiong, Shimon Whiteson

While many specialized meta-RL methods have been proposed, recent work suggests that end-to-end learning in conjunction with an off-the-shelf sequential model, such as a recurrent network, is a surprisingly strong baseline.

Few-Shot Learning, Reinforcement Learning (RL)

A Survey of Meta-Reinforcement Learning

no code implementations • 19 Jan 2023 • Jacob Beck, Risto Vuorio, Evan Zheran Liu, Zheng Xiong, Luisa Zintgraf, Chelsea Finn, Shimon Whiteson

Meta-RL is most commonly studied in a problem setting where, given a distribution of tasks, the goal is to learn a policy that is capable of adapting to any new task from the task distribution with as little data as possible.

Meta Reinforcement Learning, reinforcement-learning, +1

Deconfounded Imitation Learning

no code implementations • 4 Nov 2022 • Risto Vuorio, Johann Brehmer, Hanno Ackermann, Daniel Dijkman, Taco Cohen, Pim de Haan

Standard imitation learning can fail when the expert demonstrators have different sensory inputs than the imitating agent.

Imitation Learning

Hypernetworks in Meta-Reinforcement Learning

1 code implementation • 20 Oct 2022 • Jacob Beck, Matthew Thomas Jackson, Risto Vuorio, Shimon Whiteson

In this paper, we 1) show that hypernetwork initialization is also a critical factor in meta-RL, and that naive initializations yield poor performance; 2) propose a novel hypernetwork initialization scheme that matches or exceeds the performance of a state-of-the-art approach proposed for supervised settings, as well as being simpler and more general; and 3) use this method to show that hypernetworks can improve performance in meta-RL by evaluating on multiple simulated robotics benchmarks.

Meta Reinforcement Learning, reinforcement-learning, +1

An Investigation of the Bias-Variance Tradeoff in Meta-Gradients

1 code implementation • 22 Sep 2022 • Risto Vuorio, Jacob Beck, Shimon Whiteson, Jakob Foerster, Gregory Farquhar

Meta-gradients provide a general approach for optimizing the meta-parameters of reinforcement learning (RL) algorithms.

Meta-Learning, Reinforcement Learning (RL)

On the Practical Consistency of Meta-Reinforcement Learning Algorithms

no code implementations • 1 Dec 2021 • Zheng Xiong, Luisa Zintgraf, Jacob Beck, Risto Vuorio, Shimon Whiteson

We further find that theoretically inconsistent algorithms can be made consistent by continuing to update all agent components on the OOD tasks, and adapt as well or better than originally consistent ones.

Meta-Learning, Meta Reinforcement Learning, +3

Learning State Representations from Random Deep Action-conditional Predictions

1 code implementation • NeurIPS 2021 • Zeyu Zheng, Vivek Veeriah, Risto Vuorio, Richard Lewis, Satinder Singh

Our main contribution in this work is an empirical finding that random General Value Functions (GVFs), i.e., deep action-conditional predictions -- random both in what feature of observations they predict as well as in the sequence of actions the predictions are conditioned upon -- form good auxiliary tasks for reinforcement learning (RL) problems.

Atari Games, Reinforcement Learning (RL), +2

Adaptive Pairwise Weights for Temporal Credit Assignment

no code implementations • 9 Feb 2021 • Zeyu Zheng, Risto Vuorio, Richard Lewis, Satinder Singh

In this empirical paper, we explore heuristics based on more general pairwise weightings that are functions of the state in which the action was taken, the state at the time of the reward, as well as the time interval between the two.

Reinforcement Learning (RL)

Deep Reinforcement Learning for Multi-Driver Vehicle Dispatching and Repositioning Problem

no code implementations • 25 Nov 2019 • John Holler, Risto Vuorio, Zhiwei Qin, Xiaocheng Tang, Yan Jiao, Tiancheng Jin, Satinder Singh, Chenxi Wang, Jieping Ye

Order dispatching and driver repositioning (also known as fleet management) in the face of spatially and temporally varying supply and demand are central to a ride-sharing platform marketplace.

BIG-bench Machine Learning, Decision Making, +3

Multimodal Model-Agnostic Meta-Learning via Task-Aware Modulation

2 code implementations • NeurIPS 2019 • Risto Vuorio, Shao-Hua Sun, Hexiang Hu, Joseph J. Lim

Model-agnostic meta-learners aim to acquire meta-learned parameters from similar tasks to adapt to novel tasks from the same distribution with few gradient updates.

Few-Shot Image Classification, Few-Shot Learning, +3

Toward Multimodal Model-Agnostic Meta-Learning

no code implementations • 18 Dec 2018 • Risto Vuorio, Shao-Hua Sun, Hexiang Hu, Joseph J. Lim

One important limitation of such frameworks is that they seek a common initialization shared across the entire task distribution, substantially limiting the diversity of the task distributions that they are able to learn from.

Few-Shot Image Classification, Meta-Learning

Model-Agnostic Meta-Learning for Multimodal Task Distributions

no code implementations • 27 Sep 2018 • Risto Vuorio, Shao-Hua Sun, Hexiang Hu, Joseph J. Lim

In this paper, we augment MAML with the capability to identify tasks sampled from a multimodal task distribution and adapt quickly through gradient updates.

Few-Shot Image Classification, Meta-Learning

Meta Continual Learning

no code implementations • 11 Jun 2018 • Risto Vuorio, Dong-Yeon Cho, Daejoong Kim, Jiwon Kim

This ability is limited in the current deep neural networks by a problem called catastrophic forgetting, where training on new tasks tends to severely degrade performance on previous tasks.

Continual Learning
