no code implementations • 19 Jan 2023 • Jacob Beck, Risto Vuorio, Evan Zheran Liu, Zheng Xiong, Luisa Zintgraf, Chelsea Finn, Shimon Whiteson
Meta-RL is most commonly studied in a problem setting where, given a distribution of tasks, the goal is to learn a policy that is capable of adapting to any new task from the task distribution with as little data as possible.
no code implementations • 26 Jun 2022 • Darius Muglich, Luisa Zintgraf, Christian Schroeder de Witt, Shimon Whiteson, Jakob Foerster
Self-play is a common paradigm for constructing solutions in Markov games that can yield optimal policies in collaborative settings.
no code implementations • 1 Dec 2021 • Zheng Xiong, Luisa Zintgraf, Jacob Beck, Risto Vuorio, Shimon Whiteson
We further find that theoretically inconsistent algorithms can be made consistent by continuing to update all agent components on the OOD tasks, and adapt as well or better than originally consistent ones.
1 code implementation • 17 Jul 2021 • Samuel Sokota, Christian Schroeder de Witt, Maximilian Igl, Luisa Zintgraf, Philip Torr, Martin Strohmeier, J. Zico Kolter, Shimon Whiteson, Jakob Foerster
We contribute a theoretically grounded approach to MCGs based on maximum entropy reinforcement learning and minimum entropy coupling that we call MEME.
no code implementations • 21 Jun 2021 • Alexandra Moringen, Sören Rüttgers, Luisa Zintgraf, Jason Friedman, Helge Ritter
Ideally, a focus on a particular practice method should be made in a way to maximize the learner's progress in learning to play the piano.
no code implementations • 17 Apr 2021 • Eltayeb Ahmed, Luisa Zintgraf, Christian A. Schroeder de Witt, Nicolas Usunier
In this work we explore an auxiliary loss useful for reinforcement learning in environments where strong performing agents are required to be able to navigate a spatial environment.
1 code implementation • ICCV 2021 • Daniela Massiceti, Luisa Zintgraf, John Bronskill, Lida Theodorou, Matthew Tobias Harris, Edward Cutrell, Cecily Morrison, Katja Hofmann, Simone Stumpf
To close this gap, we present the ORBIT dataset and benchmark, grounded in the real-world application of teachable object recognizers for people who are blind/low-vision.
no code implementations • 11 Jan 2021 • Luisa Zintgraf, Sam Devlin, Kamil Ciosek, Shimon Whiteson, Katja Hofmann
The optimal adaptive behaviour under uncertainty over the other agents' strategies w. r. t.
1 code implementation • 2 Oct 2020 • Luisa Zintgraf, Leo Feng, Cong Lu, Maximilian Igl, Kristian Hartikainen, Katja Hofmann, Shimon Whiteson
To rapidly learn a new task, it is often essential for agents to explore efficiently -- especially when performance matters from the first timestep.
no code implementations • 29 Nov 2019 • Leo Feng, Luisa Zintgraf, Bei Peng, Shimon Whiteson
In few-shot learning, typically, the loss function which is applied at test time is the one we are ultimately interested in minimising, such as the mean-squared-error loss for a regression problem.
3 code implementations • ICLR 2020 • Luisa Zintgraf, Kyriacos Shiarlis, Maximilian Igl, Sebastian Schulze, Yarin Gal, Katja Hofmann, Shimon Whiteson
Trading off exploration and exploitation in an unknown environment is key to maximising expected return during learning.
1 code implementation • ICML 2018 • Maximilian Igl, Luisa Zintgraf, Tuan Anh Le, Frank Wood, Shimon Whiteson
Many real-world sequential decision making problems are partially observable by nature, and the environment model is typically unknown.