Search Results for author: Katja Hofmann

Found 60 papers, 26 papers with code

Trajectory VAE for multi-modal imitation

no code implementations • ICLR 2019 • Xiaoyu Lu, Jan Stuehmer, Katja Hofmann

In this paper, we use a generative model to capture different emergent playstyles in an unsupervised manner, enabling the imitation of a diverse range of distinct behaviours.

Continuous Control Imitation Learning

Toward Human-AI Alignment in Large-Scale Multi-Player Games

no code implementations • 5 Feb 2024 • Sugandha Sharma, Guy Davidson, Khimya Khetarpal, Anssi Kanervisto, Udit Arora, Katja Hofmann, Ida Momennejad

First, we analyze extensive human gameplay data from Xbox's Bleeding Edge (100K+ games), uncovering behavioral patterns in a complex task space.

AI Agent

Transformer Neural Autoregressive Flows

no code implementations • 3 Jan 2024 • Massimiliano Patacchiola, Aliaksandra Shysheya, Katja Hofmann, Richard E. Turner

In this paper, we propose a novel solution to these challenges by exploiting transformers to define a new class of neural flows called Transformer Neural Autoregressive Flows (T-NAFs).

Density Estimation

Comparing the Efficacy of Fine-Tuning and Meta-Learning for Few-Shot Policy Imitation

1 code implementation • 23 Jun 2023 • Massimiliano Patacchiola, Mingfei Sun, Katja Hofmann, Richard E. Turner

Despite its simplicity, this baseline is competitive with meta-learning methods under a variety of conditions and is able to imitate target policies trained on unseen variations of the original environment.

Few-Shot Image Classification Few-Shot Imitation Learning +3

Navigates Like Me: Understanding How People Evaluate Human-Like AI in Video Games

no code implementations • 2 Mar 2023 • Stephanie Milani, Arthur Juliani, Ida Momennejad, Raluca Georgescu, Jaroslaw Rzepecki, Alison Shaw, Gavin Costello, Fei Fang, Sam Devlin, Katja Hofmann

We aim to understand how people assess human likeness in navigation produced by people and artificially intelligent (AI) agents in a video game.

AI Agent

Trust-Region-Free Policy Optimization for Stochastic Policies

no code implementations • 15 Feb 2023 • Mingfei Sun, Benjamin Ellis, Anuj Mahajan, Sam Devlin, Katja Hofmann, Shimon Whiteson

In this paper, we show that the trust region constraint over policies can be safely substituted by a trust-region-free constraint without compromising the underlying monotonic improvement guarantee.

Imitating Human Behaviour with Diffusion Models

1 code implementation • 25 Jan 2023 • Tim Pearce, Tabish Rashid, Anssi Kanervisto, Dave Bignell, Mingfei Sun, Raluca Georgescu, Sergio Valcarcel Macua, Shan Zheng Tan, Ida Momennejad, Katja Hofmann, Sam Devlin

This paper studies their application as observation-to-action models for imitating human behaviour in sequential environments.

UniMASK: Unified Inference in Sequential Decision Problems

1 code implementation • 20 Nov 2022 • Micah Carroll, Orr Paradise, Jessy Lin, Raluca Georgescu, Mingfei Sun, David Bignell, Stephanie Milani, Katja Hofmann, Matthew Hausknecht, Anca Dragan, Sam Devlin

Randomly masking and predicting word tokens has been a successful approach in pre-training language models for a variety of downstream tasks.

Decision Making
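UniMASK carries this masking recipe over from language to sequential decision making: states and actions become tokens, random subsets are hidden, and the model is trained to fill them in. A minimal sketch of the masking step, assuming a trajectory laid out as alternating state/action tokens (function and token names are illustrative, not from the paper's code):

```python
import random

def mask_sequence(tokens, mask_rate=0.3, mask_token="<MASK>", seed=0):
    """Randomly replace a fraction of tokens with a mask token.

    Returns the masked sequence and the indices of the masked
    positions, which serve as prediction targets during pre-training.
    """
    rng = random.Random(seed)
    masked, targets = [], []
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            masked.append(mask_token)
            targets.append(i)
        else:
            masked.append(tok)
    return masked, targets

# A toy state-action trajectory: s0, a0, s1, a1, ...
trajectory = ["s0", "a0", "s1", "a1", "s2", "a2"]
masked, targets = mask_sequence(trajectory, mask_rate=0.5)
```

Different choices of which positions to mask recover different inference problems (behavior cloning, goal-conditioned inference, and so on) from one trained model.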

Contextual Squeeze-and-Excitation for Efficient Few-Shot Image Classification

1 code implementation • 20 Jun 2022 • Massimiliano Patacchiola, John Bronskill, Aliaksandra Shysheya, Katja Hofmann, Sebastian Nowozin, Richard E. Turner

In this paper we push this Pareto frontier in the few-shot image classification setting with a key contribution: a new adaptive block called Contextual Squeeze-and-Excitation (CaSE) that adjusts a pretrained neural network on a new task to significantly improve performance with a single forward pass of the user data (context).

Few-Shot Image Classification Few-Shot Learning +1
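The core mechanic of a squeeze-and-excitation style block — pool context statistics into per-channel gates, then rescale features — can be sketched in a few lines. Here a single linear excitation layer stands in for the block's small MLP, and all names and values are illustrative:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def case_gates(context, weight, bias):
    """Compute per-channel gates from a context set: the 'squeeze' is a
    mean over context samples; the 'excitation' is a linear layer plus
    sigmoid (a simplification of the block's small MLP)."""
    n, c = len(context), len(context[0])
    squeezed = [sum(sample[ch] for sample in context) / n for ch in range(c)]
    return [sigmoid(weight[ch] * squeezed[ch] + bias[ch]) for ch in range(c)]

def apply_gates(features, gates):
    """Rescale each channel of a feature vector by its gate in (0, 1)."""
    return [f * g for f, g in zip(features, gates)]

context = [[1.0, -2.0], [3.0, 0.0]]   # two context samples, two channels
gates = case_gates(context, weight=[1.0, 1.0], bias=[0.0, 0.0])
out = apply_gates([2.0, 2.0], gates)
```

Because the gates are computed in a single forward pass over the context, adaptation to a new task needs no gradient steps.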

Interactively Learning Preference Constraints in Linear Bandits

1 code implementation • 10 Jun 2022 • David Lindner, Sebastian Tschiatschek, Katja Hofmann, Andreas Krause

We provide an instance-dependent lower bound for constrained linear best-arm identification and show that ACOL's sample complexity matches the lower bound in the worst-case.

Decision Making

You May Not Need Ratio Clipping in PPO

no code implementations • 31 Jan 2022 • Mingfei Sun, Vitaly Kurin, Guoqing Liu, Sam Devlin, Tao Qin, Katja Hofmann, Shimon Whiteson

Furthermore, we show that ESPO can be easily scaled up to distributed training with many workers, delivering strong performance as well.

Continuous Control
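ESPO replaces PPO's per-sample ratio clipping with early stopping of the update epoch once importance ratios drift too far from 1. A toy version of such a stopping rule — the threshold value and the use of mean absolute deviation here are illustrative choices, not the paper's exact criterion:

```python
def should_stop(ratios, threshold=0.25):
    """Decide whether to halt the current policy-update epoch.

    ratios: importance ratios pi_new(a|s) / pi_old(a|s) for a batch.
    Stop when the average deviation from 1 exceeds the threshold,
    rather than clipping each ratio individually as PPO does.
    """
    mean_dev = sum(abs(r - 1.0) for r in ratios) / len(ratios)
    return mean_dev > threshold
```

Because the check is a single batch statistic, it is cheap to evaluate on each worker, which fits the paper's observation that the method scales to distributed training.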

Trust Region Bounds for Decentralized PPO Under Non-stationarity

no code implementations • 31 Jan 2022 • Mingfei Sun, Sam Devlin, Jacob Beck, Katja Hofmann, Shimon Whiteson

We present trust region bounds for optimizing decentralized policies in cooperative Multi-Agent Reinforcement Learning (MARL), which hold even when the transition dynamics are non-stationary.

Multi-agent Reinforcement Learning

Strategically Efficient Exploration in Competitive Multi-agent Reinforcement Learning

1 code implementation • 30 Jul 2021 • Robert Loftin, Aadirupa Saha, Sam Devlin, Katja Hofmann

High sample complexity remains a barrier to the application of reinforcement learning (RL), particularly in multi-agent systems.

Efficient Exploration Multi-agent Reinforcement Learning +2

SocialAI: Benchmarking Socio-Cognitive Abilities in Deep Reinforcement Learning Agents

no code implementations • 2 Jul 2021 • Grgur Kovač, Rémy Portelas, Katja Hofmann, Pierre-Yves Oudeyer

In this paper, we argue that aiming towards human-level AI requires a broader set of key social skills: 1) language use in complex and variable social contexts; 2) beyond language, complex embodied communication in multimodal settings within constantly evolving social worlds.

Benchmarking reinforcement-learning +1

Grounding Spatio-Temporal Language with Transformers

1 code implementation • NeurIPS 2021 • Tristan Karch, Laetitia Teodorescu, Katja Hofmann, Clément Moulin-Frier, Pierre-Yves Oudeyer

While there is an extended literature studying how machines can learn grounded language, the topic of how to learn spatio-temporal linguistic concepts is still largely uncharted.

Navigation Turing Test (NTT): Learning to Evaluate Human-Like Navigation

1 code implementation • 20 May 2021 • Sam Devlin, Raluca Georgescu, Ida Momennejad, Jaroslaw Rzepecki, Evelyn Zuniga, Gavin Costello, Guy Leroy, Ali Shaw, Katja Hofmann

A key challenge on the path to developing agents that learn complex human-like behavior is the need to quickly and accurately quantify human-likeness.

SocialAI 0.1: Towards a Benchmark to Stimulate Research on Socio-Cognitive Abilities in Deep Reinforcement Learning Agents

no code implementations • 27 Apr 2021 • Grgur Kovač, Rémy Portelas, Katja Hofmann, Pierre-Yves Oudeyer

Building embodied autonomous agents capable of participating in social interactions with humans is one of the main challenges in AI.

TeachMyAgent: a Benchmark for Automatic Curriculum Learning in Deep RL

1 code implementation • 17 Mar 2021 • Clément Romac, Rémy Portelas, Katja Hofmann, Pierre-Yves Oudeyer

Training autonomous agents able to generalize to multiple tasks is a key target of Deep Reinforcement Learning (DRL) research.

Evaluating the Robustness of Collaborative Agents

no code implementations • 14 Jan 2021 • Paul Knott, Micah Carroll, Sam Devlin, Kamil Ciosek, Katja Hofmann, A. D. Dragan, Rohin Shah

We apply this methodology to build a suite of unit tests for the Overcooked-AI environment, and use this test suite to evaluate three proposals for improving robustness.

Meta Automatic Curriculum Learning

no code implementations • 16 Nov 2020 • Rémy Portelas, Clément Romac, Katja Hofmann, Pierre-Yves Oudeyer

In such complex task spaces, it is essential to rely on some form of Automatic Curriculum Learning (ACL) to adapt the task sampling distribution to a given learning agent, instead of randomly sampling tasks, as many could end up being either trivial or unfeasible.

Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning

1 code implementation • 2 Oct 2020 • Luisa Zintgraf, Leo Feng, Cong Lu, Maximilian Igl, Kristian Hartikainen, Katja Hofmann, Shimon Whiteson

To rapidly learn a new task, it is often essential for agents to explore efficiently -- especially when performance matters from the first timestep.

Meta-Learning Meta Reinforcement Learning +2

"It's Unwieldy and It Takes a Lot of Time." Challenges and Opportunities for Creating Agents in Commercial Games

no code implementations • 1 Sep 2020 • Mikhail Jacob, Sam Devlin, Katja Hofmann

We compare with literature from the research community that addresses the challenges identified, and conclude by highlighting promising directions for future research supporting agent creation in the games industry.

Guaranteeing Reproducibility in Deep Learning Competitions

no code implementations • 12 May 2020 • Brandon Houghton, Stephanie Milani, Nicholay Topin, William Guss, Katja Hofmann, Diego Perez-Liebana, Manuela Veloso, Ruslan Salakhutdinov

To encourage the development of methods with reproducible and robust training behavior, we propose a challenge paradigm where competitors are evaluated directly on the performance of their learning procedures rather than pre-trained agents.

AMRL: Aggregated Memory For Reinforcement Learning

no code implementations • ICLR 2020 • Jacob Beck, Kamil Ciosek, Sam Devlin, Sebastian Tschiatschek, Cheng Zhang, Katja Hofmann

In many partially observable scenarios, Reinforcement Learning (RL) agents must rely on long-term memory in order to learn an optimal policy.

reinforcement-learning Reinforcement Learning (RL)

SpatialSim: Recognizing Spatial Configurations of Objects with Graph Neural Networks

no code implementations • 9 Apr 2020 • Laetitia Teodorescu, Katja Hofmann, Pierre-Yves Oudeyer

Recognizing precise geometrical configurations of groups of objects is a key capability of human spatial cognition, yet little studied in the deep learning literature so far.

Trying AGAIN instead of Trying Longer: Prior Learning for Automatic Curriculum Learning

no code implementations • 7 Apr 2020 • Rémy Portelas, Katja Hofmann, Pierre-Yves Oudeyer

A major challenge in the Deep RL (DRL) community is to train agents able to generalize over unseen situations, which is often approached by training them on a diversity of tasks (or environments).

Automatic Curriculum Learning For Deep RL: A Short Survey

no code implementations • 10 Mar 2020 • Rémy Portelas, Cédric Colas, Lilian Weng, Katja Hofmann, Pierre-Yves Oudeyer

Automatic Curriculum Learning (ACL) has become a cornerstone of recent successes in Deep Reinforcement Learning (DRL). These methods shape the learning trajectories of agents by challenging them with tasks adapted to their capacities.

reinforcement-learning Reinforcement Learning (RL)

Better Exploration with Optimistic Actor Critic

1 code implementation • NeurIPS 2019 • Kamil Ciosek, Quan Vuong, Robert Loftin, Katja Hofmann

To address both of these phenomena, we introduce a new algorithm, Optimistic Actor Critic, which approximates a lower and upper confidence bound on the state-action value function.

Continuous Control Efficient Exploration
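The bounds can be formed from two bootstrapped critic estimates, taking their mean plus or minus a scaled spread. A minimal sketch — using the two-critic spread as the uncertainty proxy follows the spirit of the paper, but the function name and constants are illustrative:

```python
def q_bounds(q1, q2, beta_lb=1.0, beta_ub=1.0):
    """Approximate lower and upper confidence bounds on Q(s, a) from
    two critic estimates: mean -/+ beta * spread. The lower bound gives
    a conservative learning target; the upper bound directs exploration
    toward optimistically valued actions.
    """
    mean = 0.5 * (q1 + q2)
    spread = 0.5 * abs(q1 - q2)
    return mean - beta_lb * spread, mean + beta_ub * spread
```

When the two critics agree, the bounds collapse to their common estimate; disagreement widens the interval and hence the incentive to explore.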

Variational Integrator Networks for Physically Structured Embeddings

1 code implementation • 21 Oct 2019 • Steindor Saemundsson, Alexander Terenin, Katja Hofmann, Marc Peter Deisenroth

Learning workable representations of dynamical systems is becoming an increasingly important problem in a number of application areas.

Teacher algorithms for curriculum learning of Deep RL in continuously parameterized environments

2 code implementations • 16 Oct 2019 • Rémy Portelas, Cédric Colas, Katja Hofmann, Pierre-Yves Oudeyer

We consider the problem of how a teacher algorithm can enable an unknown Deep Reinforcement Learning (DRL) student to become good at a skill over a wide range of diverse environments.

Combining No-regret and Q-learning

1 code implementation • 7 Oct 2019 • Ian A. Kash, Michael Sullins, Katja Hofmann

Counterfactual Regret Minimization (CFR) has found success in settings like poker which have both terminal states and perfect recall.

counterfactual Q-Learning
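At the heart of CFR is regret matching: play each action with probability proportional to its positive cumulative regret, falling back to uniform when no action has positive regret. A self-contained sketch:

```python
def regret_matching(cum_regrets):
    """Map cumulative regrets to a policy over actions.

    Actions with higher positive regret (i.e. that would have earned
    more in hindsight) get proportionally more probability mass.
    """
    positive = [max(r, 0.0) for r in cum_regrets]
    total = sum(positive)
    n = len(cum_regrets)
    if total == 0.0:
        return [1.0 / n] * n
    return [p / total for p in positive]
```

In full CFR this rule is applied at every information set, and the average of the resulting policies converges toward equilibrium play.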

Near-Optimal Online Egalitarian learning in General Sum Repeated Matrix Games

no code implementations • 4 Jun 2019 • Aristide Tossou, Christos Dimitrakakis, Jaroslaw Rzepecki, Katja Hofmann

We study two-player general sum repeated finite games where the rewards of each player are generated from an unknown distribution.

The MineRL 2019 Competition on Sample Efficient Reinforcement Learning using Human Priors

1 code implementation • 22 Apr 2019 • William H. Guss, Cayden Codel, Katja Hofmann, Brandon Houghton, Noboru Kuno, Stephanie Milani, Sharada Mohanty, Diego Perez Liebana, Ruslan Salakhutdinov, Nicholay Topin, Manuela Veloso, Phillip Wang

To that end, we introduce: (1) the Minecraft ObtainDiamond task, a sequential decision making environment requiring long-term planning, hierarchical control, and efficient exploration methods; and (2) the MineRL-v0 dataset, a large-scale collection of over 60 million state-action pairs of human demonstrations that can be resimulated into embodied trajectories with arbitrary modifications to game state and visuals.

Decision Making Efficient Exploration +2

The Multi-Agent Reinforcement Learning in MalmÖ (MARLÖ) Competition

2 code implementations • 23 Jan 2019 • Diego Perez-Liebana, Katja Hofmann, Sharada Prasanna Mohanty, Noburu Kuno, Andre Kramer, Sam Devlin, Raluca D. Gaina, Daniel Ionita

Learning in multi-agent scenarios is a fruitful research direction, but current approaches still show scalability problems in multiple games with general reward settings and different opponent types.

Multi-agent Reinforcement Learning reinforcement-learning +1

Fast Context Adaptation via Meta-Learning

1 code implementation • 8 Oct 2018 • Luisa M. Zintgraf, Kyriacos Shiarlis, Vitaly Kurin, Katja Hofmann, Shimon Whiteson

We propose CAVIA for meta-learning, a simple extension to MAML that is less prone to meta-overfitting, easier to parallelise, and more interpretable.

General Classification Meta-Learning +3
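CAVIA's key idea — adapt only a small vector of context parameters per task while the shared weights stay frozen — reduces to a short inner loop. A toy sketch on a one-dimensional quadratic task (the gradient function, learning rate, and step count are illustrative, not the paper's settings):

```python
def adapt_context(context, grad_fn, lr=0.1, steps=5):
    """CAVIA-style inner loop: gradient steps on the context parameters
    only. grad_fn maps the current context to the task-loss gradient;
    shared model parameters are never touched here."""
    for _ in range(steps):
        grads = grad_fn(context)
        context = [c - lr * g for c, g in zip(context, grads)]
    return context

# Toy task: loss (c - 2)^2 has gradient 2 * (c - 2), so the single
# context parameter should move from 0 toward 2.
adapted = adapt_context([0.0], lambda ctx: [2.0 * (ctx[0] - 2.0)])
```

Keeping adaptation confined to a low-dimensional context vector is what makes the method less prone to meta-overfitting and easier to parallelise than adapting all weights as MAML does.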

CAML: Fast Context Adaptation via Meta-Learning

no code implementations • 27 Sep 2018 • Luisa M Zintgraf, Kyriacos Shiarlis, Vitaly Kurin, Katja Hofmann, Shimon Whiteson

We propose CAML, a meta-learning method for fast adaptation that partitions the model parameters into two parts: context parameters that serve as additional input to the model and are adapted on individual tasks, and shared parameters that are meta-trained and shared across tasks.

Depth and nonlinearity induce implicit exploration for RL

no code implementations • 29 May 2018 • Justas Dauparas, Ryota Tomioka, Katja Hofmann

The question of how to explore, i.e., take actions with uncertain outcomes to learn about possible future rewards, is a key question in reinforcement learning (RL).

Q-Learning reinforcement-learning +1

Variational Inference for Data-Efficient Model Learning in POMDPs

no code implementations • 23 May 2018 • Sebastian Tschiatschek, Kai Arulkumaran, Jan Stühmer, Katja Hofmann

In this paper we propose DELIP, an approach to model learning for POMDPs that utilizes amortized structured variational inference.

Decision Making Decision Making Under Uncertainty +2

Cross Domain Regularization for Neural Ranking Models Using Adversarial Learning

no code implementations • 9 May 2018 • Daniel Cohen, Bhaskar Mitra, Katja Hofmann, W. Bruce Croft

We use an adversarial discriminator and train our neural ranking model on a small set of domains.

Information Retrieval

Meta Reinforcement Learning with Latent Variable Gaussian Processes

no code implementations • 20 Mar 2018 • Steindór Sæmundsson, Katja Hofmann, Marc Peter Deisenroth

Learning from small data sets is critical in many practical applications where data collection is time consuming or expensive, e.g., robotics, animal experiments or drug design.

Gaussian Processes Meta-Learning +5

The Atari Grand Challenge Dataset

2 code implementations • 31 May 2017 • Vitaly Kurin, Sebastian Nowozin, Katja Hofmann, Lucas Beyer, Bastian Leibe

Recent progress in Reinforcement Learning (RL), fueled by its combination with Deep Learning, has enabled impressive results in learning to interact with complex virtual environments, yet real-world applications of RL are still scarce.

Imitation Learning Reinforcement Learning (RL)

Memory Lens: How Much Memory Does an Agent Use?

no code implementations • 21 Nov 2016 • Christoph Dann, Katja Hofmann, Sebastian Nowozin

The study of memory as information that flows from the past to the current action opens avenues to understand and improve successful reinforcement learning algorithms.

reinforcement-learning Reinforcement Learning (RL)

A Deep Learning Approach for Joint Video Frame and Reward Prediction in Atari Games

no code implementations • 21 Nov 2016 • Felix Leibfried, Nate Kushman, Katja Hofmann

Reinforcement learning is concerned with identifying reward-maximizing behaviour policies in environments that are initially unknown.

Atari Games Model-based Reinforcement Learning +2

Experimental and causal view on information integration in autonomous agents

no code implementations • 14 Jun 2016 • Philipp Geiger, Katja Hofmann, Bernhard Schölkopf

The amount of digitally available but heterogeneous information about the world is remarkable, and new technologies such as self-driving cars, smart homes, or the internet of things may further increase it.

Decision Making Self-Driving Cars +1

Contextual Dueling Bandits

no code implementations • 23 Feb 2015 • Miroslav Dudík, Katja Hofmann, Robert E. Schapire, Aleksandrs Slivkins, Masrour Zoghi

The first of these algorithms achieves particularly low regret, even when data is adversarial, although its time and space requirements are linear in the size of the policy space.
