Using trace-driven and real-world experiments, we demonstrate significant improvements in Comyco's sample efficiency over prior work: a 1700x reduction in the number of samples required and a 16x reduction in training time.
Imitation learning followed by reinforcement learning is a promising paradigm for solving complex control tasks sample-efficiently.
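This paradigm can be illustrated with a minimal sketch: first fit a policy to expert demonstrations (behavior cloning, the simplest form of imitation learning), then fine-tune it with a reward-driven update. The linear policy, the toy expert, and the numerical-gradient step below are all illustrative assumptions, not the method of any of the papers above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical expert demonstrations: states and the expert's actions,
# generated here by a hidden linear expert policy.
states = rng.normal(size=(256, 4))
expert_w = np.array([1.0, -2.0, 0.5, 0.0])
expert_actions = states @ expert_w

# Stage 1 -- imitation learning (behavior cloning):
# fit a linear policy to the expert by least squares.
w_bc, *_ = np.linalg.lstsq(states, expert_actions, rcond=None)

# Stage 2 -- reinforcement learning fine-tuning:
# one crude ascent step on a reward, using a finite-difference gradient.
def reward(w):
    # Toy reward: negative squared deviation from the expert's actions.
    return -np.mean((states @ w - expert_actions) ** 2)

eps = 1e-4
grad = np.array([
    (reward(w_bc + eps * np.eye(4)[i]) - reward(w_bc - eps * np.eye(4)[i]))
    / (2 * eps)
    for i in range(4)
])
w_rl = w_bc + 0.1 * grad  # single fine-tuning step

print(np.round(w_bc, 3))
```

In practice the imitation stage gives the RL stage a strong initialization, so fine-tuning needs far fewer environment interactions than learning from scratch; here the cloned policy already matches the expert, so the RL step barely moves it.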
We propose an interactive-predictive neural machine translation framework for easier model personalization using reinforcement and imitation learning.
Self-supervised methods, wherein an agent learns representations solely by observing the results of its actions, become crucial in environments that do not provide a dense reward signal or labels.
We consider the problem of imitation learning from expert demonstrations in partially observable Markov decision processes (POMDPs).
We propose to create such NPC behaviors interactively by training an agent in the target environment using imitation learning with a human in the loop.
This paper focuses on building a model that reasons about the long-term future and demonstrates how to use this for efficient planning and exploration.