
Imitation Learning

33 papers with code · Methodology

State-of-the-art leaderboards

No evaluation results yet. Help compare methods by submitting evaluation metrics.

Greatest papers with code

Neural Modular Control for Embodied Question Answering

26 Oct 2018 · facebookresearch/House3D

Independent reinforcement learning at each level of the hierarchy enables sub-policies to adapt to the consequences of their actions and recover from errors. Subsequent joint hierarchical training enables the master policy to adapt to the sub-policies.

EMBODIED QUESTION ANSWERING IMITATION LEARNING QUESTION ANSWERING

STARDATA: A StarCraft AI Research Dataset

7 Aug 2017 · TorchCraft/StarData

We provide full game state data along with the original replays that can be viewed in StarCraft. We illustrate the diversity of the data with various statistics and provide examples of tasks that benefit from the dataset.

IMITATION LEARNING STARCRAFT

Gated-Attention Architectures for Task-Oriented Language Grounding

22 Jun 2017 · devendrachaplot/DeepRL-Grounding

To perform tasks specified by natural language instructions, autonomous agents need to extract semantically meaningful representations of language and map them to visual elements and actions in the environment. This problem is called task-oriented language grounding.

IMITATION LEARNING

Self-Imitation Learning

ICML 2018 junhyukoh/self-imitation-learning

This paper proposes Self-Imitation Learning (SIL), a simple off-policy actor-critic algorithm that learns to reproduce the agent's past good decisions. This algorithm is designed to verify our hypothesis that exploiting past good experiences can indirectly drive deep exploration.

ATARI GAMES IMITATION LEARNING
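The core idea of SIL, imitating only those past transitions whose observed return beat the current value estimate, can be sketched as a loss function. This is a minimal illustration of the clipped-advantage objective described above, not the paper's implementation; all variable names are illustrative.

```python
import numpy as np

def sil_loss(log_prob, value, returns):
    """Self-Imitation Learning loss (sketch): only transitions whose
    observed return R exceeds the current value estimate V contribute,
    so the agent reproduces its own past good decisions."""
    advantage = np.maximum(returns - value, 0.0)   # (R - V)+ clipping
    policy_loss = -(log_prob * advantage).mean()   # imitate profitable actions
    value_loss = 0.5 * (advantage ** 2).mean()     # pull V up toward R
    return policy_loss + value_loss
```

Transitions with `returns <= value` are masked out entirely, which is what makes the objective "self-imitation" rather than plain off-policy policy gradient.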

InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations

NeurIPS 2017 YunzhuLi/InfoGAIL

The goal of imitation learning is to mimic expert behavior without access to an explicit reward signal. Expert demonstrations provided by humans, however, often show significant variability due to latent factors that are typically not explicitly modeled.

IMITATION LEARNING

Dialogue Learning with Human Teaching and Feedback in End-to-End Trainable Task-Oriented Dialogue Systems

HLT 2018 google-research-datasets/simulated-dialogue

Popular methods for learning task-oriented dialogues apply reinforcement learning with user feedback on top of supervised pre-trained models, but learning effectively from user interaction remains challenging. To address this challenge, we propose a hybrid imitation and reinforcement learning method with which a dialogue agent can effectively learn from its interactions with users, learning from human teaching and feedback.

DIALOGUE STATE TRACKING IMITATION LEARNING TASK-ORIENTED DIALOGUE SYSTEMS

Variational Discriminator Bottleneck: Improving Imitation Learning, Inverse RL, and GANs by Constraining Information Flow

ICLR 2019 akanimax/Variational_Discriminator_Bottleneck

Adversarial learning methods have been proposed for a wide range of applications, but the training of adversarial models can be notoriously unstable. By enforcing a constraint on the mutual information between the observations and the discriminator's internal representation, we can effectively modulate the discriminator's accuracy and maintain useful and informative gradients.

CONTINUOUS CONTROL IMAGE GENERATION IMITATION LEARNING
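The mutual-information constraint described above can be sketched as a KL penalty on the discriminator's stochastic latent code, weighted by a Lagrange multiplier. This is a schematic numpy illustration of the bottleneck objective, assuming a Gaussian encoder; the names (`beta`, `ic`) and interfaces are illustrative, not the paper's code.

```python
import numpy as np

def vdb_discriminator_loss(bce_loss, mu, logvar, beta, ic):
    """Variational Discriminator Bottleneck (sketch): penalize the KL
    divergence between the encoder distribution N(mu, sigma^2) and a
    standard normal prior whenever it exceeds the information target ic.
    beta is a Lagrange multiplier, updated elsewhere by dual ascent."""
    # KL( N(mu, sigma^2) || N(0, I) ), averaged over the batch
    kl = 0.5 * np.mean(np.sum(mu**2 + np.exp(logvar) - logvar - 1.0, axis=1))
    return bce_loss + beta * (kl - ic), kl
```

Capping the KL term limits how much information about the inputs reaches the discriminator's decision, which keeps its accuracy in a regime where gradients stay informative for the generator or policy.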

Go-Explore: a New Approach for Hard-Exploration Problems

30 Jan 2019uber-research/go-explore

Go-Explore can also harness human-provided domain knowledge and, when augmented with it, scores a mean of over 650k points on Montezuma's Revenge. On Pitfall, Go-Explore with domain knowledge is the first algorithm to score above zero.

IMITATION LEARNING MONTEZUMA'S REVENGE

CoDraw: Collaborative Drawing as a Testbed for Grounded Goal-driven Communication

ICLR 2019 facebookresearch/CoDraw

Our game is grounded in a virtual world that contains movable clip art objects. The game involves two players: a Teller and a Drawer.

IMITATION LEARNING

Query-Efficient Imitation Learning for End-to-End Autonomous Driving

20 May 2016 · mbhenaff/EEN

A policy trained by supervised imitation, however, is known to suffer from unexpected behaviours due to the mismatch between the states reachable by the reference policy and by the trained policy. In this paper, we propose an extension of DAgger, called SafeDAgger, that is query-efficient and more suitable for end-to-end autonomous driving.

AUTONOMOUS DRIVING IMITATION LEARNING
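The query-efficiency idea above can be sketched as a rollout loop in which a learned safety function decides, state by state, whether to trust the trained policy or to fall back to the reference policy and record its label. This is a hedged sketch of the SafeDAgger-style control flow, not the paper's implementation; `env`, `policy`, `expert`, and `safe` are hypothetical callables.

```python
def safedagger_rollout(env, policy, expert, safe, horizon=100):
    """Roll out the trained policy, querying the reference (expert)
    policy only in states the safety function flags as risky; those
    expert-labeled states are aggregated for the next training round."""
    dataset, queries = [], 0
    s = env.reset()
    for _ in range(horizon):
        if safe(s):                 # policy predicted to stay close to expert
            a = policy(s)
        else:                       # deviation predicted: fall back and label
            a = expert(s)
            dataset.append((s, a))  # aggregate expert-labeled state
            queries += 1
        s, done = env.step(a)
        if done:
            break
    return dataset, queries
```

Because the expert is consulted only when `safe(s)` is false, the number of queries per rollout is a fraction of the horizon, which is what makes the scheme query-efficient relative to vanilla DAgger.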