DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction

16 Mar 2020 · Aviral Kumar, Abhishek Gupta, Sergey Levine

Deep reinforcement learning can learn effective policies for a wide range of tasks, but is notoriously difficult to use due to instability and sensitivity to hyperparameters. The reasons for this remain unclear...


Results from the Paper


| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Meta-Learning | MT50 | DisCor | Average Success Rate | 26% | #3 |

Methods used in the Paper