Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations

12 Apr 2019 · Daniel S. Brown, Wonjoon Goo, Prabhat Nagarajan, Scott Niekum

A critical flaw of existing inverse reinforcement learning (IRL) methods is their inability to significantly outperform the demonstrator. This is because IRL typically seeks a reward function that makes the demonstrator appear near-optimal, rather than inferring the underlying intentions of the demonstrator that may have been poorly executed in practice.
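The flaw described above — fitting a reward under which suboptimal demonstrations look near-optimal — can be contrasted with learning a reward from *ranked* demonstrations, which is the direction the paper's title points to. Below is a minimal, illustrative sketch (not the authors' code; all names and the synthetic setup are assumptions) of training a linear reward with a pairwise Bradley–Terry ranking loss, so that higher-ranked trajectories receive higher predicted return:

```python
import numpy as np

rng = np.random.default_rng(0)

def traj_return(w, traj):
    """Total predicted reward of a trajectory: sum of w . s over its states."""
    return float(np.sum(traj @ w))

def ranking_loss_grad(w, worse, better):
    """Gradient of -log sigmoid(R(better) - R(worse)) with respect to w."""
    diff = traj_return(w, better) - traj_return(w, worse)
    sig = 1.0 / (1.0 + np.exp(-diff))
    return -(1.0 - sig) * (better.sum(axis=0) - worse.sum(axis=0))

# Synthetic demo (hypothetical data): trajectories of noisy state features,
# ranked worst-to-best by an unobserved true reward w_true.
d, T = 3, 20
w_true = np.array([1.0, -0.5, 0.25])
trajs = [rng.normal(size=(T, d)) for _ in range(8)]
ranked = sorted(trajs, key=lambda tr: np.sum(tr @ w_true))

# Stochastic gradient descent on random ranked pairs (i worse than j).
w = np.zeros(d)
for _ in range(500):
    i, j = sorted(rng.choice(len(ranked), size=2, replace=False))
    w -= 0.05 * ranking_loss_grad(w, ranked[i], ranked[j])

# The learned weights should point in roughly the same direction as the
# true reward, even though none of the ranked trajectories is optimal.
cos = w @ w_true / (np.linalg.norm(w) * np.linalg.norm(w_true))
```

The key design point this sketch shows: ranking supervision constrains the *ordering* of returns rather than anchoring the reward to the demonstrator's behavior, which is what allows extrapolation beyond suboptimal demonstrations.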

