Specifying Non-Markovian Rewards in MDPs Using LDL on Finite Traces (Preliminary Version)

25 Jun 2017 · Ronen Brafman, Giuseppe De Giacomo, Fabio Patrizi

In Markov Decision Processes (MDPs), the reward obtained in a state depends on the properties of the last state and action. This state dependency makes it difficult to reward more interesting long-term behaviors, such as always closing a door after it has been opened, or providing coffee only following a request...
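
The difficulty the abstract points at is that a reward of the form R(s, a) cannot see the history of the trace, so a property like "reward serving coffee only if a request is still pending" is not expressible directly. A minimal sketch of the standard remedy in this line of work is shown below: track a small amount of extra state (here a single "pending request" bit, playing the role of an automaton state) alongside the MDP state, so the per-step reward again depends only on the current (extended) state and action. This is a hand-written illustration, not the paper's LDLf-to-automaton construction; the action names and the `product_step` helper are assumptions made for the sketch.

```python
# Hypothetical illustration: a trace-level reward ("+1 for serving coffee only
# after an unserved request") versus the same reward made Markovian by tracking
# one extra bit of state ("is a request pending?").

def non_markovian_reward(trace):
    """Reward over a whole trace: +1 per coffee served while a request is pending."""
    reward, pending = 0, False
    for action in trace:
        if action == "request":
            pending = True
        elif action == "serve_coffee":
            reward += 1 if pending else 0
            pending = False
    return reward

def product_step(mdp_state, pending, action):
    """One step of the extended ("product") process: the per-step reward depends
    only on the extended state (mdp_state, pending) and the action."""
    if action == "request":
        return mdp_state, True, 0
    if action == "serve_coffee":
        return mdp_state, False, (1 if pending else 0)
    return mdp_state, pending, 0  # MDP dynamics themselves are omitted in this sketch

# The trace-level reward and the accumulated per-step rewards agree:
trace = ["move", "request", "move", "serve_coffee", "serve_coffee"]
total, pending = 0, False
for a in trace:
    _, pending, r = product_step(None, pending, a)
    total += r
assert total == non_markovian_reward(trace) == 1
```

In the paper's setting the extra state would come from an automaton compiled from an LDLf formula rather than being hand-coded as above.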
