1 code implementation • 24 Feb 2021 • Mingyu Cai, Mohammadhosein Hasanbeig, Shaoping Xiao, Alessandro Abate, Zhen Kan
This paper investigates the motion planning of autonomous dynamical systems modeled by Markov decision processes (MDPs) with unknown transition probabilities over continuous state and action spaces.
1 code implementation • 20 Jan 2021 • Mirco Giacobbe, Mohammadhosein Hasanbeig, Daniel Kroening, Hjalmar Wijk
We present the first exact method for analysing and ensuring the safety of DRL agents for Atari games.
no code implementations • 6 Jul 2020 • Thomas J. Ringstrom, Mohammadhosein Hasanbeig, Alessandro Abate
In Hierarchical Control, compositionality, abstraction, and task-transfer are crucial for designing versatile algorithms which can solve a variety of problems with maximal representational reuse.
no code implementations • 26 Feb 2020 • Mohammadhosein Hasanbeig, Alessandro Abate, Daniel Kroening
This paper presents the concept of an adaptive safe padding that forces Reinforcement Learning (RL) to synthesise optimal control policies while ensuring safety during the learning process.
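The shielding pattern behind safe padding can be illustrated with a minimal sketch (the helper names and the toy corridor task are ours, not the paper's): before executing the learner's proposed action, check it against a conservatively padded safe set and substitute a known-safe fallback otherwise.

```python
def shielded_step(state, proposed_action, is_safe, fallback_action):
    """Execute proposed_action only if it passes the safety check in
    `state`; otherwise substitute a known-safe fallback action."""
    if is_safe(state, proposed_action):
        return proposed_action
    return fallback_action(state)

# Toy example: a 1-D corridor over positions 0..10; actions are +1/-1 moves.
# The (conservatively padded) safe set keeps the agent away from both ends.
def is_safe(state, action):
    return 1 <= state + action <= 9

def fallback_action(state):
    return 1 if state < 5 else -1        # retreat toward the middle

allowed = shielded_step(5, 1, is_safe, fallback_action)    # passes the check
blocked = shielded_step(1, -1, is_safe, fallback_action)   # overridden to +1
```

The RL agent still learns from whichever action was actually executed, so exploration continues while the shield rules out unsafe moves during learning.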
1 code implementation • 22 Nov 2019 • Mohammadhosein Hasanbeig, Natasha Yogananda Jeppu, Alessandro Abate, Tom Melham, Daniel Kroening
This paper proposes DeepSynth, a method for effective training of deep Reinforcement Learning (RL) agents when the reward is sparse and non-Markovian, but at the same time progress towards the reward requires achieving an unknown sequence of high-level objectives.
2 code implementations • 23 Sep 2019 • Lim Zun Yuan, Mohammadhosein Hasanbeig, Alessandro Abate, Daniel Kroening
We propose an actor-critic, model-free, and online Reinforcement Learning (RL) framework for continuous-state continuous-action Markov Decision Processes (MDPs) when the reward is highly sparse but encompasses a high-level temporal structure.
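As generic scaffolding (not the paper's framework; the toy task, hyper-parameters, and names are ours), a one-step actor-critic with a Gaussian policy on a continuous state-action problem looks like this: the critic estimates a value baseline, and the actor takes a policy-gradient step weighted by the resulting advantage.

```python
import random

# Minimal one-step actor-critic sketch on a toy continuous task:
# state x ~ U(-1, 1), action a in R, reward r = -(a - x)^2,
# so the optimal policy mean is a = x (i.e. theta -> 1).
random.seed(0)
theta = 0.0                    # actor: Gaussian policy mean = theta * x
w, b = 0.0, 0.0                # critic: linear value function V(x) = w*x + b
sigma = 0.5                    # fixed exploration noise
alpha_actor, alpha_critic = 0.05, 0.1

for _ in range(5000):
    x = random.uniform(-1.0, 1.0)
    a = random.gauss(theta * x, sigma)       # sample from the Gaussian policy
    r = -(a - x) ** 2
    advantage = r - (w * x + b)              # one-step TD error (episodic)
    # Critic: gradient step on the squared TD error
    w += alpha_critic * advantage * x
    b += alpha_critic * advantage
    # Actor: policy gradient, grad log pi = (a - theta*x) / sigma^2 * x
    theta += alpha_actor * advantage * (a - theta * x) / sigma ** 2 * x

# theta should approach 1 (the policy mean tracks the state)
```

The critic's baseline reduces the variance of the actor's gradient estimate; a sparse, temporally structured reward as in the paper would replace the dense toy reward above.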
1 code implementation • 11 Sep 2019 • Mohammadhosein Hasanbeig, Yiannis Kantaros, Alessandro Abate, Daniel Kroening, George J. Pappas, Insup Lee
Reinforcement Learning (RL) has emerged as a method of choice for solving complex sequential decision-making problems in automatic control, computer science, economics, and biology.
no code implementations • 20 Sep 2018 • Mohammadhosein Hasanbeig, Alessandro Abate, Daniel Kroening
We propose a method for efficient training of Q-functions for continuous-state Markov Decision Processes (MDPs) such that the traces of the resulting policies satisfy a given Linear Temporal Logic (LTL) property.
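The core product construction can be sketched in a few lines (an illustrative toy, not the paper's method: we use a two-state automaton for the simple property "eventually reach the goal", a discretised 1-D state space, and reward on automaton acceptance; all names are ours).

```python
import random

random.seed(1)

N, GOAL = 6, 5                  # discretised 1-D state space; goal cell
ACTIONS = [-1, +1]

def automaton_step(q, cell):
    """Two-state automaton for 'F goal': q0 --goal--> q1 (accepting, absorbing)."""
    return 1 if (q == 1 or cell == GOAL) else 0

Q = {}
def get_q(s, q, a):
    return Q.get((s, q, a), 0.0)

alpha, gamma = 0.5, 0.9
for _ in range(300):
    s, q = 0, 0                               # product state: (MDP cell, automaton state)
    for _ in range(60):
        a = random.choice(ACTIONS)            # uniform exploration (off-policy)
        s2 = min(max(s + a, 0), N - 1)
        q2 = automaton_step(q, s2)
        r = 1.0 if (q2 == 1 and q == 0) else 0.0   # reward only on acceptance
        best = max(get_q(s2, q2, u) for u in ACTIONS)
        Q[(s, q, a)] = get_q(s, q, a) + alpha * (r + gamma * best - get_q(s, q, a))
        s, q = s2, q2
        if q == 1:                            # property satisfied; end episode
            break

# The greedy policy in the product space moves right, toward the goal.
```

A full LTL property would compile to a larger automaton (e.g. a limit-deterministic Büchi automaton), but the pattern is the same: learn Q-values over the on-the-fly product of the MDP state and the automaton state, with reward tied to the automaton's accepting condition.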
no code implementations • 7 Feb 2018 • Mohammadhosein Hasanbeig, Lacra Pavel
The main focus of this paper is on enhancement of two types of game-theoretic learning algorithms: log-linear learning and reinforcement learning.
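Log-linear learning itself is easy to state: at each step one randomly chosen player revises its action, sampling from a Boltzmann distribution over its own actions given the others' current play. A minimal sketch on a 2x2 coordination game (the game, temperature, and variable names are ours, purely for illustration):

```python
import math
import random

random.seed(2)

# Symmetric 2x2 coordination game: both players prefer to match,
# and (1, 1) is the payoff-dominant equilibrium.
U = {(0, 0): 1.0, (1, 1): 2.0, (0, 1): 0.0, (1, 0): 0.0}
tau = 0.5                  # temperature; smaller tau means sharper best responses
actions = [0, 0]
time_at_11 = 0

for t in range(4000):
    i = random.randrange(2)                    # player chosen to revise
    other = actions[1 - i]
    # Log-linear (Boltzmann) revision over the revising player's own actions
    weights = [math.exp(U[(a, other)] / tau) for a in (0, 1)]
    actions[i] = 0 if random.random() * sum(weights) < weights[0] else 1
    if t >= 2000 and actions == [1, 1]:
        time_at_11 += 1

# In the long run the process concentrates on the potential maximiser (1, 1).
```

In potential games the stationary distribution of this process is a Gibbs distribution over the potential, so as the temperature shrinks, play concentrates on potential-maximising equilibria; here that is the payoff-dominant profile (1, 1).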
1 code implementation • 24 Jan 2018 • Mohammadhosein Hasanbeig, Alessandro Abate, Daniel Kroening
With this reward function, the policy synthesis procedure is "constrained" by the given specification.