no code implementations • 5 Jun 2023 • Patrick Benjamin, Alessandro Abate
For comparison purposes with our new architecture, we modify recent algorithms for the centralised and independent cases to make their practical convergence feasible: while contributing the first empirical demonstrations of these algorithms in our setting of $N$ agents learning along a single system evolution with only local state observability, we additionally display the empirical benefits of our new, networked approach.
no code implementations • 12 Apr 2023 • Maico Hendrikus Wilhelmus Engelaar, Licio Romao, Yulong Gao, Mircea Lazar, Alessandro Abate, Sofie Haesaert
We propose a correct-by-design controller synthesis framework for discrete-time linear stochastic systems that provides more flexibility to the overall abstraction framework of stochastic systems.
no code implementations • 10 Apr 2023 • Frederik Baymler Mathiesen, Licio Romao, Simeon C. Calvert, Alessandro Abate, Luca Laurenti
Our approach is based on a novel method to synthesise a stochastic barrier function from noise data.
no code implementations • 2 Apr 2023 • Licio Romao, Ashish R. Hota, Alessandro Abate
We develop a distributionally robust framework to perform dynamic programming using kernel methods and apply our framework to design feedback control policies that satisfy safety and optimality specifications.
no code implementations • 30 Mar 2023 • Adrien Banse, Licio Romao, Alessandro Abate, Raphaël M. Jungers
In order to learn the optimal structure, we define a Kantorovich-inspired metric between Markov chains, and we use it as a loss function.
no code implementations • 23 Mar 2023 • Zifan Wang, Yulong Gao, Siyi Wang, Michael M. Zavlanos, Alessandro Abate, Karl H. Johansson
Distributional reinforcement learning (DRL) enhances the understanding of the effects of the randomness in the environment by letting agents learn the distribution of a random return, rather than its expected value as in standard RL.
1 code implementation • 27 Jan 2023 • Alessandro Abate, Alec Edwards, Mirco Giacobbe
We present a novel method for the safety verification of nonlinear dynamical models that uses neural networks to represent abstractions of their dynamics.
no code implementations • 5 Jan 2023 • Lewis Hammond, James Fox, Tom Everitt, Ryan Carey, Alessandro Abate, Michael Wooldridge
Regarding question iii), we describe correspondences between causal games and other formalisms, and explain how causal games can be used to answer queries that other causal or game-theoretic models do not support.
1 code implementation • 4 Jan 2023 • Thom Badings, Licio Romao, Alessandro Abate, David Parker, Hasan A. Poonawala, Marielle Stoelinga, Nils Jansen
This iMDP is, with a user-specified confidence probability, robust against uncertainty in the transition probabilities, and the tightness of the probability intervals can be controlled through the number of samples.
1 code implementation • 28 Dec 2022 • Joar Skalse, Lewis Hammond, Charlie Griffin, Alessandro Abate
In this work we introduce reinforcement learning techniques for solving lexicographic multi-objective problems.
Multi-Objective Reinforcement Learning
no code implementations • 6 Dec 2022 • Joar Skalse, Alessandro Abate
In this paper, we provide a mathematical analysis of how robust different IRL models are to misspecification, and answer precisely how the demonstrator policy may differ from each of the standard models before that model leads to faulty inferences about the reward function $R$.
no code implementations • 4 Dec 2022 • Adrien Banse, Licio Romao, Alessandro Abate, Raphaël M. Jungers
We propose a sample-based, sequential method to abstract a (potentially black-box) dynamical system with a sequence of memory-dependent Markov chains of increasing size.
no code implementations • 1 Dec 2022 • Luke Rickard, Thom Badings, Licio Romao, Alessandro Abate
We consider the cases where the transition probabilities of this MDP are either known up to an interval or completely unknown.
1 code implementation • 12 Oct 2022 • Thom Badings, Licio Romao, Alessandro Abate, Nils Jansen
Stochastic noise causes aleatoric uncertainty, whereas imprecise knowledge of model parameters leads to epistemic uncertainty.
no code implementations • 30 Sep 2022 • Daniel Jarne Ornia, Licio Romao, Lewis Hammond, Manuel Mazo Jr., Alessandro Abate
Policy gradient algorithms that have strong convergence guarantees are usually modified to obtain robust policies in ways that do not preserve algorithm guarantees, which defeats the purpose of formal robustness requirements.
1 code implementation • 21 Sep 2022 • Hosein Hasanbeig, Daniel Kroening, Alessandro Abate
LCRL is a software tool that implements model-free Reinforcement Learning (RL) algorithms over unknown Markov Decision Processes (MDPs), synthesising policies that satisfy a given linear temporal specification with maximal probability.
no code implementations • 25 Aug 2022 • Alessandro Abate, Yousif Almulla, James Fox, David Hyland, Michael Wooldridge
Second, we propose a novel method for distilling the task automaton (assumed to be a deterministic finite automaton) from the learnt product MDP.
no code implementations • 12 Aug 2022 • Scott R. Jeen, Alessandro Abate, Jonathan M. Cullen
Heating and cooling systems in buildings account for 31% of global energy use, much of which is regulated by Rule Based Controllers (RBCs) that neither maximise energy efficiency nor minimise emissions by interacting optimally with the grid.
no code implementations • 28 Jun 2022 • Scott R. Jeen, Alessandro Abate, Jonathan M. Cullen
Heating and cooling systems in buildings account for 31% of global energy use, much of which is regulated by Rule Based Controllers (RBCs) that neither maximise energy efficiency nor minimise emissions by interacting optimally with the grid.
no code implementations • 14 Mar 2022 • Joar Skalse, Matthew Farrugia-Roberts, Stuart Russell, Alessandro Abate, Adam Gleave
It is often very challenging to manually design reward functions for complex, real-world tasks.
no code implementations • 25 Oct 2021 • Thom S. Badings, Alessandro Abate, Nils Jansen, David Parker, Hasan A. Poonawala, Marielle Stoelinga
We use state-of-the-art verification techniques to provide guarantees on the iMDP, and compute a controller for which these guarantees carry over to the autonomous system.
1 code implementation • 21 May 2021 • Matthew Wicker, Luca Laurenti, Andrea Patane, Nicola Paoletti, Alessandro Abate, Marta Kwiatkowska
We consider the problem of computing reach-avoid probabilities for iterative predictions made with Bayesian neural network (BNN) models.
1 code implementation • 24 Feb 2021 • Mingyu Cai, Mohammadhosein Hasanbeig, Shaoping Xiao, Alessandro Abate, Zhen Kan
This paper investigates the motion planning of autonomous dynamical systems modeled by Markov decision processes (MDP) with unknown transition probabilities over continuous state and action spaces.
1 code implementation • 9 Feb 2021 • Lewis Hammond, James Fox, Tom Everitt, Alessandro Abate, Michael Wooldridge
Multi-agent influence diagrams (MAIDs) are a popular form of graphical model that, for certain classes of games, have been shown to offer key complexity and explainability advantages over traditional extensive form game (EFG) representations.
1 code implementation • 1 Feb 2021 • Lewis Hammond, Alessandro Abate, Julian Gutierrez, Michael Wooldridge
In this paper, we study the problem of learning to satisfy temporal logic specifications with a group of agents in an unknown environment, which may exhibit probabilistic behaviour.
Multi-agent Reinforcement Learning
1 code implementation • 7 Aug 2020 • Kyriakos Polymenakos, Nikitas Rontsis, Alessandro Abate, Stephen Roberts
SafePILCO is a software tool for safe and data-efficient policy search with reinforcement learning.
no code implementations • 21 Jul 2020 • Daniele Ahmed, Andrea Peruffo, Alessandro Abate
In this paper we employ SMT solvers to soundly synthesise Lyapunov functions that assert the stability of a given dynamical model.
no code implementations • 7 Jul 2020 • Andrea Peruffo, Daniele Ahmed, Alessandro Abate
We introduce an automated, formal, counterexample-based approach to synthesise Barrier Certificates (BC) for the safety verification of continuous and hybrid dynamical models.
no code implementations • 6 Jul 2020 • Thomas J. Ringstrom, Mohammadhosein Hasanbeig, Alessandro Abate
In Hierarchical Control, compositionality, abstraction, and task-transfer are crucial for designing versatile algorithms which can solve a variety of problems with maximal representational reuse.
1 code implementation • NeurIPS 2020 • Francesco Cosentino, Harald Oberhauser, Alessandro Abate
Given a discrete probability measure supported on $N$ atoms and a set of $n$ real-valued functions, there exists a probability measure that is supported on a subset of $n+1$ of the original $N$ atoms and has the same mean when integrated against each of the $n$ functions.
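The existence statement above can be illustrated with a minimal sketch of the textbook null-space elimination argument. This is not necessarily the algorithm proposed in the paper; the function name `caratheodory_reduce`, the argument layout (`values[i, j]` holding the $j$-th function evaluated at the $i$-th atom), and the tolerance `tol` are all assumptions made for illustration.

```python
import numpy as np

def caratheodory_reduce(values, weights, tol=1e-12):
    """Reduce a discrete measure to at most n+1 atoms, preserving n means.

    values:  (N, n) array, values[i, j] = f_j(x_i)  (hypothetical layout)
    weights: (N,) probability vector on the N atoms
    """
    values = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float).copy()
    n_atoms, n_funcs = values.shape
    # Append a constant row so that the total mass is preserved as well.
    A = np.hstack([values, np.ones((n_atoms, 1))]).T  # shape (n+1, N)
    support = np.flatnonzero(w > tol)
    while support.size > n_funcs + 1:
        # Any n+2 columns of a matrix with n+1 rows are linearly dependent.
        idx = support[: n_funcs + 2]
        _, _, vt = np.linalg.svd(A[:, idx])
        v = vt[-1]  # null-space direction: A[:, idx] @ v is (numerically) zero
        # The constant row forces sum(v) = 0, so v has positive entries.
        pos = v > tol
        ratios = np.where(pos, w[idx] / np.where(pos, v, 1.0), np.inf)
        k = int(np.argmin(ratios))
        # Move along v until the first weight hits zero; the rest stay >= 0.
        w[idx] = np.maximum(w[idx] - ratios[k] * v, 0.0)
        w[idx[k]] = 0.0
        w[w <= tol] = 0.0
        support = np.flatnonzero(w > tol)
    return w

# Usage: N = 50 atoms on the real line, n = 3 test functions.
rng = np.random.default_rng(0)
atoms = rng.normal(size=50)
values = np.stack([atoms, atoms**2, np.sin(atoms)], axis=1)
weights = rng.dirichlet(np.ones(50))
reduced = caratheodory_reduce(values, weights)
```

Each pass removes at least one atom from the support while leaving the $n$ means and the total mass unchanged, so the loop stops once at most $n+1$ atoms carry weight.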
1 code implementation • 2 Jun 2020 • Francesco Cosentino, Harald Oberhauser, Alessandro Abate
Various flavours of Stochastic Gradient Descent (SGD) replace the expensive summation that computes the full gradient by approximating it with a small sum over a randomly selected subsample of the data set that in turn suffers from a high variance.
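The trade-off described above can be sketched on a toy least-squares problem: the minibatch gradient is an unbiased estimate of the full gradient, but individual estimates scatter widely around it. The data, the loss, the batch size, and the helper names `full_gradient` and `minibatch_gradient` are assumptions chosen for illustration and do not come from the paper.

```python
import numpy as np

def full_gradient(X, y, theta):
    """Gradient of the mean squared error (1/N) * sum_i (x_i . theta - y_i)^2."""
    residual = X @ theta - y
    return 2.0 * X.T @ residual / len(y)

def minibatch_gradient(X, y, theta, batch_size, rng):
    """Same gradient, computed on a uniformly random subsample of the data."""
    idx = rng.choice(len(y), size=batch_size, replace=False)
    return full_gradient(X[idx], y[idx], theta)

# Synthetic regression data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = X @ np.arange(1.0, 6.0) + rng.normal(scale=0.1, size=1000)
theta = np.zeros(5)

g_full = full_gradient(X, y, theta)
estimates = np.stack(
    [minibatch_gradient(X, y, theta, 10, rng) for _ in range(2000)]
)
# Averaged over many draws the estimates approach the full gradient ...
mean_error = np.linalg.norm(estimates.mean(axis=0) - g_full)
# ... but each single estimate has a per-coordinate spread around it.
spread = estimates.std(axis=0)
```

Comparing `mean_error` with `spread` shows the point of the sentence: the subsampled sum is cheap and correct on average, while any single draw can be far from the full gradient.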
no code implementations • 19 Mar 2020 • Alessandro Abate, Daniele Ahmed, Mirco Giacobbe, Andrea Peruffo
We employ a counterexample-guided approach where a numerical learner and a symbolic verifier interact to construct provably correct Lyapunov neural networks (LNNs).
no code implementations • 26 Feb 2020 • Mohammadhosein Hasanbeig, Alessandro Abate, Daniel Kroening
This paper presents the concept of an adaptive safe padding that forces Reinforcement Learning (RL) to synthesise optimal control policies while ensuring safety during the learning process.
no code implementations • 29 Nov 2019 • Kyriakos Polymenakos, Luca Laurenti, Andrea Patane, Jan-Peter Calliess, Luca Cardelli, Marta Kwiatkowska, Alessandro Abate, Stephen Roberts
Gaussian Processes (GPs) are widely employed in control and learning because of their principled treatment of uncertainty.
1 code implementation • 22 Nov 2019 • Mohammadhosein Hasanbeig, Natasha Yogananda Jeppu, Alessandro Abate, Tom Melham, Daniel Kroening
This paper proposes DeepSynth, a method for effective training of deep Reinforcement Learning (RL) agents when the reward is sparse and non-Markovian, but at the same time progress towards the reward requires achieving an unknown sequence of high-level objectives.
2 code implementations • 23 Sep 2019 • Lim Zun Yuan, Mohammadhosein Hasanbeig, Alessandro Abate, Daniel Kroening
We propose an actor-critic, model-free, and online Reinforcement Learning (RL) framework for continuous-state continuous-action Markov Decision Processes (MDPs) when the reward is highly sparse but encompasses a high-level temporal structure.
1 code implementation • 11 Sep 2019 • Mohammadhosein Hasanbeig, Yiannis Kantaros, Alessandro Abate, Daniel Kroening, George J. Pappas, Insup Lee
Reinforcement Learning (RL) has emerged as an efficient method of choice for solving complex sequential decision making problems in automatic control, computer science, economics, and biology.
1 code implementation • 2 Feb 2019 • Hosein Hasanbeig, Daniel Kroening, Alessandro Abate
Reinforcement Learning (RL) is a widely employed machine learning architecture that has been applied to a variety of control problems.
no code implementations • 20 Sep 2018 • Mohammadhosein Hasanbeig, Alessandro Abate, Daniel Kroening
We propose a method for efficient training of Q-functions for continuous-state Markov Decision Processes (MDPs) such that the traces of the resulting policies satisfy a given Linear Temporal Logic (LTL) property.
1 code implementation • 24 Jan 2018 • Mohammadhosein Hasanbeig, Alessandro Abate, Daniel Kroening
With this reward function, the policy synthesis procedure is "constrained" by the given specification.
1 code implementation • 15 Dec 2017 • Kyriakos Polymenakos, Alessandro Abate, Stephen Roberts
We propose a method to optimise the parameters of a policy which will be used to safely perform a given task in a data-efficient manner.
no code implementations • 5 Jul 2017 • Elizabeth Polgreen, Viraj Wijesuriya, Sofie Haesaert, Alessandro Abate
We present a new method for statistical verification of quantitative properties over a partially unknown system with actions, utilising a parameterised model (in this work, a parametric Markov decision process) and data collected from experiments performed on the underlying system.
no code implementations • 1 Sep 2014 • Sofie Haesaert, Robert Babuska, Alessandro Abate
This article deals with stochastic processes endowed with the Markov (memoryless) property and evolving over general (uncountable) state spaces.