Search Results for author: Florent Delgrange

Found 6 papers, 2 papers with code

Synthesis of Hierarchical Controllers Based on Deep Reinforcement Learning Policies

no code implementations • 21 Feb 2024 • Florent Delgrange, Guy Avni, Anna Lukina, Christian Schilling, Ann Nowé, Guillermo A. Pérez

We propose a novel approach to the problem of controller design for environments modeled as Markov decision processes (MDPs).

Reinforcement Learning (RL)

Wasserstein Auto-encoded MDPs: Formal Verification of Efficiently Distilled RL Policies with Many-sided Guarantees

1 code implementation • 22 Mar 2023 • Florent Delgrange, Ann Nowé, Guillermo A. Pérez

Our approach yields bisimulation guarantees while learning the distilled policy, allowing concrete optimization of the abstraction and representation model quality.

The Wasserstein Believer: Learning Belief Updates for Partially Observable Environments through Reliable Latent Space Models

no code implementations • 6 Mar 2023 • Raphael Avalos, Florent Delgrange, Ann Nowé, Guillermo A. Pérez, Diederik M. Roijers

Maintaining a probability distribution that models the belief over what the true state is can be used as a sufficient statistic of the history, but its computation requires access to the model of the environment and is often intractable.

Distillation of RL Policies with Formal Guarantees via Variational Abstraction of Markov Decision Processes (Technical Report)

1 code implementation • 17 Dec 2021 • Florent Delgrange, Ann Nowé, Guillermo A. Pérez

Finally, we show how one can use a policy obtained via state-of-the-art RL to efficiently train a variational autoencoder that yields a discrete latent model with provably approximately correct bisimulation guarantees.

Reinforcement Learning (RL)

Simple Strategies in Multi-Objective MDPs (Technical Report)

no code implementations • 24 Oct 2019 • Florent Delgrange, Joost-Pieter Katoen, Tim Quatmann, Mickael Randour

The focus is on simple strategies, that is, strategies that are pure (no randomization) and have bounded memory.

Life is Random, Time is Not: Markov Decision Processes with Window Objectives

no code implementations • 11 Jan 2019 • Thomas Brihaye, Florent Delgrange, Youssouf Oualhadj, Mickael Randour

The window mechanism was introduced by Chatterjee et al. to strengthen classical game objectives with time bounds.
