Search Results for author: Guillermo A. Pérez

Found 16 papers, 3 papers with code

Distillation of RL Policies with Formal Guarantees via Variational Abstraction of Markov Decision Processes (Technical Report)

1 code implementation • 17 Dec 2021 • Florent Delgrange, Ann Nowé, Guillermo A. Pérez

Finally, we show how one can use a policy obtained via state-of-the-art RL to efficiently train a variational autoencoder that yields a discrete latent model with provably approximately correct bisimulation guarantees.

Reinforcement Learning (RL)

Paper
Code

Wasserstein Auto-encoded MDPs: Formal Verification of Efficiently Distilled RL Policies with Many-sided Guarantees

1 code implementation • 22 Mar 2023 • Florent Delgrange, Ann Nowé, Guillermo A. Pérez

Our approach yields bisimulation guarantees while learning the distilled policy, allowing concrete optimization of the abstraction and representation model quality.

Paper
Code

Optimizing Expectation with Guarantees in POMDPs (Technical Report)

1 code implementation • 26 Nov 2016 • Krishnendu Chatterjee, Petr Novotný, Guillermo A. Pérez, Jean-François Raskin, Đorđe Žikelić

In this work we go beyond both the "expectation" and "threshold" approaches and consider a "guaranteed payoff optimization (GPO)" problem for POMDPs, where we are given a threshold $t$ and the objective is to find a policy $\sigma$ such that a) each possible outcome of $\sigma$ yields a discounted-sum payoff of at least $t$, and b) the expected discounted-sum payoff of $\sigma$ is optimal (or near-optimal) among all policies satisfying a).

Paper
Code

Learning-Based Mean-Payoff Optimization in an Unknown MDP under Omega-Regular Constraints

no code implementations • 24 Apr 2018 • Jan Křetínský, Guillermo A. Pérez, Jean-François Raskin

Assuming the support of the unknown transition function and a lower bound on the minimal transition probability are known in advance, we show that in MDPs consisting of a single end component, two combinations of guarantees on the parity and mean-payoff objectives can be achieved depending on how much memory one is willing to use.

Paper
Add Code

On the Complexity of Value Iteration

no code implementations • 13 Jul 2018 • Nikhil Balaji, Stefan Kiefer, Petr Novotný, Guillermo A. Pérez, Mahsa Shirmohammadi

We show that, given a horizon $n$ in binary and an MDP, computing an optimal policy is EXP-complete, thus resolving an open problem that goes back to the seminal 1987 paper on the complexity of MDPs by Papadimitriou and Tsitsiklis.

Paper
Add Code

The Impatient May Use Limited Optimism to Minimize Regret

no code implementations • 17 Nov 2018 • Michaël Cadilhac, Guillermo A. Pérez, Marie van den Bogaard

Discounted-sum games provide a formal model for the study of reinforcement learning, where the agent is enticed to get rewards early since later rewards are discounted.

Paper
Add Code

Let's Agree to Degree: Comparing Graph Convolutional Networks in the Message-Passing Framework

no code implementations • 6 Apr 2020 • Floris Geerts, Filip Mazowiecki, Guillermo A. Pérez

In this paper we cast neural networks defined on graphs as message-passing neural networks (MPNNs) in order to study the distinguishing power of different classes of such models.

Paper
Add Code

Robustness Verification for Classifier Ensembles

no code implementations • 12 May 2020 • Dennis Gross, Nils Jansen, Guillermo A. Pérez, Stephan Raaijmakers

The robustness-checking problem consists of assessing, given a set of classifiers and a labelled data set, whether there exists a randomized attack that induces a certain expected loss against all classifiers.

Image Classification

Paper
Add Code

Safe Learning for Near Optimal Scheduling

no code implementations • 19 May 2020 • Damien Busatto-Gaston, Debraj Chakraborty, Shibashis Guha, Guillermo A. Pérez, Jean-François Raskin

In this paper, we investigate the combination of synthesis, model-based learning, and online sampling techniques to obtain safe and near-optimal schedulers for a preemptible task scheduling problem.

Q-Learning Scheduling

Paper
Add Code

Continuous One-Counter Automata

no code implementations • 28 Jan 2021 • Michael Blondin, Tim Leys, Filip Mazowiecki, Philip Oftermatt, Guillermo A. Pérez

Our three main results are as follows: (1) We prove that the reachability problem for COCA with global upper and lower bound tests is in NC2; (2) that, in general, the problem is decidable in polynomial time; and (3) that it is decidable in the polynomial hierarchy for COCA with parametric counter updates and bound tests.

Formal Languages and Automata Theory Logic in Computer Science

Paper
Add Code

Revisiting Parameter Synthesis for One-Counter Automata

no code implementations • 3 May 2020 • Guillermo A. Pérez, Ritam Raha

We study the (parameter) synthesis problem for one-counter automata with parameters.

Logic in Computer Science Formal Languages and Automata Theory

Paper
Add Code

The Wasserstein Believer: Learning Belief Updates for Partially Observable Environments through Reliable Latent Space Models

no code implementations • 6 Mar 2023 • Raphael Avalos, Florent Delgrange, Ann Nowé, Guillermo A. Pérez, Diederik M. Roijers

Maintaining a probability distribution that models the belief over what the true state is can be used as a sufficient statistic of the history, but its computation requires access to the model of the environment and is often intractable.

Paper
Add Code

Graph-Based Reductions for Parametric and Weighted MDPs

no code implementations • 9 May 2023 • Kasper Engelen, Guillermo A. Pérez, Shrisha Rao

In terms of computational complexity, we establish that determining whether p is never worse than q is coETR-complete.

Paper
Add Code

Formally-Sharp DAgger for MCTS: Lower-Latency Monte Carlo Tree Search using Data Aggregation with Formal Methods

no code implementations • 15 Aug 2023 • Debraj Chakraborty, Damien Busatto-Gaston, Jean-François Raskin, Guillermo A. Pérez

In particular, we use model-checking techniques to guide the MCTS algorithm in order to generate offline samples of high-quality decisions on a representative set of states of the MDP.

Paper
Add Code

Synthesis of Hierarchical Controllers Based on Deep Reinforcement Learning Policies

no code implementations • 21 Feb 2024 • Florent Delgrange, Guy Avni, Anna Lukina, Christian Schilling, Ann Nowé, Guillermo A. Pérez

We propose a novel approach to the problem of controller design for environments modeled as Markov decision processes (MDPs).

reinforcement-learning

Paper
Add Code

Active Learning of Mealy Machines with Timers

no code implementations • 4 Mar 2024 • Véronique Bruyère, Bharat Garhewal, Guillermo A. Pérez, Gaëtan Staquet, Frits W. Vaandrager

However, whereas Waga needs exponentially many concrete queries to implement a single symbolic query, we only need a polynomial number.

Active Learning

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.