Search Results for author: Olivier Buffet

Found 18 papers, 0 papers with code

Optimally Solving Two-Agent Decentralized POMDPs Under One-Sided Information Sharing

no code implementations ICML 2020 Yuxuan Xie, Jilles Dibangoye, Olivier Buffet

Optimally solving decentralized partially observable Markov decision processes under either full or no information sharing has received significant attention in recent years.

Vocal Bursts Valence Prediction

How to Exhibit More Predictable Behaviors

no code implementations17 Apr 2024 Salomé Lepers, Sophie Lemonnier, Vincent Thomas, Olivier Buffet

This paper looks at predictability problems, i.e., problems wherein an agent must choose its strategy so as to optimize the predictions that an external observer could make.

Solving Hierarchical Information-Sharing Dec-POMDPs: An Extensive-Form Game Approach

no code implementations5 Feb 2024 Johan Peralez, Aurélien Delage, Olivier Buffet, Jilles S. Dibangoye

A recent theory shows that a multi-player decentralized partially observable Markov decision process can be transformed into an equivalent single-player game, enabling the application of Bellman's principle of optimality to solve that game by breaking it down into single-stage subgames.

Management

Monte-Carlo Search for an Equilibrium in Dec-POMDPs

no code implementations19 May 2023 Yang You, Vincent Thomas, Francis Colas, Olivier Buffet

Decentralized partially observable Markov decision processes (Dec-POMDPs) formalize the problem of designing individual controllers for a group of collaborative agents under stochastic dynamics and partial observability.
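
As a rough illustration of the formalism referred to here (the Dec-POMDP model itself, not the paper's Monte-Carlo search), below is a minimal sketch of the Dec-POMDP tuple as a Python container; the field names and comments are assumptions, not taken from the paper.

    from dataclasses import dataclass
    from typing import Dict, List, Tuple

    @dataclass
    class DecPOMDP:
        """Hypothetical container for a Dec-POMDP tuple <S, {A_i}, T, R, {Z_i}, O>."""
        states: List[str]                          # hidden states S
        actions: Dict[str, List[str]]              # per-agent action sets A_i
        observations: Dict[str, List[str]]         # per-agent observation sets Z_i
        transition: Dict[Tuple, Dict[str, float]]  # T(s' | s, joint action)
        reward: Dict[Tuple, float]                 # shared reward R(s, joint action)
        obs_fn: Dict[Tuple, Dict[Tuple, float]]    # O(joint obs | joint action, s')
        discount: float = 0.95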

Robust Robot Planning for Human-Robot Collaboration

no code implementations27 Feb 2023 Yang You, Vincent Thomas, Francis Colas, Rachid Alami, Olivier Buffet

Based on this, we propose two contributions: 1) an approach to automatically generate an uncertain human behavior (a policy) for each given objective function while accounting for possible robot behaviors; and 2) a robot planning algorithm that is robust to the above-mentioned uncertainties and relies on solving a partially observable Markov decision process (POMDP) obtained by reasoning on a distribution over human behaviors.
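
The following toy sketch illustrates only the general idea of treating the unknown human behavior as a hidden variable (it is not the authors' algorithm, and all policy names and probabilities are invented): the robot keeps a belief over a few candidate human policies and updates it with Bayes' rule from observed human actions.

    import numpy as np

    # Hypothetical candidate human behaviors: probability of each human action
    # ("help", "wait", "leave") under each candidate policy.
    human_policies = {
        "cooperative": np.array([0.7, 0.2, 0.1]),
        "passive":     np.array([0.1, 0.8, 0.1]),
        "hurried":     np.array([0.2, 0.2, 0.6]),
    }
    actions = ["help", "wait", "leave"]
    belief = {name: 1.0 / len(human_policies) for name in human_policies}  # uniform prior

    def update_belief(belief, observed_action):
        """Bayes update of the belief over human behaviors after one observation."""
        idx = actions.index(observed_action)
        posterior = {name: b * human_policies[name][idx] for name, b in belief.items()}
        z = sum(posterior.values())
        return {name: p / z for name, p in posterior.items()}

    belief = update_belief(belief, "help")
    print(belief)  # probability mass shifts toward the "cooperative" behavior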

HSVI can solve zero-sum Partially Observable Stochastic Games

no code implementations26 Oct 2022 Aurélien Delage, Olivier Buffet, Jilles S. Dibangoye, Abdallah Saffidine

State-of-the-art methods for solving 2-player zero-sum imperfect-information games rely on linear programming or regret minimization, but not on dynamic programming (DP) or heuristic search (HS), even though the latter are often at the core of state-of-the-art solvers for other sequential decision-making problems.

Decision Making · Open-Ended Question Answering
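
As background on the kind of game being solved (this is standard game theory, not the paper's HSVI contribution), the value of a finite two-player zero-sum game with payoff matrix A is the minimax value that linear-programming methods compute directly:

    V \;=\; \max_{x \in \Delta(A_1)} \min_{y \in \Delta(A_2)} x^\top A\, y
      \;=\; \min_{y \in \Delta(A_2)} \max_{x \in \Delta(A_1)} x^\top A\, y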

HSVI for zs-POSGs using Concavity, Convexity and Lipschitz Properties

no code implementations25 Oct 2021 Aurélien Delage, Olivier Buffet, Jilles Dibangoye

Dynamic programming and heuristic search are at the core of state-of-the-art solvers for sequential decision-making problems.

Decision Making

Solving infinite-horizon Dec-POMDPs using Finite State Controllers within JESP

no code implementations17 Sep 2021 Yang You, Vincent Thomas, Francis Colas, Olivier Buffet

This paper looks at solving collaborative planning problems formalized as Decentralized POMDPs (Dec-POMDPs) by searching for Nash equilibria, i.e., situations where each agent's policy is a best response to the other agents' (fixed) policies.
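
As a caricature of the best-response idea behind such equilibrium search (the real JESP-style machinery works on beliefs and policies represented as finite-state controllers, and the payoff numbers below are invented), here is a tiny common-payoff example where two agents alternate exact best responses until neither can improve:

    import numpy as np

    # Shared payoff: entry [i, j] is the common reward when agent 0 picks
    # deterministic policy i and agent 1 picks deterministic policy j.
    payoff = np.array([[3.0, 0.0],
                       [1.0, 2.0]])

    policies = [1, 1]  # arbitrary initial joint policy
    changed = True
    while changed:          # alternate best responses until a fixed point
        changed = False
        for agent in (0, 1):
            other = policies[1 - agent]
            values = payoff[:, other] if agent == 0 else payoff[other, :]
            best = int(np.argmax(values))
            if best != policies[agent]:
                policies[agent] = best
                changed = True

    # The fixed point is a Nash equilibrium: each policy is a best response to
    # the other agent's fixed policy. Here it is [1, 1] with value 2.0, which is
    # an equilibrium but not the globally optimal joint policy [0, 0].
    print(policies, payoff[policies[0], policies[1]])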

Monte Carlo Information-Oriented Planning

no code implementations21 Mar 2021 Vincent Thomas, Gérémy Hutin, Olivier Buffet

In this article, we discuss how to solve information-gathering problems expressed as ρ-POMDPs, an extension of Partially Observable Markov Decision Processes (POMDPs) whose reward ρ depends on the belief state.
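
A classic example of a belief-dependent reward of the kind ρ-POMDPs allow, used here purely as an illustration rather than as the paper's specific choice, is the negative entropy of the belief, which rewards the agent for reducing its uncertainty about the hidden state:

    import numpy as np

    def rho(belief, eps=1e-12):
        """Belief-dependent reward: negative Shannon entropy of the belief.

        Close to 0 when the belief is concentrated on a single state, strongly
        negative when it is spread out, so maximizing it drives information
        gathering.
        """
        b = np.asarray(belief, dtype=float)
        return float(np.sum(b * np.log(b + eps)))

    print(rho([0.5, 0.5]))    # ~ -0.69 (maximal uncertainty over two states)
    print(rho([0.99, 0.01]))  # ~ -0.06 (nearly certain)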

On Bellman's Optimality Principle for zs-POSGs

no code implementations29 Jun 2020 Olivier Buffet, Jilles Dibangoye, Aurélien Delage, Abdallah Saffidine, Vincent Thomas

Many non-trivial sequential decision-making problems are efficiently solved by relying on Bellman's optimality principle, i.e., exploiting the fact that sub-problems are nested recursively within the original problem.

Decision Making
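
For reference, the optimality principle in question is the one behind the familiar Bellman equation for fully observable MDPs, in which the value of a problem is expressed through the values of its nested sub-problems (standard textbook notation, not specific to the paper); whether and in what form it carries over to zs-POSGs is precisely what the paper studies:

    V^*(s) \;=\; \max_{a \in A} \Big[ r(s,a) + \gamma \sum_{s'} P(s' \mid s, a)\, V^*(s') \Big]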

Reinforcement Learning

no code implementations29 May 2020 Olivier Buffet, Olivier Pietquin, Paul Weng

Reinforcement learning (RL) is a general framework for adaptive control, which has proven to be efficient in many domains, e.g., board games, video games, or autonomous vehicles.

Autonomous Vehicles · Board Games +3
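
As a minimal, generic illustration of the framework surveyed here (not tied to any particular domain mentioned in the text, and with hypothetical parameter values), tabular Q-learning boils down to the following update and exploration rules:

    import random
    from collections import defaultdict

    alpha, gamma, epsilon = 0.1, 0.95, 0.1   # learning rate, discount, exploration rate
    Q = defaultdict(float)                   # Q[(state, action)] -> estimated return

    def q_update(state, action, reward, next_state, actions):
        """One Q-learning step: move Q(s, a) toward the bootstrapped target."""
        best_next = max(Q[(next_state, a)] for a in actions)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

    def epsilon_greedy(state, actions):
        """Explore with probability epsilon, otherwise act greedily on Q."""
        if random.random() < epsilon:
            return random.choice(actions)
        return max(actions, key=lambda a: Q[(state, a)])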

rho-POMDPs have Lipschitz-Continuous epsilon-Optimal Value Functions

no code implementations NeurIPS 2018 Mathieu Fehr, Olivier Buffet, Vincent Thomas, Jilles Dibangoye

In this paper, we focus on POMDPs and ρ-POMDPs with λ_ρ-Lipschitz reward functions, and demonstrate that, for finite horizons, the optimal value function is Lipschitz-continuous.
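
Stated informally, and with constants named here only for illustration, the assumption and the resulting property read: if the reward is λ_ρ-Lipschitz in the belief, then each finite-horizon optimal value function is Lipschitz as well, for some horizon-dependent constant λ_t:

    |\rho(b) - \rho(b')| \le \lambda_\rho \, \|b - b'\|_1
    \quad\Longrightarrow\quad
    |V^*_t(b) - V^*_t(b')| \le \lambda_t \, \|b - b'\|_1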

Learning to Act in Decentralized Partially Observable MDPs

no code implementations ICML 2018 Jilles Dibangoye, Olivier Buffet

We address a long-standing open problem of reinforcement learning in decentralized partially observable Markov decision processes.

Multi-agent Reinforcement Learning · Reinforcement Learning +1

POMDPs Make Better Hackers: Accounting for Uncertainty in Penetration Testing

no code implementations31 Jul 2013 Carlos Sarraute, Olivier Buffet, Joerg Hoffmann

Penetration Testing is a methodology for assessing network security, by generating and executing possible hacking attacks.

Les POMDP font de meilleurs hackers: Tenir compte de l'incertitude dans les tests de penetration (in French; English version: "POMDPs Make Better Hackers: Accounting for Uncertainty in Penetration Testing")

no code implementations30 Jul 2013 Carlos Sarraute, Olivier Buffet, Joerg Hoffmann

Penetration Testing is a methodology for assessing network security, by generating and executing possible hacking attacks.

Penetration Testing == POMDP Solving?

no code implementations19 Jun 2013 Carlos Sarraute, Olivier Buffet, Joerg Hoffmann

Penetration Testing is a methodology for assessing network security, by generating and executing possible attacks.
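
A toy illustration of the "penetration testing as POMDP solving" view (all probabilities are invented and this is not a model from the papers): the true configuration of the target host is the hidden state, a noisy scan is an observation, and Bayes' rule updates the belief that an exploit will succeed.

    # Hidden state: is the target host running the vulnerable service version?
    p_vulnerable = 0.4                 # hypothetical prior belief
    P_SCAN_POSITIVE = {True: 0.9,      # P(scan says "vulnerable" | truly vulnerable)
                       False: 0.2}     # P(scan says "vulnerable" | not vulnerable)

    def update_after_scan(prior, scan_positive):
        """Bayes update of the belief that the host is vulnerable after one scan."""
        like_vuln = P_SCAN_POSITIVE[True] if scan_positive else 1 - P_SCAN_POSITIVE[True]
        like_safe = P_SCAN_POSITIVE[False] if scan_positive else 1 - P_SCAN_POSITIVE[False]
        evidence = like_vuln * prior + like_safe * (1 - prior)
        return like_vuln * prior / evidence

    p_vulnerable = update_after_scan(p_vulnerable, scan_positive=True)
    print(p_vulnerable)  # ~0.75: scanning first makes the exploit decision better informed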

A POMDP Extension with Belief-dependent Rewards

no code implementations NeurIPS 2010 Mauricio Araya, Olivier Buffet, Vincent Thomas, François Charpillet

Partially Observable Markov Decision Processes (POMDPs) model sequential decision-making problems under uncertainty and partial observability.

Decision Making
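
For context, the belief state that such a belief-dependent reward is evaluated on is the standard Bayes filter over hidden states (textbook POMDP notation, not specific to this paper): after taking action a in belief b and observing o,

    b'(s') \;=\; \frac{O(o \mid s', a) \sum_{s} T(s' \mid s, a)\, b(s)}{\Pr(o \mid b, a)}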
