Search Results for author: Jordi Grau-Moya

Found 17 papers, 2 papers with code

Your Policy Regularizer is Secretly an Adversary

No code implementations · 23 Mar 2022 · Rob Brekelmans, Tim Genewein, Jordi Grau-Moya, Grégoire Delétang, Markus Kunesch, Shane Legg, Pedro Ortega

Policy regularization methods such as maximum entropy regularization are widely used in reinforcement learning to improve the robustness of a learned policy.
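
In the tabular, single-state case, the maximum-entropy-regularized objective max_pi E_pi[Q] + tau * H(pi) has a closed-form softmax solution. The sketch below is a minimal illustration of that closed form (not code from the paper), with `temperature` playing the role of tau:

```python
# Minimal sketch of maximum-entropy policy regularization for a single
# state with tabular Q-values; illustrative only, not the paper's code.
import numpy as np

def max_entropy_policy(q_values, temperature=1.0):
    """Softmax policy: closed-form solution of
    max_pi E_pi[Q] + temperature * H(pi)."""
    logits = q_values / temperature
    logits -= logits.max()          # subtract max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

q = np.array([1.0, 0.5, -0.2])
pi = max_entropy_policy(q, temperature=0.5)
# Soft value tau * log sum_a exp(Q/tau); always upper-bounds max(Q).
soft_v = 0.5 * np.log(np.sum(np.exp(q / 0.5)))
print(pi, soft_v)
```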

Model-Free Risk-Sensitive Reinforcement Learning

No code implementations · 4 Nov 2021 · Grégoire Delétang, Jordi Grau-Moya, Markus Kunesch, Tim Genewein, Rob Brekelmans, Shane Legg, Pedro A. Ortega

Since the Gaussian free energy is known to be a certainty equivalent that is sensitive to both the mean and the variance, the learning rule has applications in risk-sensitive decision-making.

Decision Making · reinforcement-learning
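
For a Gaussian return R ~ N(mu, sigma^2), the free-energy certainty equivalent is F = mu + (beta/2) * sigma^2: it reduces to the mean as beta -> 0 and penalizes variance when beta < 0. The sketch below estimates F online from samples; it is a simplified stand-in with illustrative constants, not the paper's exact stochastic-approximation rule.

```python
# Online estimate of the Gaussian free-energy certainty equivalent
# F = mu + (beta/2) * sigma^2; simplified illustration only.
import numpy as np

rng = np.random.default_rng(0)
beta = -1.0                 # beta < 0: risk-averse, beta > 0: risk-seeking
mu_hat, var_hat, lr = 0.0, 0.0, 0.05

for _ in range(5000):
    r = rng.normal(loc=1.0, scale=2.0)     # noisy return sample
    delta = r - mu_hat
    mu_hat += lr * delta                   # running mean
    var_hat += lr * (delta**2 - var_hat)   # running variance

free_energy = mu_hat + 0.5 * beta * var_hat
print(free_energy)    # approx 1.0 - 2.0 = -1.0 for these parameters
```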

Stochastic Approximation of Gaussian Free Energy for Risk-Sensitive Reinforcement Learning

No code implementations · NeurIPS 2021 · Grégoire Delétang, Jordi Grau-Moya, Markus Kunesch, Tim Genewein, Rob Brekelmans, Shane Legg, Pedro A. Ortega

Since the Gaussian free energy is known to be a certainty equivalent that is sensitive to both the mean and the variance, the learning rule has applications in risk-sensitive decision-making.

Decision Making · reinforcement-learning

Bellman: A Toolbox for Model-Based Reinforcement Learning in TensorFlow

2 code implementations · 26 Mar 2021 · John McLeod, Hrvoje Stojic, Vincent Adam, Dongho Kim, Jordi Grau-Moya, Peter Vrancx, Felix Leibfried

This paves the way for new research directions, e.g., investigating uncertainty-aware environment models that are not necessarily neural-network-based, or developing algorithms to solve industrially motivated benchmarks that share characteristics with real-world problems.

Model-based Reinforcement Learning · reinforcement-learning
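
To make "uncertainty-aware environment model" concrete, here is a hedged sketch of a bootstrap-ensemble dynamics model whose member disagreement serves as an epistemic-uncertainty signal; the class and method names are invented for illustration and are not the Bellman toolbox's actual API.

```python
# Illustrative ensemble dynamics model; names are hypothetical, not the
# Bellman API. Training is omitted for brevity.
import numpy as np

class EnsembleDynamicsModel:
    """Ensemble of simple linear next-state models; the spread across
    members is used as an epistemic-uncertainty estimate."""
    def __init__(self, n_members, state_dim, action_dim, rng):
        self.W = rng.normal(size=(n_members, state_dim + action_dim, state_dim)) * 0.1

    def predict(self, state, action):
        x = np.concatenate([state, action])
        preds = x @ self.W                            # one prediction per member
        return preds.mean(axis=0), preds.std(axis=0)  # mean, uncertainty

rng = np.random.default_rng(0)
model = EnsembleDynamicsModel(n_members=5, state_dim=3, action_dim=1, rng=rng)
mean_next, epistemic_std = model.predict(np.zeros(3), np.array([1.0]))
print(mean_next, epistemic_std)
```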

Mutual-Information Regularization in Markov Decision Processes and Actor-Critic Learning

No code implementations · 11 Sep 2019 · Felix Leibfried, Jordi Grau-Moya

While this form of regularization was initially proposed for Markov Decision Processes (MDPs) in tabular settings, it was recently shown that a similar principle leads to significant improvements over vanilla soft Q-learning (SQL) in high-dimensional RL domains with discrete actions and function approximation.

Q-Learning
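
Under mutual-information regularization, the soft-greedy policy weights actions by a learned prior, and the optimal prior is the policy's marginal under the state distribution. Below is a minimal tabular sketch of that fixed-point iteration; it illustrates the principle, not the paper's actor-critic algorithm.

```python
# Tabular sketch of mutual-information regularization: alternate between the
# soft-greedy policy and updating the prior to the marginal policy.
import numpy as np
from scipy.special import logsumexp

beta = 2.0
Q = np.array([[1.0, 0.2], [0.1, 0.9]])  # Q[s, a], two states and two actions
mu = np.array([0.5, 0.5])               # state distribution
rho = np.array([0.5, 0.5])              # action prior

for _ in range(100):
    # pi(a|s) proportional to rho(a) * exp(beta * Q[s, a])
    logits = np.log(rho)[None, :] + beta * Q
    pi = np.exp(logits - logsumexp(logits, axis=1, keepdims=True))
    rho = mu @ pi                       # optimal prior = marginal policy

# Soft value: V(s) = (1/beta) * log sum_a rho(a) * exp(beta * Q[s, a])
V = logsumexp(np.log(rho)[None, :] + beta * Q, axis=1) / beta
print(rho, V)
```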

Balancing Two-Player Stochastic Games with Soft Q-Learning

No code implementations · 9 Feb 2018 · Jordi Grau-Moya, Felix Leibfried, Haitham Bou-Ammar

In the context of video games, perfectly rational agents can be undesirable, as they lead to uninteresting situations in which human players face overly tough adversarial decision makers.

Q-Learning · reinforcement-learning
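
The balancing knob in such soft Q-learning approaches is an inverse-temperature (rationality) parameter: low values give near-random, forgiving play, while high values give near-perfect play. A minimal illustration, assuming tabular Q-values:

```python
# Softmax policy whose inverse temperature beta sets the agent's strength;
# illustrative values only.
import numpy as np

def soft_policy(q_values, beta):
    """beta -> 0: uniform (weak) play; beta -> inf: greedy (perfect) play."""
    logits = beta * q_values
    logits -= logits.max()              # numerical stability
    p = np.exp(logits)
    return p / p.sum()

q = np.array([2.0, 1.0, 0.0])
print(soft_policy(q, beta=0.1))   # near-uniform: forgiving opponent
print(soft_policy(q, beta=10.0))  # near-greedy: tough opponent
```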

Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes

No code implementations · 7 Apr 2016 · Jordi Grau-Moya, Felix Leibfried, Tim Genewein, Daniel A. Braun

As limit cases, this generalized scheme includes standard value iteration with a known model, Bayesian MDP planning, and robust planning.
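
The limit cases can be seen in a free-energy backup over sampled models, where a single parameter interpolates between the Bayesian average and the robust worst case. A simplified one-step sketch with illustrative values, not the paper's full planning scheme:

```python
# Free-energy aggregation over model hypotheses: alpha -> 0 recovers the
# Bayesian mean, alpha -> -inf recovers robust (worst-case) planning.
import numpy as np
from scipy.special import logsumexp

values = np.array([3.0, 1.0, -2.0])  # value under each sampled model theta
prior = np.array([0.5, 0.3, 0.2])    # posterior weight of each model

def free_energy_value(values, prior, alpha):
    return logsumexp(alpha * values, b=prior) / alpha

print(free_energy_value(values, prior, alpha=1e-6))   # ~ Bayesian mean 1.4
print(free_energy_value(values, prior, alpha=-50.0))  # ~ worst case -2.0
```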

Adaptive information-theoretic bounded rational decision-making with parametric priors

No code implementations · 5 Nov 2015 · Jordi Grau-Moya, Daniel A. Braun

Here we derive a sampling-based alternative update rule for adapting the prior behavior of decision-makers, and show convergence to the optimal prior predicted by rate-distortion theory.

Decision Making
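
A sampling-flavored sketch of the idea: nudge the action prior toward actions actually drawn from the bounded-rational policy, so that it drifts toward the marginal policy that rate-distortion theory predicts as optimal. The utilities and step sizes below are made up for illustration; this is not the paper's exact update rule.

```python
# Sampling-based adaptation of a bounded-rational decision-maker's prior.
import numpy as np

rng = np.random.default_rng(0)
U = np.array([[1.0, 0.0], [0.0, 1.0]])  # utility U[s, a] (illustrative)
beta, eta = 3.0, 0.01                   # rationality, prior learning rate
prior = np.array([0.5, 0.5])

for _ in range(20000):
    s = rng.integers(2)                                # observe a state
    logits = np.log(prior) + beta * U[s]
    pi = np.exp(logits - logits.max()); pi /= pi.sum() # bounded-rational policy
    a = rng.choice(2, p=pi)                            # sample an action
    prior = (1 - eta) * prior + eta * np.eye(2)[a]     # move prior toward sample

print(prior)   # approaches the marginal policy over states
```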

Bounded Rational Decision-Making in Changing Environments

No code implementations · 24 Dec 2013 · Jordi Grau-Moya, Daniel A. Braun

When this requirement is not fulfilled, the decision-maker suffers utility inefficiencies, which arise because the current policy is optimal for a past environment rather than the present one.

Decision Making

A Nonparametric Conjugate Prior Distribution for the Maximizing Argument of a Noisy Function

No code implementations · NeurIPS 2012 · Pedro Ortega, Jordi Grau-Moya, Tim Genewein, David Balduzzi, Daniel Braun

We propose a novel Bayesian approach to solve stochastic optimization problems that involve finding extrema of noisy, nonlinear functions.

Stochastic Optimization
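
As a rough, simplified stand-in for the idea (not the paper's nonparametric conjugate prior), one can keep a posterior belief about where a noisy function peaks and query points sampled from that belief, Thompson-style, over a discrete candidate grid:

```python
# Thompson-style search for the maximizer of a noisy function; a simplified
# illustration, not the paper's method.
import numpy as np

rng = np.random.default_rng(0)
xs = np.linspace(-2, 2, 21)                # candidate inputs
f = lambda x: -x**2 + rng.normal(0, 0.5)   # noisy objective, true max at x = 0

mean, count = np.zeros_like(xs), np.ones_like(xs)
for _ in range(500):
    draws = rng.normal(mean, 1.0 / np.sqrt(count))  # sample from beliefs
    i = int(np.argmax(draws))                       # believed maximizer
    y = f(xs[i])
    count[i] += 1
    mean[i] += (y - mean[i]) / count[i]             # update posterior mean

print(xs[np.argmax(mean)])   # should be close to 0
```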
