Search Results for author: Pedro Ortega

Found 5 papers, 1 paper with code

Causal Reasoning from Meta-reinforcement Learning

1 code implementation • ICLR 2019 • Ishita Dasgupta, Jane Wang, Silvia Chiappa, Jovana Mitrovic, Pedro Ortega, David Raposo, Edward Hughes, Peter Battaglia, Matthew Botvinick, Zeb Kurth-Nelson

Discovering and exploiting the causal structure in the environment is a crucial challenge for intelligent agents.

Counterfactual, Meta Reinforcement Learning, +2 more

A Nonparametric Conjugate Prior Distribution for the Maximizing Argument of a Noisy Function

no code implementations • NeurIPS 2012 • Pedro Ortega, Jordi Grau-Moya, Tim Genewein, David Balduzzi, Daniel Braun

We propose a novel Bayesian approach to solve stochastic optimization problems that involve finding extrema of noisy, nonlinear functions.

Stochastic Optimization

From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization

no code implementations • 19 Feb 2020 • Julien Perolat, Remi Munos, Jean-Baptiste Lespiau, Shayegan Omidshafiei, Mark Rowland, Pedro Ortega, Neil Burch, Thomas Anthony, David Balduzzi, Bart De Vylder, Georgios Piliouras, Marc Lanctot, Karl Tuyls

In this paper we investigate the Follow the Regularized Leader dynamics in sequential imperfect information games (IIG).

Your Policy Regularizer is Secretly an Adversary

no code implementations • 23 Mar 2022 • Rob Brekelmans, Tim Genewein, Jordi Grau-Moya, Grégoire Delétang, Markus Kunesch, Shane Legg, Pedro Ortega

Policy regularization methods such as maximum entropy regularization are widely used in reinforcement learning to improve the robustness of a learned policy.

Beyond Bayes-optimality: meta-learning what you know you don't know

no code implementations • 30 Sep 2022 • Jordi Grau-Moya, Grégoire Delétang, Markus Kunesch, Tim Genewein, Elliot Catt, Kevin Li, Anian Ruoss, Chris Cundy, Joel Veness, Jane Wang, Marcus Hutter, Christopher Summerfield, Shane Legg, Pedro Ortega

This is in contrast to risk-sensitive agents, which additionally exploit the higher-order moments of the return, and ambiguity-sensitive agents, which act differently when recognizing situations in which they lack knowledge.

Decision Making, Meta-Learning
