Search Results for author: Paulo Rauber

Found 4 papers, 3 papers with code

Posterior Sampling for Deep Reinforcement Learning

1 code implementation · 30 Apr 2023 · Remo Sasso, Michelangelo Conserva, Paulo Rauber

Despite remarkable successes, deep reinforcement learning algorithms remain sample inefficient: they require an enormous amount of trial and error to find good policies.

Computational Efficiency · Model-based Reinforcement Learning · +2

Hardness in Markov Decision Processes: Theory and Practice

no code implementations · 24 Oct 2022 · Michelangelo Conserva, Paulo Rauber

Second, we introduce Colosseum, a pioneering package that enables empirical hardness analysis and implements a principled benchmark composed of environments that are diverse with respect to different measures of hardness.

reinforcement-learning · Reinforcement Learning (RL)

Recurrent Neural-Linear Posterior Sampling for Nonstationary Contextual Bandits

1 code implementation · 9 Jul 2020 · Aditya Ramesh, Paulo Rauber, Michelangelo Conserva, Jürgen Schmidhuber

An agent in a nonstationary contextual bandit problem should balance between exploration and the exploitation of (periodic or structured) patterns present in its previous experiences.

Multi-Armed Bandits

Hindsight policy gradients

1 code implementation · ICLR 2019 · Paulo Rauber, Avinash Ummadisingu, Filipe Mutz, Juergen Schmidhuber

A reinforcement learning agent that needs to pursue different goals across episodes requires a goal-conditional policy.

Policy Gradient Methods · reinforcement-learning · +1
