no code implementations • 28 May 2022 • Remo Sasso, Matthia Sabatelli, Marco A. Wiering
A crucial challenge in reinforcement learning is to reduce the number of interactions with the environment that an agent requires to master a given task.
no code implementations • 14 Aug 2021 • Remo Sasso, Matthia Sabatelli, Marco A. Wiering
Reinforcement learning (RL) is well known for requiring large amounts of data in order for RL agents to learn to perform complex tasks.
no code implementations • 25 Oct 2020 • Hamid Radmard Rahmani, Carsten Koenke, Marco A. Wiering
In many reinforcement learning (RL) problems, an action taken by the agent needs some time to reach its maximum effect on the environment; consequently, the agent receives the reward corresponding to that action after a delay, called the action-effect delay.
1 code implementation • 15 Jan 2020 • Mario S. Holubar, Marco A. Wiering
Different versions of two actor-critic learning algorithms are tested on this environment: Sampled Policy Gradient (SPG) and Proximal Policy Optimization (PPO).
3 code implementations • 1 Sep 2019 • Matthia Sabatelli, Gilles Louppe, Pierre Geurts, Marco A. Wiering
This paper makes one step forward towards characterizing a new family of model-free Deep Reinforcement Learning (DRL) algorithms.
3 code implementations • 30 Sep 2018 • Matthia Sabatelli, Gilles Louppe, Pierre Geurts, Marco A. Wiering
We introduce a novel Deep Reinforcement Learning (DRL) algorithm called Deep Quality-Value (DQV) Learning.
2 code implementations • 15 Sep 2018 • Anton Orell Wiehe, Nil Stolt Ansó, Madalina M. Drugan, Marco A. Wiering
In this paper, a new offline actor-critic learning algorithm is introduced: Sampled Policy Gradient (SPG).