1 code implementation • 29 Oct 2024 • Koen Ponse, Aske Plaat, Niki van Stein, Thomas M. Moerland
Accurate economic simulations often require many experimental runs, particularly when combined with reinforcement learning.
no code implementations • 19 Aug 2024 • Zhao Yang, Thomas M. Moerland, Mike Preuss, Aske Plaat, Edward S. Hu
However, the training process of RL is far from automatic, requiring extensive human effort to reset the agent and environments.
no code implementations • 26 Jul 2024 • Koen Ponse, Felix Kleuker, Márton Fejér, Álvaro Serra-Gómez, Aske Plaat, Thomas Moerland
Afterwards, we zoom out and identify overarching reinforcement learning themes that appear throughout sustainability, such as multi-agent, offline, and safe reinforcement learning.
no code implementations • 16 Jul 2024 • Aske Plaat, Annie Wong, Suzan Verberne, Joost Broekens, Niki van Stein, Thomas Back
The field started with the question of whether LLMs can solve grade-school math word problems.
1 code implementation • 21 May 2024 • Andreas W M Sauter, Erman Acar, Aske Plaat
CausalPlayground offers fine-grained control over SCMs, interventions, and the generation of datasets of SCMs for learning and quantitative research.
no code implementations • 20 Mar 2024 • Aske Plaat
This study takes a closer look at a depth-first algorithm, AB, and a best-first algorithm, SSS.
no code implementations • 11 Mar 2024 • Michiel van der Meer, Enrico Liscio, Catholijn M. Jonker, Aske Plaat, Piek Vossen, Pradeep K. Murukannaiah
We find that, on the one hand, HyEnA achieves higher coverage and precision than a state-of-the-art automated method when compared to a common set of diverse opinions, justifying the need for human insight.
1 code implementation • 10 Feb 2024 • Annie Wong, Jacob de Nobel, Thomas Bäck, Aske Plaat, Anna V. Kononova
We benchmark both deep policy networks and networks consisting of a single linear layer from observations to actions for three gradient-based methods, including Proximal Policy Optimization.
1 code implementation • 30 Jan 2024 • Andreas W. M. Sauter, Nicolò Botteghi, Erman Acar, Aske Plaat
Causal discovery is the challenging task of inferring causal structure from data.
1 code implementation • 18 Jan 2024 • Riccardo Majellaro, Jonathan Collu, Aske Plaat, Thomas M. Moerland
Extracting structured representations from raw visual data is an important and long-standing challenge in machine learning.
1 code implementation • 17 Nov 2023 • Thomas M. Moerland, Matthias Müller-Brockhausen, Zhao Yang, Andrius Bernatavicius, Koen Ponse, Tom Kouwenhoven, Andreas Sauter, Michiel van der Meer, Bram Renting, Aske Plaat
To solve this issue we introduce EduGym, a set of educational reinforcement learning environments and associated interactive notebooks tailored for education.
1 code implementation • 22 Oct 2023 • Mike Huisman, Thomas M. Moerland, Aske Plaat, Jan N. van Rijn
Meta-learning overcomes this limitation by learning how to learn.
1 code implementation • 13 Oct 2023 • Mike Huisman, Aske Plaat, Jan N. van Rijn
Gradient-based meta-learning techniques aim to distill useful prior knowledge from a set of training tasks such that new tasks can be learned more efficiently with gradient descent.
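The idea can be sketched with a toy first-order MAML-style loop (the 1-D regression tasks and all names here are illustrative, not the paper's setup): meta-train an initialization so that a single gradient step adapts well to a freshly sampled task.

```python
import numpy as np

def task_loss_grad(w, a, xs):
    """Gradient of the MSE for the scalar model y = w*x on a task y = a*x."""
    return 2 * np.mean((w * xs - a * xs) * xs)

def fomaml(num_iters=200, inner_lr=0.1, outer_lr=0.05, seed=0):
    """First-order MAML sketch: meta-learn an initialization w0 such
    that one inner gradient step adapts well to a new task y = a*x."""
    rng = np.random.default_rng(seed)
    w0 = 0.0
    xs = np.linspace(-1, 1, 20)
    for _ in range(num_iters):
        a = rng.uniform(1.0, 3.0)                       # sample a training task
        w_adapted = w0 - inner_lr * task_loss_grad(w0, a, xs)
        # first-order outer update: gradient evaluated at the adapted weights
        w0 -= outer_lr * task_loss_grad(w_adapted, a, xs)
    return w0

w0 = fomaml()
```

With task slopes drawn uniformly from [1, 3], the meta-learned initialization settles near the task mean, from which one gradient step reaches any individual task quickly.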
1 code implementation • 9 Oct 2023 • Mike Huisman, Aske Plaat, Jan N. van Rijn
Whilst meta-learning techniques have been observed to be successful at this in various scenarios, recent results suggest that, when evaluated on tasks from a different data distribution than the one used for training, a baseline that simply finetunes a pre-trained network may be more effective than more complicated meta-learning techniques such as MAML, one of the most popular.
no code implementations • 4 Sep 2023 • Joost Broekens, Bernhard Hilpert, Suzan Verberne, Kim Baraka, Patrick Gebhard, Aske Plaat
Large language models, in particular generative pre-trained transformers (GPTs), show impressive results on a wide variety of language-related tasks.
no code implementations • 20 Apr 2023 • Zhao Yang, Thomas M. Moerland, Mike Preuss, Aske Plaat
While deep reinforcement learning has shown important empirical success, it tends to learn relatively slowly due to slow propagation of reward information and slow updates of parametric neural networks.
no code implementations • 6 Dec 2022 • Zhao Yang, Thomas M. Moerland, Mike Preuss, Aske Plaat
In this paper, we present a clear ablation study of post-exploration within a general intrinsically motivated goal exploration process (IMGEP) framework, which the Go-Explore paper did not provide.
no code implementations • 28 Nov 2022 • Zhao Yang, Thomas M. Moerland, Mike Preuss, Aske Plaat
Therefore, this paper introduces Continuous Episodic Control (CEC), a novel non-parametric episodic memory algorithm for sequential decision making in problems with a continuous action space.
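As a rough illustration of non-parametric episodic control in general (a generic sketch, not the CEC algorithm itself), a memory of (state, action, return) tuples can act by nearest-neighbour lookup:

```python
import numpy as np

class EpisodicMemory:
    """Generic non-parametric episodic control sketch: store
    (state, action, return) tuples and act via nearest neighbours."""
    def __init__(self, k=3):
        self.states, self.actions, self.returns = [], [], []
        self.k = k

    def add(self, state, action, ret):
        self.states.append(np.asarray(state, float))
        self.actions.append(np.asarray(action, float))
        self.returns.append(float(ret))

    def act(self, state):
        """Return the stored action of the highest-return entry
        among the k nearest neighbours of the query state."""
        q = np.asarray(state, float)
        dists = [np.linalg.norm(s - q) for s in self.states]
        nearest = np.argsort(dists)[: self.k]
        best = max(nearest, key=lambda i: self.returns[i])
        return self.actions[best]

mem = EpisodicMemory(k=2)
mem.add([0.0], [1.0], ret=0.0)
mem.add([0.1], [2.0], ret=10.0)
mem.add([5.0], [3.0], ret=99.0)
action = mem.act([0.05])        # neighbours are [0.0] and [0.1]
```

Because actions are stored and retrieved directly, this works for continuous action spaces without any parametric function approximator.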
no code implementations • 29 Mar 2022 • Zhao Yang, Thomas M. Moerland, Mike Preuss, Aske Plaat
Go-Explore achieved breakthrough performance on challenging reinforcement learning (RL) tasks with sparse rewards.
1 code implementation • 7 Mar 2022 • Joery A. de Vries, Thomas M. Moerland, Aske Plaat
To improve our fundamental understanding of HRL, we investigate hierarchical credit assignment from the perspective of conventional multistep reinforcement learning.
no code implementations • 2 Mar 2022 • Matthias Müller-Brockhausen, Aske Plaat, Mike Preuss
Reinforcement Learning (RL) is one of the most dynamic research areas in Game AI and AI as a whole, and a wide variety of games are used as its prominent test problems.
no code implementations • 4 Jan 2022 • Aske Plaat
The aim of this book is to provide a comprehensive overview of the field of deep reinforcement learning.
no code implementations • 10 Sep 2021 • Zhao Yang, Mike Preuss, Aske Plaat
While previous work has investigated the use of expert knowledge to generate potential functions, in this work we study whether we can use a search algorithm (A*) to automatically generate a potential function for reward shaping in Sokoban, a well-known planning task.
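Potential-based shaping adds F(s, s') = γΦ(s') − Φ(s) to the environment reward, which leaves the optimal policy unchanged (Ng et al., 1999). A minimal sketch, with hypothetical search-derived distances standing in for the automatically generated potential:

```python
GAMMA = 0.99

def shaped_reward(r, phi_s, phi_next, gamma=GAMMA):
    """Potential-based shaping (Ng et al., 1999):
    F(s, s') = gamma * phi(s') - phi(s), added to the env reward.
    Policy-invariant for any potential function phi."""
    return r + gamma * phi_next - phi_s

# Illustrative potential: negative of a search-derived distance-to-goal,
# e.g. the cost of an A* plan from each state (hypothetical values here).
astar_distance = {"start": 5, "mid": 2, "goal": 0}
phi = lambda s: -astar_distance[s]

r_step = shaped_reward(0.0, phi("start"), phi("mid"))  # progress earns a bonus
```

Moving from a state 5 steps from the goal to one 2 steps away yields a positive shaped reward even when the environment reward is zero, densifying the sparse signal.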
no code implementations • 17 Jul 2021 • Aske Plaat, Walter Kosters, Mike Preuss
Deep reinforcement learning has shown remarkable success in the past few years.
no code implementations • 29 Jun 2021 • Annie Wong, Thomas Bäck, Anna V. Kononova, Aske Plaat
This paper surveys the field of deep multiagent reinforcement learning.
no code implementations • 31 May 2021 • Matthias Müller-Brockhausen, Mike Preuss, Aske Plaat
We note a surprisingly late adoption of deep learning that starts in 2018.
no code implementations • 25 May 2021 • Zhao Yang, Mike Preuss, Aske Plaat
In reinforcement learning, learning actions for a behavior policy that can be applied to new environments is still a challenge, especially for tasks that involve extensive planning.
no code implementations • 13 May 2021 • Hui Wang, Mike Preuss, Aske Plaat
AlphaZero has achieved impressive performance in deep reinforcement learning by utilizing an architecture that combines search and training of a neural network in self-play.
1 code implementation • 21 Apr 2021 • Mike Huisman, Aske Plaat, Jan N. van Rijn
Deep learning typically requires large data sets and much compute power for each new problem that is learned.
1 code implementation • ICML Workshop URL 2021 • Joery A. de Vries, Ken S. Voskuil, Thomas M. Moerland, Aske Plaat
In contrast to standard forward dynamics models that predict a full next state, value equivalent models are trained to predict a future value, thereby emphasizing value relevant information in the representations.
1 code implementation • 13 Oct 2020 • Michael Emmerich, Joost Nibbeling, Marios Kefalas, Aske Plaat
The general problem in this paper is vertex (node) subset selection with the goal to contain an infection that spreads in a network.
no code implementations • 7 Oct 2020 • Mike Huisman, Jan N. van Rijn, Aske Plaat
Meta-learning is one approach to address this issue, by enabling the network to learn how to learn.
no code implementations • 11 Aug 2020 • Aske Plaat, Walter Kosters, Mike Preuss
In recent years, many model-based methods have been introduced to address this challenge.
no code implementations • 30 Jun 2020 • Thomas M. Moerland, Joost Broekens, Aske Plaat, Catholijn M. Jonker
Two key approaches to this problem are reinforcement learning (RL) and planning.
no code implementations • 26 Jun 2020 • Thomas M. Moerland, Joost Broekens, Aske Plaat, Catholijn M. Jonker
Therefore, this paper presents a unifying algorithmic framework for reinforcement learning and planning (FRAP), which identifies underlying dimensions on which MDP planning and learning algorithms have to decide.
no code implementations • 14 Jun 2020 • Hui Wang, Mike Preuss, Michael Emmerich, Aske Plaat
A later algorithm, Nested Rollout Policy Adaptation, was able to find a new record of 82 steps, albeit with large computational resources.
1 code implementation • 19 May 2020 • Thomas M. Moerland, Joost Broekens, Aske Plaat, Catholijn M. Jonker
Monte Carlo Tree Search (MCTS) efficiently balances exploration and exploitation in tree search based on count-derived uncertainty.
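The count-derived uncertainty is typically the UCB/UCT term; a minimal selection rule over illustrative node dictionaries (a sketch of the standard formula, not the paper's code):

```python
import math

def uct_select(children, c=1.41):
    """Standard UCT child selection: exploit the mean value, explore
    via the count-derived uncertainty sqrt(ln N_parent / n_child)."""
    total = sum(n["visits"] for n in children)
    def score(n):
        if n["visits"] == 0:
            return float("inf")            # always try unvisited children first
        mean = n["value"] / n["visits"]
        return mean + c * math.sqrt(math.log(total) / n["visits"])
    return max(children, key=score)

children = [
    {"visits": 10, "value": 6.0},   # mean 0.6, well explored
    {"visits": 2,  "value": 1.0},   # mean 0.5, high uncertainty
]
chosen = uct_select(children)
```

Here the rarely visited child wins despite its lower mean, showing how the uncertainty bonus drives exploration.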
no code implementations • 26 Apr 2020 • Hui Wang, Mike Preuss, Aske Plaat
Recently, AlphaZero has achieved landmark results in deep reinforcement learning, by providing a single self-play architecture that learned three different games at superhuman level.
no code implementations • 1 Apr 2020 • Matthias Müller-Brockhausen, Mike Preuss, Aske Plaat
This paper focuses on a new game, Tetris Link, a board game that is still lacking any scientific analysis.
no code implementations • 12 Mar 2020 • Hui Wang, Michael Emmerich, Mike Preuss, Aske Plaat
A secondary result of our experiments concerns the choice of optimization goals, for which we also provide recommendations.
1 code implementation • 19 Mar 2019 • Hui Wang, Michael Emmerich, Mike Preuss, Aske Plaat
Therefore, in this paper, we choose 12 parameters in AlphaZero and evaluate how these parameters contribute to training.
1 code implementation • 14 Oct 2018 • Hui Wang, Michael Emmerich, Aske Plaat
For small games, simple classical table-based Q-learning might still be the algorithm of choice.
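The classical table-based update referred to here is the standard Q-learning rule; a minimal sketch with an illustrative state/action encoding:

```python
from collections import defaultdict

def q_update(Q, s, a, r, s_next, actions, alpha=0.5, gamma=0.9):
    """Classical table-based Q-learning update:
    Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])

Q = defaultdict(float)   # table of Q-values, default 0.0
q_update(Q, s=0, a="left", r=1.0, s_next=1, actions=["left", "right"])
```

For small state spaces the whole table fits in memory, which is why no function approximation is needed.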
2 code implementations • 24 May 2018 • Thomas M. Moerland, Joost Broekens, Aske Plaat, Catholijn M. Jonker
A core novelty of Alpha Zero is the interleaving of tree search and deep learning, which has proven very successful in board games like Chess, Shogi and Go.
2 code implementations • 23 May 2018 • Thomas M. Moerland, Joost Broekens, Aske Plaat, Catholijn M. Jonker
Asymmetric termination of search trees introduces a type of uncertainty for which the standard upper confidence bound (UCB) formula does not account.
2 code implementations • 16 Feb 2018 • Hui Wang, Michael Emmerich, Aske Plaat
General Game Playing (GGP) problems can be solved by reinforcement learning.
1 code implementation • 23 Aug 2017 • Christian M. Fuchs, Todor Stefanov, Nadia Murillo, Aske Plaat
Modern embedded technology is a driving factor in satellite miniaturization, contributing to a massive boom in satellite launches and a rapidly evolving new space industry.
Distributed, Parallel, and Cluster Computing; Operating Systems
no code implementations • 2 Apr 2017 • S. Ali Mirsoleimani, Aske Plaat, Jaap van den Herik, Jos Vermaseren
In this paper, we present a new algorithm for parallel Monte Carlo tree search (MCTS).
no code implementations • 11 Feb 2017 • Aske Plaat, Jonathan Schaeffer, Wim Pijls, Arie de Bruin
The crucial step is the realization that transposition tables contain so-called solution trees, structures that are used in best-first search algorithms like SSS*.
no code implementations • 28 Sep 2015 • S. Ali Mirsoleimani, Aske Plaat, Jaap van den Herik
Small search trees occur in variations of MCTS, such as parallel and ensemble approaches.
no code implementations • 7 May 2015 • Aske Plaat, Jonathan Schaeffer, Wim Pijls, Arie de Bruin
Most practitioners use a variant of the Alpha-Beta algorithm, a simple depth-first procedure, for searching minimax trees.
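The depth-first procedure can be sketched on a toy game tree of nested lists (illustrative only, not the paper's test framework):

```python
def alphabeta(node, alpha, beta, maximizing):
    """Plain depth-first Alpha-Beta on a game tree given as nested
    lists (leaves are numbers). Prunes when alpha >= beta."""
    if not isinstance(node, list):
        return node                        # leaf: static evaluation
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:
                break                      # beta cutoff
        return value
    value = float("inf")
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True))
        beta = min(beta, value)
        if alpha >= beta:
            break                          # alpha cutoff
    return value

tree = [[3, 5], [6, 9]]                    # classic two-ply example
best = alphabeta(tree, float("-inf"), float("inf"), True)
```

The window [alpha, beta] lets whole subtrees be skipped once they provably cannot affect the minimax value at the root.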
no code implementations • 11 Apr 2015 • Aske Plaat
In every discipline, large, diverse, and rich data sets are emerging, from astrophysics, to the life sciences, to the behavioral sciences, to finance and commerce, to the humanities and to the arts.
no code implementations • 18 Sep 2014 • Ben Ruijl, Aske Plaat, Jos Vermaseren, Jaap van den Herik
For High Energy Physics this means that numerical integrations that took weeks can now be done in hours.
no code implementations • 25 May 2014 • Ben Ruijl, Jos Vermaseren, Aske Plaat, Jaap van den Herik
We observe that a variable $C_p$ suits our domain: it yields more exploration at the bottom of the tree and, as a result, simplifies the tuning problem.
1 code implementation • 5 Apr 2014 • Aske Plaat, Jonathan Schaeffer, Wim Pijls, Arie de Bruin
This paper introduces a new paradigm for minimax game-tree search algorithms.
no code implementations • 5 Apr 2014 • Aske Plaat
MTD(f) is a new minimax search algorithm, simpler and more efficient than previous algorithms.
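MTD(f) obtains the minimax value through a sequence of zero-window Alpha-Beta probes; a minimal sketch on a toy nested-list tree (real implementations add a transposition table for efficiency, which this sketch omits):

```python
def ab(node, alpha, beta, maximizing=True):
    """Fail-soft Alpha-Beta on nested lists (leaves are numbers)."""
    if not isinstance(node, list):
        return node
    value = float("-inf") if maximizing else float("inf")
    for child in node:
        v = ab(child, alpha, beta, not maximizing)
        if maximizing:
            value = max(value, v); alpha = max(alpha, value)
        else:
            value = min(value, v); beta = min(beta, value)
        if alpha >= beta:
            break                          # cutoff
    return value

def mtdf(root, f):
    """MTD(f): converge on the minimax value with zero-window probes
    that narrow a lower/upper bound interval around it."""
    g, lower, upper = f, float("-inf"), float("inf")
    while lower < upper:
        beta = g + 1 if g == lower else g
        g = ab(root, beta - 1, beta)       # null-window search
        if g < beta:
            upper = g                      # fail low: value <= g
        else:
            lower = g                      # fail high: value >= g
    return g

value = mtdf([[3, 5], [6, 9]], f=0)
```

Each probe only tests whether the true value lies above or below the window, so every call is a maximally narrow (and hence maximally pruning) Alpha-Beta search; a good first guess f reduces the number of probes.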
no code implementations • 5 Apr 2014 • Aske Plaat, Jonathan Schaeffer, Wim Pijls, Arie de Bruin
Empirical evidence shows that in all three games, enhanced Alpha-Beta search is capable of building a tree that is close in size to that of the minimal graph.
no code implementations • 5 Apr 2014 • Aske Plaat, Jonathan Schaeffer, Wim Pijls, Arie de Bruin
AB-SSS* is comparable in performance to Alpha-Beta on leaf node count in all three games, making it a viable alternative to Alpha-Beta in practice.
no code implementations • 3 Dec 2013 • Ben Ruijl, Jos Vermaseren, Aske Plaat, Jaap van den Herik
Yet, this approach leaves room for further improvement, since it is sensitive to the so-called exploration-exploitation constant $C_p$ and the number of tree updates $N$.