no code implementations • 17 Nov 2023 • Siemen Herremans, Ali Anwar, Arne Troch, Ian Ravijts, Maarten Vangeneugden, Siegfried Mercelis, Peter Hellinckx
The proposed methodology is based on a machine learning approach that has recently set benchmark results in various domains: model-based reinforcement learning.
no code implementations • 16 Nov 2023 • Astrid Vanneste, Simon Vanneste, Olivier Vasseur, Robin Janssens, Mattias Billast, Ali Anwar, Kevin Mets, Tom De Schepper, Siegfried Mercelis, Peter Hellinckx
We demonstrate our approach on two scenarios and compare the resulting path with path planning using a Frenet frame and path planning based on a proximal policy optimization (PPO) agent.
no code implementations • 9 Aug 2023 • Astrid Vanneste, Simon Vanneste, Kevin Mets, Tom De Schepper, Siegfried Mercelis, Peter Hellinckx
We do this comparison in the context of communication learning using gradients from other agents and perform tests on several environments.
no code implementations • 9 Aug 2023 • Astrid Vanneste, Thomas Somers, Simon Vanneste, Kevin Mets, Tom De Schepper, Siegfried Mercelis, Peter Hellinckx
Therefore, we analyse the communication protocol of the agents that use the mean message encoder and conclude that they combine an exponential and a logarithmic function in their communication policy to avoid losing important information once the mean message encoder is applied.
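A minimal sketch of why pairing an exponential with a logarithm can preserve information through averaging. This is not the protocol learned in the paper: the scalar messages, the constant `K`, and the function names are assumptions made for illustration. Each sender transmits `exp(K * x)`, the encoder takes the mean, and the receiver applies `log(.) / K`, which recovers a value close to the largest input instead of the plain mean.

```python
import math

K = 10.0  # hypothetical sharpness constant, not taken from the paper


def encode(value):
    """Sender side: exponentiate so large values dominate the average."""
    return math.exp(K * value)


def mean_message_encoder(messages):
    """Aggregate all incoming messages into a single mean message."""
    return sum(messages) / len(messages)


def decode(mean_message):
    """Receiver side: undo the exponential with a logarithm."""
    return math.log(mean_message) / K


values = [0.1, 0.2, 0.9]

# Averaging the raw values hides the fact that one agent reported 0.9.
plain_mean = mean_message_encoder(values)

# Averaging the exponentiated values and taking the log stays near the maximum.
recovered = decode(mean_message_encoder([encode(v) for v in values]))
```

With `n` senders the decoded value lies between `max(values) - log(n) / K` and `max(values)`, so a larger `K` tracks the maximum more tightly.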
no code implementations • International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications 2022 • Akash Singh, Tom De Schepper, Kevin Mets, Peter Hellinckx, Jose Oramas, Steven Latre
The proposed method achieves an improvement of around 1.49% mAP in atomic action recognition and 17.57% mAP in composite action recognition over an I3D-NL baseline on the CATER dataset.
Ranked #1 on Atomic action recognition on CATER (using extra training data)
no code implementations • 12 Apr 2022 • Astrid Vanneste, Simon Vanneste, Kevin Mets, Tom De Schepper, Siegfried Mercelis, Steven Latré, Peter Hellinckx
The most common approach to allow learned communication between agents is the use of a differentiable communication channel that allows gradients to flow between agents as a form of feedback.
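The idea of gradients flowing through a communication channel can be sketched with a toy example. It assumes a scalar linear sender and receiver, a squared-error loss, and hand-derived gradients; none of these choices come from the paper, and all names are hypothetical. The point is that the sender's weight is updated using feedback that originates in the receiver's loss and crosses the channel through the message.

```python
def forward(w_sender, w_receiver, observation):
    """Sender emits a continuous message; receiver predicts from it."""
    message = w_sender * observation
    prediction = w_receiver * message
    return message, prediction


def gradients(w_sender, w_receiver, observation, target):
    """Chain rule for loss = (prediction - target) ** 2."""
    message, prediction = forward(w_sender, w_receiver, observation)
    d_loss_d_pred = 2.0 * (prediction - target)
    grad_receiver = d_loss_d_pred * message
    # Feedback that crosses the channel from receiver to sender.
    d_loss_d_message = d_loss_d_pred * w_receiver
    grad_sender = d_loss_d_message * observation
    return grad_sender, grad_receiver


def train(steps=200, lr=0.05):
    """Plain gradient descent on both agents' weights."""
    w_sender, w_receiver = 0.5, 0.5
    observation, target = 1.0, 2.0
    for _ in range(steps):
        g_s, g_r = gradients(w_sender, w_receiver, observation, target)
        w_sender -= lr * g_s
        w_receiver -= lr * g_r
    _, prediction = forward(w_sender, w_receiver, observation)
    return w_sender, w_receiver, prediction


w_sender, w_receiver, prediction = train()
```

A discrete message would break this chain rule, because the step from the sender's weight to the message would no longer be differentiable.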
no code implementations • Benelux Conference on Artificial Intelligence 2022 • Akash Singh, Tom De Schepper, Kevin Mets, Peter Hellinckx, José Oramas, Steven Latré
In this paper, we propose DCapsQN, a task-independent CapsNets-based architecture in the deep reinforcement learning setting.
no code implementations • 29 Oct 2021 • Simon Vanneste, Gauthier de Borrekens, Stig Bosmans, Astrid Vanneste, Kevin Mets, Siegfried Mercelis, Steven Latré, Peter Hellinckx
In this paper, we investigate independent Q-learning (IQL) without communication and differentiable inter-agent learning (DIAL) with learned communication on an adaptive traffic control system (ATCS).
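For readers unfamiliar with IQL, a minimal tabular sketch follows. It is not the paper's traffic-control setup: the class name, the state labels `"red"` and `"green"`, and the hyperparameters are illustrative assumptions. Each agent keeps its own Q-table and applies the standard Q-learning update, treating the other agents as part of the environment.

```python
from collections import defaultdict


class IndependentQLearner:
    """One agent with a private Q-table; no communication with others."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.9):
        self.n_actions = n_actions
        self.alpha = alpha  # learning rate
        self.gamma = gamma  # discount factor
        self.q = defaultdict(lambda: [0.0] * n_actions)

    def greedy_action(self, state):
        """Pick the action with the highest current Q-value."""
        values = self.q[state]
        return values.index(max(values))

    def update(self, state, action, reward, next_state):
        """Standard Q-learning temporal-difference update."""
        best_next = max(self.q[next_state])
        td_target = reward + self.gamma * best_next
        self.q[state][action] += self.alpha * (td_target - self.q[state][action])


# Two agents, each learning from its own local transition.
agents = [IndependentQLearner(n_actions=2) for _ in range(2)]
for agent in agents:
    agent.update(state="red", action=1, reward=1.0, next_state="green")
```

Because every agent's policy changes during training, each learner faces a non-stationary environment, which is one motivation for comparing IQL against methods with learned communication such as DIAL.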
no code implementations • 29 Oct 2021 • Astrid Vanneste, Wesley Van Wijnsberghe, Simon Vanneste, Kevin Mets, Siegfried Mercelis, Steven Latré, Peter Hellinckx
We look at the difference in performance between communication that is private for a team and communication that can be overheard by the other team.
1 code implementation • 13 Oct 2021 • Thomas Cassimon, Reinout Eyckerman, Siegfried Mercelis, Steven Latré, Peter Hellinckx
In this paper, the authors investigate the Deep Sea Treasure (DST) problem as proposed by Vamplew et al.
no code implementations • 29 Sep 2021 • Louis Bagot, Kevin Mets, Tom De Schepper, Peter Hellinckx, Steven Latre
As an alternative to the widespread method of a weighted sum of rewards, Explore Options let the agent call an intrinsically motivated agent in order to observe and learn from interesting behaviors in the environment.
no code implementations • 7 Oct 2020 • Wim Casteels, Peter Hellinckx
The resulting algorithm favours correlations that are universal across the subpopulations, and it indeed performs better on an out-of-distribution test set than more conventional l_2-regularization.
no code implementations • 12 Jun 2020 • Simon Vanneste, Astrid Vanneste, Kevin Mets, Tom De Schepper, Ali Anwar, Siegfried Mercelis, Steven Latré, Peter Hellinckx
The credit assignment problem, the non-stationarity of the communication environment and the creation of influenceable agents are major challenges within this research field which need to be overcome in order to learn a valid communication protocol.