no code implementations • 27 Mar 2025 • Olivier Serris, Stéphane Doncieux, Olivier Sigaud
In the first, the agent is conditioned on both the current and final goals, while in the second, it is conditioned on the next two goals in the sequence.
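A minimal sketch of the two goal-conditioning schemes described above, using plain vector concatenation as the conditioning mechanism; the array sizes and the `make_policy_input` helper are hypothetical and not the authors' implementation.

```python
import numpy as np

def make_policy_input(obs, goals):
    """Concatenate an observation with one or more goal vectors.

    Illustrative sketch only: the real policy input construction in the
    paper may differ (sizes and names here are assumptions).
    """
    return np.concatenate([obs] + list(goals))

obs = np.zeros(10)                      # current observation (hypothetical size)
current_goal, final_goal = np.ones(3), 2 * np.ones(3)
next_goal, next_next_goal = np.full(3, 3.0), np.full(3, 4.0)

# Scheme 1: condition on the current and final goals.
x1 = make_policy_input(obs, [current_goal, final_goal])
# Scheme 2: condition on the next two goals in the sequence.
x2 = make_policy_input(obs, [next_goal, next_next_goal])
print(x1.shape, x2.shape)               # (16,) (16,)
```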
no code implementations • 19 Mar 2025 • Mohamed Salim Aissi, Clemence Grislain, Mohamed Chetouani, Olivier Sigaud, Laure Soulier, Nicolas Thome
While Large Language Models (LLMs) excel at reasoning on text and Vision-Language Models (VLMs) are highly effective for visual perception, applying these models to visual instruction-based planning remains a largely open problem.
1 code implementation • 11 Feb 2025 • Loris Gaven, Thomas Carta, Clément Romac, Cédric Colas, Sylvain Lamprier, Olivier Sigaud, Pierre-Yves Oudeyer
Open-ended learning agents must efficiently prioritize goals in vast possibility spaces, focusing on those that maximize learning progress (LP).
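To illustrate the general idea of prioritizing goals by learning progress, here is a minimal goal sampler that tracks recent success rates per goal and samples in proportion to absolute LP; the class, its parameters, and the windowed LP estimate are assumptions made for this sketch, not the paper's algorithm.

```python
import random
from collections import deque

class LPGoalSampler:
    """Minimal learning-progress-based goal sampler (illustrative sketch)."""

    def __init__(self, goals, window=20, eps=0.1):
        self.goals = list(goals)
        self.results = {g: deque(maxlen=2 * window) for g in self.goals}
        self.window = window
        self.eps = eps  # probability of uniform sampling, to keep exploring

    def learning_progress(self, g):
        r = list(self.results[g])
        if len(r) < 2 * self.window:
            return 1.0                      # optimistic for rarely tried goals
        old = sum(r[: self.window]) / self.window
        recent = sum(r[self.window:]) / self.window
        return abs(recent - old)            # absolute LP over the window

    def sample(self):
        lp = [self.learning_progress(g) for g in self.goals]
        if random.random() < self.eps or sum(lp) == 0.0:
            return random.choice(self.goals)
        return random.choices(self.goals, weights=lp)[0]

    def update(self, goal, success):
        self.results[goal].append(float(success))

# Toy usage with three hypothetical goals.
sampler = LPGoalSampler(goals=["red_block", "blue_block", "tower"])
g = sampler.sample()
sampler.update(g, success=True)
```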
no code implementations • 25 Oct 2024 • Mohamed Salim Aissi, Clement Romac, Thomas Carta, Sylvain Lamprier, Pierre-Yves Oudeyer, Olivier Sigaud, Laure Soulier, Nicolas Thome
Finally, we propose to use a contrastive loss to mitigate this sensitivity and improve the robustness and generalization capabilities of LLMs.
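For reference, a standard InfoNCE-style contrastive loss of the kind alluded to here, written in PyTorch; the paper's exact loss and its definition of positive pairs may differ, so this is only a generic formulation.

```python
import torch
import torch.nn.functional as F

def info_nce(anchors, positives, temperature=0.1):
    """Generic InfoNCE contrastive loss.

    anchors, positives: (batch, dim) embeddings; row i of `positives` is the
    positive for row i of `anchors`, and all other rows act as negatives.
    """
    a = F.normalize(anchors, dim=-1)
    p = F.normalize(positives, dim=-1)
    logits = a @ p.t() / temperature        # (batch, batch) similarity matrix
    labels = torch.arange(a.size(0))        # positives lie on the diagonal
    return F.cross_entropy(logits, labels)

# Toy usage with random embeddings.
loss = info_nce(torch.randn(8, 32), torch.randn(8, 32))
```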
no code implementations • 16 Oct 2024 • Loris Gaven, Clement Romac, Thomas Carta, Sylvain Lamprier, Olivier Sigaud, Pierre-Yves Oudeyer
Recent years have seen Large Language Models (LLMs) thrive not only as generative models but also as agents solving textual sequential decision-making tasks.
no code implementations • 24 Sep 2024 • Theo Cachet, Christopher R. Dance, Olivier Sigaud
Vision-language models (VLMs) have tremendous potential for grounding language, and thus enabling language-conditioned agents (LCAs) to perform diverse tasks specified with text.
1 code implementation • 2 Jul 2024 • Zakariae El Asri, Olivier Sigaud, Nicolas Thome
Applying reinforcement learning (RL) to real-world applications requires addressing a trade-off between asymptotic performance, sample efficiency, and inference time.
no code implementations • 14 Feb 2024 • Alexandre Chenu, Olivier Serris, Olivier Sigaud, Nicolas Perrin-Gilbert
Demonstrations are commonly used to speed up the learning process of Deep Reinforcement Learning algorithms.
no code implementations • 1 Nov 2023 • Olivier Sigaud, Gianluca Baldassarre, Cedric Colas, Stephane Doncieux, Richard Duro, Pierre-Yves Oudeyer, Nicolas Perrin-Gilbert, Vieri Giuliano Santucci
Many recent machine learning research papers have "open-ended learning" in their title.
1 code implementation • 29 Sep 2023 • Clémence Grislain, Hugo Caselles-Dupré, Olivier Sigaud, Mohamed Chetouani
To this end, human teachers seem to build mental models of the learner's internal state, a capacity known as Theory of Mind (ToM).
no code implementations • 18 Aug 2023 • Hugo Caselles-Dupré, Olivier Sigaud, Mohamed Chetouani
We introduce a novel category of goal-conditioned (GC) agents capable of functioning as both teachers and learners.
3 code implementations • 6 Feb 2023 • Thomas Carta, Clément Romac, Thomas Wolf, Sylvain Lamprier, Olivier Sigaud, Pierre-Yves Oudeyer
Using an interactive textual environment designed to study higher-level forms of functional grounding, and a set of spatial and navigation tasks, we study several scientific questions: 1) Can LLMs boost sample efficiency for online learning of various RL tasks?
1 code implementation • 9 Nov 2022 • Alexandre Chenu, Olivier Serris, Olivier Sigaud, Nicolas Perrin-Gilbert
This sequential goal-reaching approach raises a problem of compatibility between successive goals: we need to ensure that the state resulting from reaching a goal is compatible with the achievement of the following goals.
no code implementations • 26 Sep 2022 • Hugo Caselles-Dupré, Olivier Sigaud, Mohamed Chetouani
Teaching an agent to perform new tasks using natural language can easily be hindered by ambiguities in interpretation.
no code implementations • 16 Aug 2022 • Maël Franceschetti, Coline Lacoux, Ryan Ohouens, Antonin Raffin, Olivier Sigaud
Many of these papers report poor performance from RL methods on the Swimmer benchmark and much better performance from direct policy search methods.
1 code implementation • 20 Jun 2022 • Thomas Carta, Pierre-Yves Oudeyer, Olivier Sigaud, Sylvain Lamprier
Reinforcement learning (RL) in long-horizon and sparse-reward tasks is notoriously difficult and requires many training steps.
no code implementations • 14 Jun 2022 • Nicolas Castanet, Sylvain Lamprier, Olivier Sigaud
In multi-goal Reinforcement Learning, an agent can share experience between related training tasks, resulting in better generalization for new tasks at test time.
1 code implementation • 9 Jun 2022 • Hugo Caselles-Dupré, Olivier Sigaud, Mohamed Chetouani
In this paper, we implement pedagogy and pragmatism mechanisms by leveraging a Bayesian model of Goal Inference from demonstrations (BGI).
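The generic Bayesian inference step behind goal inference from demonstrations can be sketched as follows; the likelihood values are assumed to come from elsewhere (e.g. a goal-conditioned policy evaluated on the demonstration), and the paper's exact BGI formulation is not reproduced here.

```python
import numpy as np

def goal_posterior(demo_likelihoods, prior=None):
    """Bayes rule over a discrete set of candidate goals.

    demo_likelihoods[i] = P(demonstration | goal_i); the prior defaults to
    uniform. Only the generic inference step is shown.
    """
    lik = np.asarray(demo_likelihoods, dtype=float)
    prior = np.ones_like(lik) / len(lik) if prior is None else np.asarray(prior)
    unnorm = lik * prior
    return unnorm / unnorm.sum()

# Three candidate goals; the demonstration is most probable under goal 1.
print(goal_posterior([0.2, 0.7, 0.1]))  # -> [0.2 0.7 0.1]
```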
1 code implementation • 15 Apr 2022 • Alexandre Chenu, Nicolas Perrin-Gilbert, Olivier Sigaud
In such a context, Imitation Learning (IL) can be a powerful approach to bootstrap the learning process.
1 code implementation • 11 Apr 2022 • Ahmed Akakzia, Olivier Sigaud
However, these capabilities are highly constrained by their policy and goal space representations.
no code implementations • 26 Mar 2022 • Olivier Sigaud
Deep neuroevolution and deep Reinforcement Learning have received considerable attention in recent years.
no code implementations • 28 Feb 2022 • Hugo Caselles-Dupré, Mohamed Chetouani, Olivier Sigaud
When demonstrating a task, human tutors pedagogically modify their behavior by either "showing" the task rather than just "doing" it (exaggerating on relevant parts of the demonstration) or by giving demonstrations that best disambiguate the communicated goal.
1 code implementation • 10 Feb 2022 • Ahmed Akakzia, Olivier Serris, Olivier Sigaud, Cédric Colas
In the quest for autonomous agents learning open-ended repertoires of skills, most works take a Piagetian perspective: learning trajectories are the results of interactions between developmental agents and their physical environment.
no code implementations • 25 May 2021 • Olivier Sigaud, Ahmed Akakzia, Hugo Caselles-Dupré, Cédric Colas, Pierre-Yves Oudeyer, Mohamed Chetouani
In the field of Artificial Intelligence, these extremes respectively map to autonomous agents learning from their own signals and interactive learning agents fully taught by their teachers.
no code implementations • 10 Apr 2021 • Alexandre Chenu, Nicolas Perrin-Gilbert, Stéphane Doncieux, Olivier Sigaud
In particular, we show empirically that, if the mapping is smooth enough, i.e., if two close policies in the parameter space lead to similar outcomes, then diversity algorithms tend to inherit exploration properties of MP algorithms.
no code implementations • 1 Jan 2021 • Thomas Pierrot, Valentin Macé, Jean-Baptiste Sevestre, Louis Monier, Alexandre Laterre, Nicolas Perrin, Karim Beguir, Olivier Sigaud
Very large action spaces constitute a critical challenge for deep Reinforcement Learning (RL) algorithms.
no code implementations • 1 Jan 2021 • Thomas Pierrot, Valentin Macé, Geoffrey Cideron, Nicolas Perrin, Karim Beguir, Olivier Sigaud
The QD part contributes structural biases by decoupling the search for diversity from the search for high return, resulting in efficient management of the exploration-exploitation trade-off.
no code implementations • 17 Dec 2020 • Cédric Colas, Tristan Karch, Olivier Sigaud, Pierre-Yves Oudeyer
Developmental RL is concerned with the use of deep RL algorithms to tackle a developmental problem -- the intrinsically motivated acquisition of open-ended repertoires of skills.
no code implementations • 29 Nov 2020 • Louis Monier, Jakub Kmec, Alexandre Laterre, Thomas Pierrot, Valentin Courgeau, Olivier Sigaud, Karim Beguir
Offline Reinforcement Learning (RL) aims to turn large datasets into powerful decision-making engines without any online interactions with the environment.
no code implementations • 27 Jul 2020 • Thomas Pierrot, Nicolas Perrin, Feryal Behbahani, Alexandre Laterre, Olivier Sigaud, Karim Beguir, Nando de Freitas
Third, the self-models are harnessed to learn recursive compositional programs with multiple levels of abstraction.
1 code implementation • NeurIPS 2021 • Thomas Pierrot, Valentin Macé, Félix Chalumeau, Arthur Flajolet, Geoffrey Cideron, Karim Beguir, Antoine Cully, Olivier Sigaud, Nicolas Perrin-Gilbert
This paper proposes a novel algorithm, QDPG, which combines the strengths of Policy Gradient algorithms and Quality Diversity approaches to produce a collection of diverse and high-performing neural policies in continuous control environments.
no code implementations • ICML Workshop LaReL 2020 • Cédric Colas, Ahmed Akakzia, Pierre-Yves Oudeyer, Mohamed Chetouani, Olivier Sigaud
In the real world, linguistic agents are also embodied agents: they perceive and act in the physical world.
no code implementations • 12 Jun 2020 • Cédric Colas, Ahmed Akakzia, Pierre-Yves Oudeyer, Mohamed Chetouani, Olivier Sigaud
In the real world, linguistic agents are also embodied agents: they perceive and act in the physical world.
1 code implementation • ICLR 2021 • Ahmed Akakzia, Cédric Colas, Pierre-Yves Oudeyer, Mohamed Chetouani, Olivier Sigaud
In a second stage (L -> G), it trains a language-conditioned goal generator to generate semantic goals that match the constraints expressed in language-based inputs.
no code implementations • 13 May 2020 • Stephane Doncieux, Nicolas Bredeche, Léni Le Goff, Benoît Girard, Alexandre Coninx, Olivier Sigaud, Mehdi Khamassi, Natalia Díaz-Rodríguez, David Filliat, Timothy Hospedales, A. Eiben, Richard Duro
Robots are still limited to controlled conditions that the robot designer knows in enough detail to endow the robot with the appropriate models or behaviors.
no code implementations • 24 Apr 2020 • Guillaume Matheron, Nicolas Perrin, Olivier Sigaud
In this paper, we propose a new algorithm called "Plan, Backplay, Chain Skills" (PBCS) that combines motion planning and reinforcement learning to solve hard exploration environments.
1 code implementation • 11 Feb 2020 • Aloïs Pourchot, Alexis Ducarouge, Olivier Sigaud
Weight-sharing (WS) has recently emerged as a paradigm to accelerate the automated search for efficient neural architectures, a process dubbed Neural Architecture Search (NAS).
no code implementations • 26 Nov 2019 • Guillaume Matheron, Nicolas Perrin, Olivier Sigaud
In environments with continuous state and action spaces, state-of-the-art actor-critic reinforcement learning algorithms can solve very complex problems, yet they can also fail in environments that seem trivial; the reason for such failures is still poorly understood.
1 code implementation • NeurIPS 2019 • Thomas Pierrot, Guillaume Ligner, Scott Reed, Olivier Sigaud, Nicolas Perrin, Alexandre Laterre, David Kas, Karim Beguir, Nando de Freitas
AlphaZero contributes powerful neural network guided search algorithms, which we augment with recursion.
2 code implementations • 15 Apr 2019 • Cédric Colas, Olivier Sigaud, Pierre-Yves Oudeyer
Consistently checking the statistical significance of experimental results is the first mandatory step towards reproducible science.
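As an example of such a check, below is a Welch's t-test comparing the final returns of two algorithms across independent random seeds; the numbers are synthetic, and this is only one of the tests discussed in this line of work, which also analyzes statistical power and the number of seeds needed.

```python
import numpy as np
from scipy import stats

# Final returns of two algorithms over independent random seeds
# (synthetic numbers for illustration only).
rng = np.random.default_rng(0)
algo_a = rng.normal(loc=100.0, scale=10.0, size=20)
algo_b = rng.normal(loc=108.0, scale=12.0, size=20)

# Welch's t-test does not assume equal variances between the two samples.
t, p = stats.ttest_ind(algo_a, algo_b, equal_var=False)
print(f"t = {t:.2f}, p = {p:.4f}")  # difference deemed significant if p < alpha
```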
no code implementations • 19 Feb 2019 • Chenyang Zhao, Olivier Sigaud, Freek Stulp, Timothy M. Hospedales
Deep Reinforcement Learning has shown great success in a variety of control tasks.
no code implementations • 5 Feb 2019 • Anis Najar, Olivier Sigaud, Mohamed Chetouani
In this paper, we propose a framework that enables a human teacher to shape a robot's behaviour by interactively providing it with unlabeled instructions.
no code implementations • 28 Jan 2019 • Pierre Fournier, Olivier Sigaud, Cédric Colas, Mohamed Chetouani
In this paper we study a new reinforcement learning setting where the environment is non-rewarding, contains several possibly related objects of varying controllability, and where an apt agent Bob acts independently, with non-observable intentions.
no code implementations • 18 Oct 2018 • Thomas Pierrot, Nicolas Perrin, Olivier Sigaud
In this paper, we provide an overview of first-order and second-order variants of the gradient descent method that are commonly used in machine learning.
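A toy contrast between a first-order and a second-order update on a quadratic objective, where a single Newton step reaches the minimum exactly; this is a generic illustration of the two families of methods, not an excerpt from the paper.

```python
import numpy as np

# f(x) = 0.5 x^T A x - b^T x, with gradient A x - b and Hessian A.
A = np.array([[3.0, 0.2], [0.2, 1.0]])
b = np.array([1.0, -2.0])

# First-order: plain gradient descent with a fixed step size.
x = np.zeros(2)
lr = 0.1
for _ in range(100):
    x = x - lr * (A @ x - b)

# Second-order: one Newton step x <- x - H^{-1} grad(x) solves A x = b exactly.
x0 = np.zeros(2)
x_newton = x0 - np.linalg.solve(A, A @ x0 - b)

print(x, x_newton)  # both close to the solution of A x = b
```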
1 code implementation • 15 Oct 2018 • Cédric Colas, Pierre Fournier, Olivier Sigaud, Mohamed Chetouani, Pierre-Yves Oudeyer
In open-ended environments, autonomous learning agents must set their own goals and build their own curriculum through an intrinsically motivated exploration.
no code implementations • 11 Oct 2018 • Nicolas Le Hir, Olivier Sigaud, Alban Laflaquière
Our model is based on processing the unsupervised interaction of an artificial agent with its environment.
2 code implementations • 2 Oct 2018 • Aloïs Pourchot, Olivier Sigaud
In this paper, we propose a different combination scheme using the simple cross-entropy method (CEM) and Twin Delayed Deep Deterministic policy gradient (TD3), another off-policy deep RL algorithm which improves over DDPG.
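For readers unfamiliar with the method, here is a plain CEM loop over a parameter vector; CEM-RL as proposed in the paper additionally interleaves TD3 gradient updates into the population, which this sketch omits.

```python
import numpy as np

def cem(score_fn, dim, pop_size=50, elite_frac=0.2, iters=30, seed=0):
    """Plain cross-entropy method maximizing score_fn over a parameter vector."""
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(dim), np.ones(dim)
    n_elite = max(1, int(pop_size * elite_frac))
    for _ in range(iters):
        pop = rng.normal(mean, std, size=(pop_size, dim))   # sample population
        scores = np.array([score_fn(p) for p in pop])
        elite = pop[np.argsort(scores)[-n_elite:]]          # keep the best samples
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-3
    return mean

# Toy objective: maximize -||theta - target||^2.
target = np.array([1.0, -2.0, 0.5])
best = cem(lambda p: -np.sum((p - target) ** 2), dim=3)
print(best)  # close to target
```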
no code implementations • 17 Aug 2018 • Aloïs Pourchot, Nicolas Perrin, Olivier Sigaud
Then, from an empirical comparison on a simple benchmark, we show that, although it does provide better sample efficiency, it remains far from the sample efficiency of deep reinforcement learning, though it is more stable.
2 code implementations • 25 Jun 2018 • Pierre Fournier, Olivier Sigaud, Mohamed Chetouani, Pierre-Yves Oudeyer
In this paper, we investigate a new form of automated curriculum learning based on adaptive selection of accuracy requirements, called accuracy-based curriculum learning.
1 code implementation • 21 Jun 2018 • Cédric Colas, Olivier Sigaud, Pierre-Yves Oudeyer
Consistently checking the statistical significance of experimental results is one of the mandatory methodological steps to address the so-called "reproducibility crisis" in deep reinforcement learning.
no code implementations • 13 Mar 2018 • Olivier Sigaud, Freek Stulp
Continuous action policy search is currently the focus of intensive research, driven both by the recent success of deep reinforcement learning algorithms and the emergence of competitors based on evolutionary algorithms.
1 code implementation • ICLR 2018 • Alexandre Péré, Sébastien Forestier, Olivier Sigaud, Pierre-Yves Oudeyer
Intrinsically motivated goal exploration algorithms enable machines to discover repertoires of policies that produce a diversity of effects in complex environments.
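The generic intrinsically motivated goal exploration template can be sketched as below (sample a goal, perturb the stored policy whose past outcome was closest, store the new outcome); learning the goal-space representation itself, which is the focus of the paper, is not shown, and all names and dimensions are assumptions.

```python
import numpy as np

def imgep(rollout, n_iters=200, param_dim=2, goal_dim=2, sigma=0.1, seed=0):
    """Bare-bones intrinsically motivated goal exploration loop (sketch only)."""
    rng = np.random.default_rng(seed)
    memory = []                                             # (policy_params, outcome) pairs
    for _ in range(n_iters):
        goal = rng.uniform(-1.0, 1.0, size=goal_dim)        # sample a goal
        if memory:
            # Reuse and perturb the policy whose outcome was closest to the goal.
            nearest = min(memory, key=lambda m: np.linalg.norm(m[1] - goal))
            params = nearest[0] + sigma * rng.normal(size=param_dim)
        else:
            params = rng.normal(size=param_dim)
        memory.append((params, rollout(params)))            # store the new outcome
    return memory

# Toy setting where the outcome of a rollout is a noisy copy of the parameters.
memory = imgep(lambda p: p + 0.01 * np.random.default_rng().normal(size=p.shape))
print(len(memory), memory[-1][1])
```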
1 code implementation • ICML 2018 • Cédric Colas, Olivier Sigaud, Pierre-Yves Oudeyer
In continuous action domains, standard deep reinforcement learning algorithms like DDPG suffer from inefficient exploration when facing sparse or deceptive reward problems.
1 code implementation • 29 Jun 2016 • Arnaud de Froissard de Broissia, Olivier Sigaud
Sample efficiency is a critical property when optimizing policy parameters for the controller of a robot.
no code implementations • 10 Dec 2015 • Olivier Sigaud, Clément Masson, David Filliat, Freek Stulp
Gated networks are networks that contain gating connections, in which the outputs of at least two neurons are multiplied.
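A small example of such a gating connection, where the linear projections of two inputs are multiplied element-wise; the shapes and weights are arbitrary and purely illustrative, not a specific architecture from the paper.

```python
import numpy as np

def gated_unit(x, y, Wx, Wy):
    """Multiplicative (gating) interaction between two projected inputs."""
    return (Wx @ x) * (Wy @ y)

rng = np.random.default_rng(0)
x, y = rng.normal(size=4), rng.normal(size=4)
Wx, Wy = rng.normal(size=(3, 4)), rng.normal(size=(3, 4))
print(gated_unit(x, y, Wx, Wy))  # 3-dimensional gated output
```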
no code implementations • 18 Jun 2012 • Freek Stulp, Olivier Sigaud
There has been a recent focus in reinforcement learning on addressing continuous state and action problems by optimizing parameterized policies.