Search Results for author: Olivier Sigaud

Found 49 papers, 20 papers with code

Single-Reset Divide & Conquer Imitation Learning

no code implementations 14 Feb 2024 Alexandre Chenu, Olivier Serris, Olivier Sigaud, Nicolas Perrin-Gilbert

Demonstrations are commonly used to speed up the learning process of Deep Reinforcement Learning algorithms.

Imitation Learning

Utility-based Adaptive Teaching Strategies using Bayesian Theory of Mind

1 code implementation 29 Sep 2023 Clémence Grislain, Hugo Caselles-Dupré, Olivier Sigaud, Mohamed Chetouani

To this end, human teachers seem to build mental models of the learner's internal state, a capacity known as Theory of Mind (ToM).

Enhancing Agent Communication and Learning through Action and Language

no code implementations 18 Aug 2023 Hugo Caselles-Dupré, Olivier Sigaud, Mohamed Chetouani

We introduce a novel category of GC-agents capable of functioning as both teachers and learners.

Grounding Large Language Models in Interactive Environments with Online Reinforcement Learning

3 code implementations 6 Feb 2023 Thomas Carta, Clément Romac, Thomas Wolf, Sylvain Lamprier, Olivier Sigaud, Pierre-Yves Oudeyer

Using an interactive textual environment designed to study higher-level forms of functional grounding, and a set of spatial and navigation tasks, we study several scientific questions: 1) Can LLMs boost sample efficiency for online learning of various RL tasks?

Decision Making reinforcement-learning +1

Leveraging Sequentiality in Reinforcement Learning from a Single Demonstration

1 code implementation 9 Nov 2022 Alexandre Chenu, Olivier Serris, Olivier Sigaud, Nicolas Perrin-Gilbert

This sequential goal-reaching approach raises a problem of compatibility between successive goals: we need to ensure that the state resulting from reaching a goal is compatible with the achievement of the following goals.

reinforcement-learning Reinforcement Learning (RL)

Making Reinforcement Learning Work on Swimmer

no code implementations 16 Aug 2022 Maël Franceschetti, Coline Lacoux, Ryan Ohouens, Antonin Raffin, Olivier Sigaud

A lot of these papers report poor performance on SWIMMER from RL methods and much better performance from direct policy search methods.

reinforcement-learning Reinforcement Learning (RL)

EAGER: Asking and Answering Questions for Automatic Reward Shaping in Language-guided RL

1 code implementation 20 Jun 2022 Thomas Carta, Pierre-Yves Oudeyer, Olivier Sigaud, Sylvain Lamprier

Reinforcement learning (RL) in long horizon and sparse reward tasks is notoriously difficult and requires a lot of training steps.

Question Answering Question Generation +2

Stein Variational Goal Generation for adaptive Exploration in Multi-Goal Reinforcement Learning

no code implementations 14 Jun 2022 Nicolas Castanet, Sylvain Lamprier, Olivier Sigaud

In multi-goal Reinforcement Learning, an agent can share experience between related training tasks, resulting in better generalization for new tasks at test time.

Multi-Goal Reinforcement Learning reinforcement-learning +1

Pragmatically Learning from Pedagogical Demonstrations in Multi-Goal Environments

1 code implementation 9 Jun 2022 Hugo Caselles-Dupré, Olivier Sigaud, Mohamed Chetouani

In this paper, we implement pedagogy and pragmatism mechanisms by leveraging a Bayesian model of Goal Inference from demonstrations (BGI).

Divide & Conquer Imitation Learning

1 code implementation 15 Apr 2022 Alexandre Chenu, Nicolas Perrin-Gilbert, Olivier Sigaud

In such context, Imitation Learning (IL) can be a powerful approach to bootstrap the learning process.

Imitation Learning Inductive Bias

Learning Object-Centered Autotelic Behaviors with Graph Neural Networks

1 code implementation 11 Apr 2022 Ahmed Akakzia, Olivier Sigaud

However, these capabilities are highly constrained by their policy and goal space representations.

Object

Combining Evolution and Deep Reinforcement Learning for Policy Search: a Survey

no code implementations 26 Mar 2022 Olivier Sigaud

Deep neuroevolution and deep Reinforcement Learning have received a lot of attention in the last years.

reinforcement-learning Reinforcement Learning (RL)

Pedagogical Demonstrations and Pragmatic Learning in Artificial Tutor-Learner Interactions

no code implementations 28 Feb 2022 Hugo Caselles-Dupré, Mohamed Chetouani, Olivier Sigaud

When demonstrating a task, human tutors pedagogically modify their behavior by either "showing" the task rather than just "doing" it (exaggerating on relevant parts of the demonstration) or by giving demonstrations that best disambiguate the communicated goal.

Help Me Explore: Minimal Social Interventions for Graph-Based Autotelic Agents

1 code implementation 10 Feb 2022 Ahmed Akakzia, Olivier Serris, Olivier Sigaud, Cédric Colas

In the quest for autonomous agents learning open-ended repertoires of skills, most works take a Piagetian perspective: learning trajectories are the results of interactions between developmental agents and their physical environment.

Towards Teachable Autotelic Agents

no code implementations 25 May 2021 Olivier Sigaud, Ahmed Akakzia, Hugo Caselles-Dupré, Cédric Colas, Pierre-Yves Oudeyer, Mohamed Chetouani

In the field of Artificial Intelligence, these extremes respectively map to autonomous agents learning from their own signals and interactive learning agents fully taught by their teachers.

Selection-Expansion: A Unifying Framework for Motion-Planning and Diversity Search Algorithms

no code implementations 10 Apr 2021 Alexandre Chenu, Nicolas Perrin-Gilbert, Stéphane Doncieux, Olivier Sigaud

In particular, we show empirically that, if the mapping is smooth enough, i.e., if two close policies in the parameter space lead to similar outcomes, then diversity algorithms tend to inherit exploration properties of MP algorithms.

Motion Planning

Sample efficient Quality Diversity for neural continuous control

no code implementations 1 Jan 2021 Thomas Pierrot, Valentin Macé, Geoffrey Cideron, Nicolas Perrin, Karim Beguir, Olivier Sigaud

The QD part contributes structural biases by decoupling the search for diversity from the search for high return, resulting in efficient management of the exploration-exploitation trade-off.

Continuous Control Management +1

Autotelic Agents with Intrinsically Motivated Goal-Conditioned Reinforcement Learning: a Short Survey

no code implementations 17 Dec 2020 Cédric Colas, Tristan Karch, Olivier Sigaud, Pierre-Yves Oudeyer

Developmental RL is concerned with the use of deep RL algorithms to tackle a developmental problem -- the intrinsically motivated acquisition of open-ended repertoires of skills.

reinforcement-learning Reinforcement Learning (RL)

Offline Reinforcement Learning Hands-On

no code implementations 29 Nov 2020 Louis Monier, Jakub Kmec, Alexandre Laterre, Thomas Pierrot, Valentin Courgeau, Olivier Sigaud, Karim Beguir

Offline Reinforcement Learning (RL) aims to turn large datasets into powerful decision-making engines without any online interactions with the environment.

Behavioural cloning Decision Making +3

Learning Compositional Neural Programs for Continuous Control

no code implementations 27 Jul 2020 Thomas Pierrot, Nicolas Perrin, Feryal Behbahani, Alexandre Laterre, Olivier Sigaud, Karim Beguir, Nando de Freitas

Third, the self-models are harnessed to learn recursive compositional programs with multiple levels of abstraction.

Continuous Control

Diversity Policy Gradient for Sample Efficient Quality-Diversity Optimization

1 code implementation NeurIPS 2021 Thomas Pierrot, Valentin Macé, Félix Chalumeau, Arthur Flajolet, Geoffrey Cideron, Karim Beguir, Antoine Cully, Olivier Sigaud, Nicolas Perrin-Gilbert

This paper proposes a novel algorithm, QDPG, which combines the strength of Policy Gradient algorithms and Quality Diversity approaches to produce a collection of diverse and high-performing neural policies in continuous control environments.

Continuous Control Evolutionary Algorithms

Grounding Language to Autonomously-Acquired Skills via Goal Generation

1 code implementation ICLR 2021 Ahmed Akakzia, Cédric Colas, Pierre-Yves Oudeyer, Mohamed Chetouani, Olivier Sigaud

In a second stage (L -> G), it trains a language-conditioned goal generator to generate semantic goals that match the constraints expressed in language-based inputs.

Language Acquisition

PBCS: Efficient Exploration and Exploitation Using a Synergy between Reinforcement Learning and Motion Planning

no code implementations 24 Apr 2020 Guillaume Matheron, Nicolas Perrin, Olivier Sigaud

In this paper, we propose a new algorithm called "Plan, Backplay, Chain Skills" (PBCS) that combines motion planning and reinforcement learning to solve hard exploration environments.

Continuous Control Efficient Exploration +3

To Share or Not To Share: A Comprehensive Appraisal of Weight-Sharing

1 code implementation 11 Feb 2020 Aloïs Pourchot, Alexis Ducarouge, Olivier Sigaud

Weight-sharing (WS) has recently emerged as a paradigm to accelerate the automated search for efficient neural architectures, a process dubbed Neural Architecture Search (NAS).

Neural Architecture Search

The problem with DDPG: understanding failures in deterministic environments with sparse rewards

no code implementations 26 Nov 2019 Guillaume Matheron, Nicolas Perrin, Olivier Sigaud

In environments with continuous state and action spaces, state-of-the-art actor-critic reinforcement learning algorithms can solve very complex problems, yet can also fail in environments that seem trivial, but the reason for such failures is still poorly understood.

A Hitchhiker's Guide to Statistical Comparisons of Reinforcement Learning Algorithms

2 code implementations 15 Apr 2019 Cédric Colas, Olivier Sigaud, Pierre-Yves Oudeyer

Consistently checking the statistical significance of experimental results is the first mandatory step towards reproducible science.

reinforcement-learning Reinforcement Learning (RL)

Interactively shaping robot behaviour with unlabeled human instructions

no code implementations 5 Feb 2019 Anis Najar, Olivier Sigaud, Mohamed Chetouani

In this paper, we propose a framework that enables a human teacher to shape a robot behaviour by interactively providing it with unlabeled instructions.

Reinforcement Learning (RL)

CLIC: Curriculum Learning and Imitation for object Control in non-rewarding environments

no code implementations 28 Jan 2019 Pierre Fournier, Olivier Sigaud, Cédric Colas, Mohamed Chetouani

In this paper we study a new reinforcement learning setting where the environment is non-rewarding, contains several possibly related objects of various controllability, and where an apt agent Bob acts independently, with non-observable intentions.

reinforcement-learning Reinforcement Learning (RL) +1

First-order and second-order variants of the gradient descent in a unified framework

no code implementations 18 Oct 2018 Thomas Pierrot, Nicolas Perrin, Olivier Sigaud

In this paper, we provide an overview of first-order and second-order variants of the gradient descent method that are commonly used in machine learning.

BIG-bench Machine Learning

CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning

1 code implementation 15 Oct 2018 Cédric Colas, Pierre Fournier, Olivier Sigaud, Mohamed Chetouani, Pierre-Yves Oudeyer

In open-ended environments, autonomous learning agents must set their own goals and build their own curriculum through an intrinsically motivated exploration.

Efficient Exploration Multi-Goal Reinforcement Learning +2

Identification of Invariant Sensorimotor Structures as a Prerequisite for the Discovery of Objects

no code implementations 11 Oct 2018 Nicolas Le Hir, Olivier Sigaud, Alban Laflaquière

Our model is based on processing the unsupervised interaction of an artificial agent with its environment.

Clustering

CEM-RL: Combining evolutionary and gradient-based methods for policy search

2 code implementations 2 Oct 2018 Aloïs Pourchot, Olivier Sigaud

In this paper, we propose a different combination scheme using the simple cross-entropy method (CEM) and Twin Delayed Deep Deterministic policy gradient (TD3), another off-policy deep RL algorithm which improves over DDPG.

Importance mixing: Improving sample reuse in evolutionary policy search methods

no code implementations 17 Aug 2018 Aloïs Pourchot, Nicolas Perrin, Olivier Sigaud

Then, from an empirical comparison based on a simple benchmark, we show that, though it actually provides better sample efficiency, it is still far from the sample efficiency of deep reinforcement learning, though it is more stable.

reinforcement-learning Reinforcement Learning (RL)

Accuracy-based Curriculum Learning in Deep Reinforcement Learning

2 code implementations 25 Jun 2018 Pierre Fournier, Olivier Sigaud, Mohamed Chetouani, Pierre-Yves Oudeyer

In this paper, we investigate a new form of automated curriculum learning based on adaptive selection of accuracy requirements, called accuracy-based curriculum learning.

reinforcement-learning Reinforcement Learning (RL)

How Many Random Seeds? Statistical Power Analysis in Deep Reinforcement Learning Experiments

1 code implementation 21 Jun 2018 Cédric Colas, Olivier Sigaud, Pierre-Yves Oudeyer

Consistently checking the statistical significance of experimental results is one of the mandatory methodological steps to address the so-called "reproducibility crisis" in deep reinforcement learning.

reinforcement-learning Reinforcement Learning (RL)

Policy Search in Continuous Action Domains: an Overview

no code implementations 13 Mar 2018 Olivier Sigaud, Freek Stulp

Continuous action policy search is currently the focus of intensive research, driven both by the recent success of deep reinforcement learning algorithms and the emergence of competitors based on evolutionary algorithms.

Bayesian Optimization Evolutionary Algorithms +2

Unsupervised Learning of Goal Spaces for Intrinsically Motivated Goal Exploration

1 code implementation ICLR 2018 Alexandre Péré, Sébastien Forestier, Olivier Sigaud, Pierre-Yves Oudeyer

Intrinsically motivated goal exploration algorithms enable machines to discover repertoires of policies that produce a diversity of effects in complex environments.

Representation Learning

GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms

1 code implementation ICML 2018 Cédric Colas, Olivier Sigaud, Pierre-Yves Oudeyer

In continuous action domains, standard deep reinforcement learning algorithms like DDPG suffer from inefficient exploration when facing sparse or deceptive reward problems.

reinforcement-learning Reinforcement Learning (RL)

Actor-critic versus direct policy search: a comparison based on sample complexity

1 code implementation 29 Jun 2016 Arnaud de Froissard de Broissia, Olivier Sigaud

Sample efficiency is a critical property when optimizing policy parameters for the controller of a robot.

Gated networks: an inventory

no code implementations 10 Dec 2015 Olivier Sigaud, Clément Masson, David Filliat, Freek Stulp

Gated networks are networks that contain gating connections, in which the outputs of at least two neurons are multiplied.

Activity Recognition

Path Integral Policy Improvement with Covariance Matrix Adaptation

no code implementations 18 Jun 2012 Freek Stulp, Olivier Sigaud

There has been a recent focus in reinforcement learning on addressing continuous state and action problems by optimizing parameterized policies.
