1 code implementation • 7 Aug 2023 • Felix Chalumeau, Bryan Lim, Raphael Boige, Maxime Allard, Luca Grillotti, Manon Flageat, Valentin Macé, Arthur Flajolet, Thomas Pierrot, Antoine Cully
QDax is an open-source library with a streamlined and modular API for Quality-Diversity (QD) optimization algorithms in JAX.
no code implementations • 20 Jul 2023 • Raphael Boige, Yannis Flet-Berliac, Arthur Flajolet, Guillaume Richard, Thomas Pierrot
Self-supervised learning has brought about a revolutionary paradigm shift in various computing domains, including NLP, vision, and biology.
1 code implementation • 9 Mar 2023 • Thomas Pierrot, Arthur Flajolet
Quality Diversity (QD) has emerged as a powerful alternative optimization paradigm that aims at generating large and diverse collections of solutions, notably with its flagship algorithm MAP-ELITES (ME), which evolves solutions through mutations and crossovers.
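As a rough illustration of the loop described in this abstract, here is a minimal MAP-Elites-style sketch in plain Python. It is not the paper's method or the QDax API: the function name `map_elites`, the grid size, the uniform crossover, the Gaussian mutation, and the toy fitness and descriptor functions are all assumptions chosen for brevity.

```python
import random


def map_elites(fitness, descriptor, dim=2, bins=10, iterations=2000,
               sigma=0.1, seed=0):
    """Toy MAP-Elites: keep the best solution found in each descriptor cell.

    Assumes solutions are lists of floats in [0, 1] and that `descriptor`
    returns a sequence of values in [0, 1] (hypothetical conventions).
    """
    rng = random.Random(seed)
    archive = {}  # cell index (tuple of ints) -> (fitness, solution)

    def try_insert(x):
        f = fitness(x)
        # Discretize the descriptor into a grid cell.
        cell = tuple(min(int(v * bins), bins - 1) for v in descriptor(x))
        # A newcomer replaces the incumbent only if it is strictly fitter.
        if cell not in archive or f > archive[cell][0]:
            archive[cell] = (f, x)

    # Seed the archive with a few random solutions.
    for _ in range(bins):
        try_insert([rng.random() for _ in range(dim)])

    for _ in range(iterations):
        elites = list(archive.values())
        parent_a = rng.choice(elites)[1]
        parent_b = rng.choice(elites)[1]
        # Uniform crossover between two elites.
        child = [a if rng.random() < 0.5 else b
                 for a, b in zip(parent_a, parent_b)]
        # Gaussian mutation, clipped back to [0, 1].
        child = [min(1.0, max(0.0, c + rng.gauss(0.0, sigma))) for c in child]
        try_insert(child)

    return archive


if __name__ == "__main__":
    # Toy problem: fitness peaks at the center, descriptor is the solution itself.
    result = map_elites(lambda x: -sum((v - 0.5) ** 2 for v in x), lambda x: x)
    print(len(result), "cells filled")
```

The archive is a dictionary keyed by grid cell, so the output is a collection of solutions that are diverse by construction (one per cell) and locally best within each cell.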
no code implementations • 24 Nov 2022 • Felix Chalumeau, Thomas Pierrot, Valentin Macé, Arthur Flajolet, Karim Beguir, Antoine Cully, Nicolas Perrin-Gilbert
Exploration is at the heart of several domains that try to solve control problems, such as Reinforcement Learning, and QD methods are promising candidates for overcoming the associated challenges.
1 code implementation • 6 Oct 2022 • Felix Chalumeau, Raphael Boige, Bryan Lim, Valentin Macé, Maxime Allard, Arthur Flajolet, Antoine Cully, Thomas Pierrot
Recent work has shown that training a mixture of policies, as opposed to a single one, that are driven to explore different regions of the state-action space can address this shortcoming by generating a diverse set of behaviors, referred to as skills, that can be collectively used to great effect in adaptation tasks or for hierarchical planning.
1 code implementation • 17 Jun 2022 • Arthur Flajolet, Claire Bizon Monroc, Karim Beguir, Thomas Pierrot
Training populations of agents has demonstrated great promise in Reinforcement Learning for stabilizing training, improving exploration and asymptotic performance, and generating a diverse set of solutions.
1 code implementation • NeurIPS 2021 • Thomas Pierrot, Valentin Macé, Félix Chalumeau, Arthur Flajolet, Geoffrey Cideron, Karim Beguir, Antoine Cully, Olivier Sigaud, Nicolas Perrin-Gilbert
This paper proposes a novel algorithm, QDPG, which combines the strength of Policy Gradient algorithms and Quality Diversity approaches to produce a collection of diverse and high-performing neural policies in continuous control environments.
no code implementations • NeurIPS 2017 • Ofer Dekel, Arthur Flajolet, Nika Haghtalab, Patrick Jaillet
We show that the player can benefit from such a hint if the set of feasible actions is sufficiently round.
no code implementations • NeurIPS 2017 • Arthur Flajolet, Patrick Jaillet
We consider the problem of repeated bidding in online advertising auctions when some side information (e.g., browser cookies) is available ahead of submitting a bid in the form of a $d$-dimensional vector.
no code implementations • 20 Nov 2014 • Arthur Flajolet, Patrick Jaillet
In the convex optimization approach to online regret minimization, many methods have been developed to guarantee an $O(\sqrt{T})$ bound on regret for subdifferentiable convex loss functions with bounded subgradients, by using a reduction to linear loss functions.
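For readers unfamiliar with the reduction mentioned here, the standard argument can be sketched as follows. The notation is an assumption, not taken from the paper: $f_t$ is the loss at round $t$, $x_t$ the point played, $u$ any fixed comparator, and $g_t \in \partial f_t(x_t)$ a subgradient.

```latex
% Convexity gives a linear upper bound on the per-round regret:
f_t(x_t) - f_t(u) \;\le\; \langle g_t,\, x_t - u \rangle .
% Summing over rounds, the regret against convex losses is bounded by
% the regret against the linear losses x \mapsto \langle g_t, x \rangle:
\sum_{t=1}^{T} \bigl( f_t(x_t) - f_t(u) \bigr)
  \;\le\; \sum_{t=1}^{T} \langle g_t,\, x_t - u \rangle .
```

Any algorithm with an $O(\sqrt{T})$ regret guarantee for linear losses with bounded gradients therefore inherits the same guarantee for convex losses with bounded subgradients.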