Search Results for author: Philippe Preux

Found 35 papers, 6 papers with code

Interpretable and Editable Programmatic Tree Policies for Reinforcement Learning

1 code implementation23 May 2024 Hector Kohler, Quentin Delfosse, Riad Akrour, Kristian Kersting, Philippe Preux

We empirically demonstrate that INTERPRETER's compact tree programs match oracles across a diverse set of sequential decision tasks, and we evaluate the impact of our design choices on interpretability and performance.

Atari Games Reinforcement Learning +1
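Programmatic tree policies of the kind INTERPRETER extracts are, at bottom, plain decision trees over observation features that a human can read and edit directly. The snippet below is a minimal sketch of that idea on a CartPole-like observation; the feature names and thresholds are made up for illustration and are not taken from the paper.

```python
# A hand-editable programmatic tree policy, sketched for a CartPole-like task.
# Feature layout and thresholds are illustrative, not the paper's.

def tree_policy(obs):
    """Map an observation [pos, vel, angle, ang_vel] to a discrete action."""
    pos, vel, angle, ang_vel = obs
    if angle < 0.0:                 # pole leaning left
        return 0 if ang_vel < 0.1 else 1
    else:                           # pole leaning right
        return 1 if ang_vel > -0.1 else 0

# Because the policy is source code, editing it means changing a threshold
# or swapping a branch, then simply re-running the evaluation.
print(tree_policy([0.0, 0.0, -0.05, 0.2]))  # -> 1
```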

Augmenting Ad-Hoc IR Dataset for Interactive Conversational Search

no code implementations10 Nov 2023 Pierre Erbacher, Jian-Yun Nie, Philippe Preux, Laure Soulier

The only two datasets known to us that contain both document relevance judgments and the associated clarification interactions are Qulac and ClariQ.

Conversational Search

Interpretable Decision Tree Search as a Markov Decision Process

no code implementations22 Sep 2023 Hector Kohler, Riad Akrour, Philippe Preux

Finding an optimal decision tree for a supervised learning task is a challenging combinatorial problem to solve at scale.

AdaStop: adaptive statistical testing for sound comparisons of Deep RL agents

no code implementations19 Jun 2023 Timothée Mathieu, Riccardo Della Vecchia, Alena Shilova, Matheus Medeiros Centa, Hector Kohler, Odalric-Ambrym Maillard, Philippe Preux

When comparing several RL algorithms, a major question is how many executions must be made and how we can ensure that the results of such a comparison are theoretically sound.

Reinforcement Learning (RL)
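The question AdaStop answers, how many training runs are enough to call a comparison sound, can be illustrated with a generic adaptive loop: gather scores in small batches and stop once a two-sample permutation test is conclusive or the run budget is exhausted. This is a minimal sketch of adaptive stopping, not the AdaStop algorithm itself; naive sequential peeking of this kind inflates false-positive rates, which is precisely what AdaStop is designed to control. `run_agent_a` and `run_agent_b` stand for hypothetical functions that each train and evaluate one fresh agent.

```python
import numpy as np

rng = np.random.default_rng(0)

def perm_test(xs, ys, n_perm=2000):
    """Two-sided permutation test on the difference of means."""
    pooled = np.concatenate([xs, ys])
    observed = abs(xs.mean() - ys.mean())
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        hits += abs(pooled[:len(xs)].mean() - pooled[len(xs):].mean()) >= observed
    return hits / n_perm

def compare_adaptively(run_agent_a, run_agent_b, batch=5, max_batches=10, alpha=0.05):
    xs, ys = [], []
    for _ in range(max_batches):
        xs += [run_agent_a() for _ in range(batch)]  # each call: one training run's score
        ys += [run_agent_b() for _ in range(batch)]
        p = perm_test(np.array(xs), np.array(ys))
        if p < alpha:                                # enough evidence: stop early
            return "different", len(xs), p
    return "no evidence", len(xs), p
```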

Optimal Interpretability-Performance Trade-off of Classification Trees with Black-Box Reinforcement Learning

no code implementations11 Apr 2023 Hector Kohler, Riad Akrour, Philippe Preux

A given supervised classification task is modeled as a Markov decision problem (MDP) and then augmented with additional actions that gather information about the features, which is equivalent to building a decision tree (DT).

Reinforcement Learning (RL)
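The MDP view described in this entry can be made concrete: a state is the set of feature values revealed so far, an action either queries one more feature or commits to a class, and an episode traces a root-to-leaf path of a decision tree. Below is a minimal sketch of such an environment; the interface and reward values are illustrative assumptions, not the paper's exact formulation.

```python
import random

class FeatureGatherMDP:
    """Classification as sequential feature acquisition.

    State: dict of revealed feature values. Actions: ("query", i) to reveal
    feature i, or ("predict", c) to end the episode with class c.
    """

    def __init__(self, X, y, query_cost=0.01):
        self.X, self.y, self.query_cost = X, y, query_cost

    def reset(self):
        self.idx = random.randrange(len(self.X))
        self.revealed = {}
        return dict(self.revealed)

    def step(self, action):
        kind, arg = action
        if kind == "query":
            self.revealed[arg] = self.X[self.idx][arg]
            return dict(self.revealed), -self.query_cost, False
        reward = 1.0 if arg == self.y[self.idx] else 0.0   # ("predict", class)
        return dict(self.revealed), reward, True

# A deterministic policy over these states *is* a decision tree: the sequence
# of queried features along each trajectory forms a root-to-leaf path.
```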

Soft Action Priors: Towards Robust Policy Transfer

no code implementations20 Sep 2022 Matheus Centa, Philippe Preux

Despite success in many challenging problems, reinforcement learning (RL) is still confronted with sample inefficiency, which can be mitigated by introducing prior knowledge to agents.

Reinforcement Learning +1

Interferometric Graph Transform for Community Labeling

no code implementations4 Jun 2021 Nathan Grinsztajn, Louis Leconte, Philippe Preux, Edouard Oyallon

We present a new approach for learning unsupervised node representations in community graphs.

Low-Rank Projections of GCNs Laplacian

no code implementations ICLR Workshop GTRL 2021 Nathan Grinsztajn, Philippe Preux, Edouard Oyallon

In this work, we study the behavior of standard models for community detection under spectral manipulations.

Community Detection

Don't Do What Doesn't Matter: Intrinsic Motivation with Action Usefulness

1 code implementation20 May 2021 Mathieu Seurin, Florian Strub, Philippe Preux, Olivier Pietquin

Sparse rewards are double-edged training signals in reinforcement learning: easy to design but hard to optimize.

Adversarially Guided Actor-Critic

1 code implementation ICLR 2021 Yannis Flet-Berliac, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist

Despite definite success in deep reinforcement learning problems, actor-critic algorithms are still confronted with sample inefficiency in complex environments, particularly in tasks where efficient exploration is a bottleneck.

Efficient Exploration

A spectral perspective on GCNs

no code implementations1 Jan 2021 Nathan Grinsztajn, Philippe Preux, Edouard Oyallon

In this work, we study the behavior of standard GCNs under spectral manipulations.
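The spectral manipulations studied in the two GCN entries above operate on the eigendecomposition of the normalized graph Laplacian. Below is a small self-contained example of the basic operation: computing the spectrum of a toy two-community graph and projecting a node signal onto its low-frequency (low-rank) subspace. The graph is made up for illustration.

```python
import numpy as np

# Toy undirected graph: two triangles (two "communities") joined by one edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
d = A.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L = np.eye(n) - D_inv_sqrt @ A @ D_inv_sqrt

# Spectral manipulation: project a node signal onto the k lowest-frequency
# eigenvectors (a low-rank, smoothness-preserving approximation).
w, U = np.linalg.eigh(L)          # eigenvalues in ascending order
k = 2
x = np.random.default_rng(0).normal(size=n)
x_low = U[:, :k] @ (U[:, :k].T @ x)
print(np.round(w, 3))             # small eigenvalues reflect community structure
```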

Geometric Deep Reinforcement Learning for Dynamic DAG Scheduling

1 code implementation9 Nov 2020 Nathan Grinsztajn, Olivier Beaumont, Emmanuel Jeannot, Philippe Preux

In this paper, we propose a reinforcement learning approach to solve a realistic scheduling problem, and apply it to an algorithm commonly executed in the high-performance computing community, the Cholesky factorization.

Combinatorial Optimization Reinforcement Learning +3
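The scheduling instance in this entry comes from tiled Cholesky factorization, whose task graph is a standard object in HPC. The generator below builds that DAG (POTRF/TRSM/SYRK/GEMM kernels with their data dependencies) for an nt-by-nt tiling; it reflects the textbook right-looking algorithm, not the paper's specific environment code.

```python
def cholesky_dag(nt):
    """Dependency DAG of a right-looking tiled Cholesky on an nt x nt tile grid.

    Returns {task: set of tasks it depends on}. Tasks are tuples:
    ("POTRF", k), ("TRSM", i, k), ("SYRK", i, k), ("GEMM", i, j, k).
    """
    deps = {}

    def add(task, *parents):
        deps[task] = {p for p in parents if p in deps}  # filter nonexistent k-1 parents

    for k in range(nt):
        add(("POTRF", k), ("SYRK", k, k - 1))
        for i in range(k + 1, nt):
            add(("TRSM", i, k), ("POTRF", k), ("GEMM", i, k, k - 1))
        for i in range(k + 1, nt):
            add(("SYRK", i, k), ("TRSM", i, k), ("SYRK", i, k - 1))
            for j in range(k + 1, i):
                add(("GEMM", i, j, k),
                    ("TRSM", i, k), ("TRSM", j, k), ("GEMM", i, j, k - 1))

    return deps

dag = cholesky_dag(4)
print(len(dag), "tasks")   # 20 tasks for a 4x4 tiling
```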

Learning Value Functions in Deep Policy Gradients using Residual Variance

no code implementations ICLR 2021 Yannis Flet-Berliac, Reda Ouhamma, Odalric-Ambrym Maillard, Philippe Preux

We prove the theoretical consistency of the new gradient estimator and observe dramatic empirical improvement across a variety of continuous control tasks and algorithms.

Continuous Control +1
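The residual-variance quantity in this entry is closely related to the classic explained-variance diagnostic for a learned value function: the fraction of return variance the critic accounts for. Below is a minimal version of that metric; the paper builds its gradient estimator around the residual term, which this sketch does not reproduce.

```python
import numpy as np

def explained_variance(returns, values):
    """1 - Var(returns - values) / Var(returns).

    1.0 -> critic predicts returns perfectly;
    0.0 -> no better than a constant baseline;
    < 0 -> the critic is adding noise.
    """
    returns, values = np.asarray(returns), np.asarray(values)
    var_ret = returns.var()
    return float("nan") if var_ret == 0 else 1.0 - (returns - values).var() / var_ret

print(explained_variance([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))  # close to 1
```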

A Machine of Few Words -- Interactive Speaker Recognition with Reinforcement Learning

no code implementations7 Aug 2020 Mathieu Seurin, Florian Strub, Philippe Preux, Olivier Pietquin

To do so, we cast the speaker recognition task into a sequential decision-making problem that we solve with Reinforcement Learning.

Decision Making Reinforcement Learning +3
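The sequential decision-making formulation mentioned in this entry can be sketched as a gym-style loop: at each step the agent either asks the speaker to utter one more word or commits to an identity guess. The skeleton below is illustrative only; the observation encoding, reward, and word budget are assumptions, not the paper's setup.

```python
import random

class SpeakerGuessEnv:
    """Ask-or-guess loop for interactive speaker recognition (illustrative only)."""

    def __init__(self, voiceprints, budget=3):
        self.voiceprints = voiceprints   # {speaker_id: callable(word) -> audio features}
        self.budget = budget             # max number of words the agent may request

    def reset(self):
        self.target = random.choice(list(self.voiceprints))
        self.heard, self.asked = [], 0
        return list(self.heard)

    def step(self, action):
        kind, arg = action
        if kind == "ask" and self.asked < self.budget:
            self.asked += 1
            self.heard.append(self.voiceprints[self.target](arg))
            return list(self.heard), 0.0, False
        # ("guess", speaker_id), or the word budget is exhausted: episode ends
        return list(self.heard), float(arg == self.target), True
```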

I'm sorry Dave, I'm afraid I can't do that, Deep Q-learning from forbidden actions

no code implementations4 Oct 2019 Mathieu Seurin, Philippe Preux, Olivier Pietquin

Violating constraints thus results in rejected actions or entering a safe mode driven by an external controller, making RL agents incapable of learning from their mistakes.

Industrial Robots Q-Learning +3
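One simple way to let an agent learn from rejected actions, rather than have an external controller silently absorb them, is to surface the rejection as a small penalty inside the Q-learning update while leaving the state unchanged. This is a generic sketch of that idea, not necessarily the mechanism proposed in the paper above.

```python
import numpy as np

def q_update(Q, s, a, r, s_next, forbidden, alpha=0.1, gamma=0.99, penalty=0.1):
    """Tabular Q-learning step (Q is a states x actions array).

    A rejected (forbidden) action keeps the state but still produces a
    learning signal instead of being silently ignored.
    """
    if forbidden(s, a):
        target = -penalty + gamma * Q[s].max()   # self-loop with a small penalty
    else:
        target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])
```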

MERL: Multi-Head Reinforcement Learning

no code implementations26 Sep 2019 Yannis Flet-Berliac, Philippe Preux

In this paper: (a) We introduce and define MERL, the multi-head reinforcement learning framework we use throughout this work.

Continuous Control +4

Samples Are Useful? Not Always: denoising policy gradient updates using variance explained

no code implementations25 Sep 2019 Yannis Flet-Berliac, Philippe Preux

In this work, Vex (the fraction of variance explained) is used to evaluate the impact each transition will have on learning: this criterion refines sampling and improves the policy gradient algorithm.

Continuous Control +1

Only Relevant Information Matters: Filtering Out Noisy Samples to Boost RL

no code implementations8 Apr 2019 Yannis Flet-Berliac, Philippe Preux

In this work, we use this metric to select samples that are useful to learn from, and we demonstrate that this selection can significantly improve the performance of policy gradient methods.

Continuous Control +3
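A concrete, if simplified, reading of "selecting samples that are useful to learn from" is to rank transitions by how badly the critic explains their return and keep only a fraction of the batch before the policy gradient step. The criterion below is an assumption for illustration; the actual Vex-based selection rule is defined in the paper.

```python
import numpy as np

def filter_batch(states, returns, values, keep=0.8):
    """Keep the `keep` fraction of transitions with the largest critic residual
    (illustrative criterion; the paper defines its own Vex-based rule)."""
    residual = np.abs(np.asarray(returns) - np.asarray(values))
    k = max(1, int(keep * len(residual)))
    idx = np.argsort(-residual)[:k]              # largest residuals first
    return [states[i] for i in idx], idx
```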

Visual Reasoning with Multi-hop Feature Modulation

1 code implementation ECCV 2018 Florian Strub, Mathieu Seurin, Ethan Perez, Harm de Vries, Jérémie Mary, Philippe Preux, Aaron Courville, Olivier Pietquin

Recent breakthroughs in computer vision and natural language processing have spurred interest in challenging multi-modal tasks such as visual question-answering and visual dialogue.

Question Answering Visual Dialog +2
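The feature modulation in this entry's title refers to FiLM (feature-wise linear modulation): a conditioning network predicts a per-channel scale and shift that is applied to the visual pathway, and the paper's contribution is performing this over multiple hops of the linguistic input. Below is a minimal single-hop FiLM layer in PyTorch; the multi-hop mechanism is not reproduced, and the sizes are illustrative.

```python
import torch
import torch.nn as nn

class FiLMBlock(nn.Module):
    """Modulate conv features with (gamma, beta) predicted from a context vector."""

    def __init__(self, context_dim, channels):
        super().__init__()
        self.to_gamma_beta = nn.Linear(context_dim, 2 * channels)

    def forward(self, feats, context):
        # feats: (B, C, H, W); context: (B, context_dim), e.g. a question encoding
        gamma, beta = self.to_gamma_beta(context).chunk(2, dim=-1)
        gamma = gamma[:, :, None, None]          # broadcast over spatial dims
        beta = beta[:, :, None, None]
        return gamma * feats + beta

film = FiLMBlock(context_dim=16, channels=8)
out = film(torch.randn(2, 8, 4, 4), torch.randn(2, 16))
print(out.shape)                                 # torch.Size([2, 8, 4, 4])
```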

Recurrent Neural Networks for Long and Short-Term Sequential Recommendation

no code implementations23 Jul 2018 Kiewan Villatel, Elena Smirnova, Jérémie Mary, Philippe Preux

Recommender system objectives can be broadly characterized as modeling user preferences over a short- or long-term time horizon.

Dimensionality Reduction Sequential Recommendation +1
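A generic RNN recommender of the kind discussed in this entry consumes the sequence of item IDs a user has interacted with and predicts a distribution over the next item; long- versus short-term preference is a matter of how much history the recurrent state retains. Below is a minimal GRU sketch with generic architecture and sizes, not the paper's model.

```python
import torch
import torch.nn as nn

class GRURecommender(nn.Module):
    def __init__(self, n_items, emb_dim=32, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(n_items, emb_dim)
        self.gru = nn.GRU(emb_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_items)   # scores over the item catalog

    def forward(self, item_seq):
        # item_seq: (B, T) integer item IDs, oldest interaction first
        h, _ = self.gru(self.embed(item_seq))
        return self.head(h[:, -1])               # next-item logits from the last step

model = GRURecommender(n_items=1000)
logits = model(torch.randint(0, 1000, (4, 12)))
print(logits.shape)                              # torch.Size([4, 1000])
```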

Operator-valued Kernels for Learning from Functional Response Data

no code implementations28 Oct 2015 Hachem Kadri, Emmanuel Duflos, Philippe Preux, Stéphane Canu, Alain Rakotomamonjy, Julien Audiffren

In this paper, we consider the problems of supervised classification and regression in the case where attributes and labels are functions: each data point is represented by a set of functions, and the label is also a function.

Audio Signal Processing General Classification
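When the output is itself a function sampled on a grid, one standard construction uses a separable operator-valued kernel K(x, x') = k(x, x')·T, where T is an output kernel over grid points; ridge regression then reduces to the matrix equation K C T + λC = Y, solvable in closed form with two eigendecompositions. The NumPy sketch below implements that textbook construction, not the specific kernels studied in the paper.

```python
import numpy as np

def ovk_ridge_fit(K, T, Y, lam):
    """Solve K @ C @ T + lam * C = Y for C (separable operator-valued KRR).

    K: (n, n) input Gram matrix, T: (m, m) output kernel over the grid,
    Y: (n, m) functional labels sampled on the grid.
    """
    s, U = np.linalg.eigh(K)                     # K = U diag(s) U^T
    t, V = np.linalg.eigh(T)                     # T = V diag(t) V^T
    C_tilde = (U.T @ Y @ V) / (np.outer(s, t) + lam)
    return U @ C_tilde @ V.T

def ovk_predict(k_x, T, C):
    """f(x) = sum_i k(x, x_i) T c_i = T @ (C.T @ k_x), with k_x[i] = k(x, x_i)."""
    return T @ (C.T @ k_x)
```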

A Generative Model of Software Dependency Graphs to Better Understand Software Evolution

2 code implementations29 Oct 2014 Vincenzo Musco, Martin Monperrus, Philippe Preux

Then, we propose a generative model of software dependency graphs which synthesizes graphs whose degree distribution is close to the empirical ones observed in real software systems.

Software Engineering
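The paper's observation is that real dependency graphs have heavy-tailed degree distributions, and its generative model is evaluated by how closely synthesized degree distributions match empirical ones. The sketch below pairs a generic preferential-attachment generator with a degree-histogram measurement; it illustrates the comparison methodology only and is not the authors' model.

```python
import collections
import random

def preferential_attachment(n, m=2, seed=0):
    """Generic heavy-tailed graph generator (NOT the paper's model)."""
    random.seed(seed)
    targets, edges = list(range(m)), []
    for v in range(m, n):
        chosen = set()
        while len(chosen) < m:                   # degree-biased sampling pool
            chosen.add(random.choice(targets))
        edges += [(v, u) for u in chosen]
        targets += list(chosen) + [v] * m        # repeat nodes in proportion to degree
    return edges

def degree_histogram(edges):
    deg = collections.Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return collections.Counter(deg.values())     # {degree: number of nodes}

print(degree_histogram(preferential_attachment(1000)))
```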

Multiple Operator-valued Kernel Learning

no code implementations NeurIPS 2012 Hachem Kadri, Alain Rakotomamonjy, Philippe Preux, Francis R. Bach

We study this problem in the case of kernel ridge regression for functional responses with an ℓr-norm constraint on the combination coefficients.

regression
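The ℓr-norm constraint plays the same role here as in scalar-valued multiple kernel learning: it controls the sparsity of the kernel combination, interpolating between sparse (r = 1) and uniform (r → ∞) weightings. The standard template reads as follows; the notation is assumed and the paper's operator-valued details differ.

```latex
\min_{d \ge 0,\ \|d\|_r \le 1} \; \min_{f \in \mathcal{H}_{K(d)}}
\sum_{i=1}^{n} \big\| f(x_i) - y_i \big\|_{\mathcal{Y}}^{2}
+ \lambda \, \|f\|_{\mathcal{H}_{K(d)}}^{2},
\qquad K(d) = \sum_{m=1}^{M} d_m K_m .
```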

A Generalized Kernel Approach to Structured Output Learning

no code implementations10 May 2012 Hachem Kadri, Mohammad Ghavamzadeh, Philippe Preux

Finally, we evaluate the performance of our KDE approach using both covariance and conditional covariance kernels on two structured output problems, and compare it to the state-of-the-art kernel-based structured output regression methods.

regression
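KDE here is kernel dependency estimation: embed the structured outputs with a kernel map φ, learn a ridge map g from inputs to that embedding, and decode a prediction by solving a pre-image problem. Schematically, in the standard KDE formulation (the paper generalizes the covariance operators involved):

```latex
g = \arg\min_{g}\ \sum_{i=1}^{n} \|\, g(x_i) - \varphi(y_i) \,\|_{\mathcal{H}_Y}^{2}
+ \lambda \, \|g\|^{2},
\qquad
\hat{y}(x) = \arg\min_{y \in \mathcal{Y}}\ \|\, \varphi(y) - g(x) \,\|_{\mathcal{H}_Y}^{2}.
```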
