Search Results for author: Philippe Preux

Found 23 papers, 5 papers with code

Better state exploration using action sequence equivalence

no code implementations29 Sep 2021 Nathan Grinsztajn, Toby Johnstone, Johan Ferret, Philippe Preux

Incorporating prior knowledge into reinforcement learning algorithms remains largely an open question.


Low-Rank Projections of GCNs Laplacian

no code implementations ICLR Workshop GTRL 2021 Nathan Grinsztajn, Philippe Preux, Edouard Oyallon

In this work, we study the behavior of standard models for community detection under spectral manipulations.

Community Detection

Interferometric Graph Transform for Community Labeling

no code implementations4 Jun 2021 Nathan Grinsztajn, Louis Leconte, Philippe Preux, Edouard Oyallon

We present a new approach for learning unsupervised node representations in community graphs.

Don't Do What Doesn't Matter: Intrinsic Motivation with Action Usefulness

1 code implementation20 May 2021 Mathieu Seurin, Florian Strub, Philippe Preux, Olivier Pietquin

Sparse rewards are double-edged training signals in reinforcement learning: easy to design but hard to optimize.
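As a rough illustration of the action-usefulness idea (this is a generic sketch, not the paper's exact DoWhaM bonus; the class name and counters are hypothetical), one can reward an action more when it actually changes the observation, and more still when it rarely does so:

```python
from collections import defaultdict

# Illustrative sketch (not the paper's exact formulation): reward an action
# when it changes the observation, scaled by how rarely it has done so before.
class ActionUsefulnessBonus:
    def __init__(self, scale=1.0):
        self.used = defaultdict(int)       # times each action was taken
        self.effective = defaultdict(int)  # times it changed the observation
        self.scale = scale

    def __call__(self, action, changed_obs):
        past_used, past_eff = self.used[action], self.effective[action]
        # Bonus is largest for actions that rarely had an effect before.
        rarity = 1.0 - past_eff / past_used if past_used else 1.0
        self.used[action] += 1
        if changed_obs:
            self.effective[action] += 1
            return self.scale * rarity
        return 0.0

bonus = ActionUsefulnessBonus()
print(bonus("toggle", changed_obs=True))   # first effective use -> 1.0
```

Such a bonus densifies a sparse reward signal without changing which behaviors are ultimately optimal, which is the usual motivation for intrinsic motivation terms.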


Adversarially Guided Actor-Critic

1 code implementation ICLR 2021 Yannis Flet-Berliac, Johan Ferret, Olivier Pietquin, Philippe Preux, Matthieu Geist

Despite definite success in deep reinforcement learning problems, actor-critic algorithms are still confronted with sample inefficiency in complex environments, particularly in tasks where efficient exploration is a bottleneck.

Efficient Exploration

A spectral perspective on GCNs

no code implementations1 Jan 2021 Nathan Grinsztajn, Philippe Preux, Edouard Oyallon

In this work, we study the behavior of standard GCNs under spectral manipulations.

Geometric Deep Reinforcement Learning for Dynamic DAG Scheduling

1 code implementation9 Nov 2020 Nathan Grinsztajn, Olivier Beaumont, Emmanuel Jeannot, Philippe Preux

In this paper, we propose a reinforcement learning approach to solve a realistic scheduling problem, and apply it to an algorithm commonly executed in the high performance computing community, the Cholesky factorization.

Combinatorial Optimization reinforcement-learning
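For context, the classical baseline that learned schedulers of this kind are typically compared against is greedy list scheduling over the task DAG. A minimal sketch, assuming unit-cost workers and known task durations (the `list_schedule` helper and its data layout are illustrative, not the paper's method):

```python
# Greedy list-scheduling baseline (not the paper's RL policy): run DAG tasks
# on n_workers, starting each task once its dependencies have finished and a
# worker is free; returns the makespan.
def list_schedule(durations, deps, n_workers):
    """durations: {task: time}; deps: {task: set of prerequisite tasks}."""
    indeg = {t: len(deps.get(t, ())) for t in durations}
    children = {t: [] for t in durations}
    for t, ps in deps.items():
        for p in ps:
            children[p].append(t)
    ready = [t for t in durations if indeg[t] == 0]
    free_at = [0.0] * n_workers            # time each worker becomes free
    finish = {}
    while ready:
        dep_done = lambda x: max((finish[p] for p in deps.get(x, ())), default=0.0)
        task = min(ready, key=dep_done)    # pick the earliest-available task
        ready.remove(task)
        w = min(range(n_workers), key=free_at.__getitem__)
        start = max(free_at[w], dep_done(task))
        finish[task] = start + durations[task]
        free_at[w] = finish[task]
        for c in children[task]:           # unlock tasks whose deps are done
            indeg[c] -= 1
            if indeg[c] == 0:
                ready.append(c)
    return max(finish.values())
```

For a diamond DAG A→{B, C}→D with durations 1, 2, 2, 1, two workers give a makespan of 4 (B and C run in parallel), one worker gives 6.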

Learning Value Functions in Deep Policy Gradients using Residual Variance

no code implementations ICLR 2021 Yannis Flet-Berliac, Reda Ouhamma, Odalric-Ambrym Maillard, Philippe Preux

We prove the theoretical consistency of the new gradient estimator and observe dramatic empirical improvement across a variety of continuous control tasks and algorithms.

Continuous Control Decision Making

A Machine of Few Words -- Interactive Speaker Recognition with Reinforcement Learning

no code implementations7 Aug 2020 Mathieu Seurin, Florian Strub, Philippe Preux, Olivier Pietquin

To do so, we cast the speaker recognition task into a sequential decision-making problem that we solve with Reinforcement Learning.

Decision Making reinforcement-learning +2

I'm sorry Dave, I'm afraid I can't do that, Deep Q-learning from forbidden actions

no code implementations4 Oct 2019 Mathieu Seurin, Philippe Preux, Olivier Pietquin

Violating constraints thus results in rejected actions or in entering a safe mode driven by an external controller, making RL agents incapable of learning from their mistakes.

Industrial Robots Q-Learning
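One simple way to let an agent learn from a rejected action (a tabular sketch in the same spirit, not the paper's exact loss; `q_update` and the margin value are illustrative) is to push the Q-value of the forbidden action a margin below the best valid action instead of discarding the rejection signal:

```python
import numpy as np

# Illustrative tabular update: a rejected (forbidden) action is trained
# toward a target strictly below the best Q-value in that state, so the
# agent stops proposing it; valid actions use the usual Q-learning target.
def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99,
             rejected=False, margin=1.0):
    if rejected:
        target = Q[s].max() - margin       # keep forbidden actions worse
    else:
        target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])
    return Q
```

Repeated rejected updates drive Q[s, a] toward `Q[s].max() - margin`, so the forbidden action falls out of the greedy policy without an external controller.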

MERL: Multi-Head Reinforcement Learning

no code implementations26 Sep 2019 Yannis Flet-Berliac, Philippe Preux

In this paper: (a) We introduce and define MERL, the multi-head reinforcement learning framework we use throughout this work.

Continuous Control reinforcement-learning +1

Samples Are Useful? Not Always: denoising policy gradient updates using variance explained

no code implementations25 Sep 2019 Yannis Flet-Berliac, Philippe Preux

In this work, Vex is used to evaluate the impact each transition will have on learning: this criterion refines sampling and improves the policy gradient algorithm.

Continuous Control Denoising
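The Vex criterion named above is the standard variance-explained statistic applied to returns and value predictions; how it is plugged into the policy-gradient update follows the paper, but the metric itself is simply:

```python
import numpy as np

# Fraction of return variance captured by the value predictions:
# Vex = 1 - Var(returns - values) / Var(returns).
def variance_explained(returns, values):
    returns, values = np.asarray(returns), np.asarray(values)
    return 1.0 - np.var(returns - values) / np.var(returns)

print(variance_explained([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # perfect fit -> 1.0
```

Vex is 1.0 for a perfect value fit and 0.0 when the predictions explain none of the variance, which makes it a natural per-batch score for deciding which transitions are informative.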

Only Relevant Information Matters: Filtering Out Noisy Samples to Boost RL

no code implementations8 Apr 2019 Yannis Flet-Berliac, Philippe Preux

In this work, we use this metric to select samples that are useful to learn from, and we demonstrate that this selection can significantly improve the performance of policy gradient methods.

Continuous Control Denoising +1

Visual Reasoning with Multi-hop Feature Modulation

1 code implementation ECCV 2018 Florian Strub, Mathieu Seurin, Ethan Perez, Harm de Vries, Jérémie Mary, Philippe Preux, Aaron Courville, Olivier Pietquin

Recent breakthroughs in computer vision and natural language processing have spurred interest in challenging multi-modal tasks such as visual question-answering and visual dialogue.

Question Answering Visual Dialog +2

Recurrent Neural Networks for Long and Short-Term Sequential Recommendation

no code implementations23 Jul 2018 Kiewan Villatel, Elena Smirnova, Jérémie Mary, Philippe Preux

Recommender systems objectives can be broadly characterized as modeling user preferences over short- or long-term time horizons.

Dimensionality Reduction Sequential Recommendation +1
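A minimal numpy sketch of the recurrent-recommendation setup (illustrative only, with random weights and a toy catalog; the architecture and sizes are not the paper's): a GRU reads the session's item sequence and its final hidden state scores every catalog item as the next-item prediction.

```python
import numpy as np

rng = np.random.default_rng(0)

n_items, d = 10, 8
E = rng.normal(size=(n_items, d))            # item embeddings
Wz, Wr, Wh = (rng.normal(size=(2 * d, d)) for _ in range(3))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h, x):
    hx = np.concatenate([h, x])
    z = sigmoid(hx @ Wz)                     # update gate
    r = sigmoid(hx @ Wr)                     # reset gate
    h_tilde = np.tanh(np.concatenate([r * h, x]) @ Wh)
    return (1 - z) * h + z * h_tilde

def next_item_scores(session):
    h = np.zeros(d)
    for item in session:                     # fold the session into h
        h = gru_step(h, E[item])
    return E @ h                             # one score per catalog item

scores = next_item_scores([3, 1, 4])
print(int(np.argmax(scores)))                # top-ranked next item
```

The hidden state carries short-term session context, while longer-term preferences would come from training the embeddings and gates on full user histories.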

Operator-valued Kernels for Learning from Functional Response Data

no code implementations28 Oct 2015 Hachem Kadri, Emmanuel Duflos, Philippe Preux, Stéphane Canu, Alain Rakotomamonjy, Julien Audiffren

In this paper we consider the problems of supervised classification and regression in the case where attributes and labels are functions: a data point is represented by a set of functions, and the label is also a function.

Audio Signal Processing General Classification

A Generative Model of Software Dependency Graphs to Better Understand Software Evolution

2 code implementations29 Oct 2014 Vincenzo Musco, Martin Monperrus, Philippe Preux

Then, we propose a generative model of software dependency graphs which synthesizes graphs whose degree distribution is close to the empirical ones observed in real software systems.

Software Engineering
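To illustrate how a generative graph model can reproduce the empirical degree distributions mentioned above (a generic preferential-attachment sketch, not the paper's generator; `generate` and its parameters are hypothetical):

```python
import random

random.seed(0)

# Preferential attachment: each new node links to m existing nodes chosen
# with probability proportional to their current degree, which yields the
# heavy-tailed degree distributions seen in real dependency graphs.
def generate(n_nodes, m=2):
    pool = [0, 1]                    # each node appears once per unit degree
    edges = [(0, 1)]
    for v in range(2, n_nodes):
        chosen = {random.choice(pool) for _ in range(m)}  # degree-weighted
        for u in chosen:
            edges.append((v, u))
            pool.extend([v, u])
    return edges
```

Comparing the synthetic degree distribution against the one measured on a real dependency graph is the usual way such a model is validated.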

Multiple Operator-valued Kernel Learning

no code implementations NeurIPS 2012 Hachem Kadri, Alain Rakotomamonjy, Philippe Preux, Francis R. Bach

We study this problem in the case of kernel ridge regression for functional responses with an ℓr-norm constraint on the combination coefficients.

A Generalized Kernel Approach to Structured Output Learning

no code implementations10 May 2012 Hachem Kadri, Mohammad Ghavamzadeh, Philippe Preux

Finally, we evaluate the performance of our KDE approach using both covariance and conditional covariance kernels on two structured output problems, and compare it to the state-of-the-art kernel-based structured output regression methods.
