Search Results for author: Valentin Thomas

Found 10 papers, 1 paper with code

In-Context Data Distillation with TabPFN

no code implementations • 10 Feb 2024 • Junwei Ma, Valentin Thomas, Guangwei Yu, Anthony Caterini

Foundation models have revolutionized tasks in computer vision and natural language processing.

In-Context Learning

The Role of Baselines in Policy Gradient Optimization

no code implementations • 16 Jan 2023 • Jincheng Mei, Wesley Chung, Valentin Thomas, Bo Dai, Csaba Szepesvari, Dale Schuurmans

Instead, the analysis reveals that the primary effect of the value baseline is to reduce the aggressiveness of the updates rather than their variance.
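A minimal illustrative sketch (hypothetical softmax-bandit setup and REINFORCE estimator, not taken from the paper) of how subtracting a value baseline shrinks the size of the policy-gradient updates, the "aggressiveness" the abstract refers to:

```python
import numpy as np

rng = np.random.default_rng(0)

# Softmax policy over 3 arms; single-sample REINFORCE gradient.
theta = np.zeros(3)
true_rewards = np.array([1.0, 0.5, 0.0])

def grad_log_pi(theta, a):
    """Gradient of log softmax(theta) w.r.t. theta, evaluated at action a."""
    pi = np.exp(theta - theta.max())
    pi /= pi.sum()
    g = -pi
    g[a] += 1.0
    return g

def reinforce_update(theta, baseline):
    """One-sample policy-gradient estimate with a baseline subtracted."""
    pi = np.exp(theta - theta.max()); pi /= pi.sum()
    a = rng.choice(3, p=pi)
    r = true_rewards[a] + rng.normal(scale=0.1)
    return (r - baseline) * grad_log_pi(theta, a)

# Compare the average update magnitude with no baseline vs. a value baseline
# (the expected reward under the current, uniform, policy).
value_baseline = float(np.ones(3) / 3 @ true_rewards)
for b, name in [(0.0, "no baseline"), (value_baseline, "value baseline")]:
    norms = [np.linalg.norm(reinforce_update(theta, b)) for _ in range(10_000)]
    print(f"{name:>14}: mean update norm = {np.mean(norms):.3f}")
```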

Bridging the Gap Between Target Networks and Functional Regularization

no code implementations • 21 Oct 2022 • Alexandre Piché, Valentin Thomas, Joseph Marino, Rafael Pardinas, Gian Maria Marconi, Christopher Pal, Mohammad Emtiyaz Khan

However, learning the value function via bootstrapping often leads to unstable training due to fast-changing target values.
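For context, a minimal sketch of the bootstrapped-target setup the abstract refers to: a hypothetical linear Q-function whose TD targets are computed from a slowly updated (Polyak-averaged) copy, so the targets do not chase the online parameters. Names and constants are illustrative, not the paper's.

```python
import numpy as np

n_states, n_actions, gamma, tau = 4, 2, 0.99, 0.005
w_online = np.zeros((n_actions, n_states))
w_target = w_online.copy()            # frozen copy used only for targets

def td_target(r, s_next, done):
    """Bootstrapped target computed with the slow target parameters."""
    q_next = w_target @ s_next
    return r + gamma * (0.0 if done else q_next.max())

def update(s, a, r, s_next, done, lr=0.1):
    global w_target
    td_error = td_target(r, s_next, done) - w_online[a] @ s
    w_online[a] += lr * td_error * s   # semi-gradient step on the online weights
    # Polyak averaging keeps the targets slowly moving: the source of the
    # stability-vs-staleness trade-off discussed in the abstract.
    w_target = (1 - tau) * w_target + tau * w_online

# One illustrative transition.
s, s_next = np.eye(n_states)[0], np.eye(n_states)[1]
update(s, a=0, r=1.0, s_next=s_next, done=False)
```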

Bridging the Gap Between Target Networks and Functional Regularization

1 code implementation • 4 Jun 2021 • Alexandre Piché, Valentin Thomas, Rafael Pardinas, Joseph Marino, Gian Maria Marconi, Christopher Pal, Mohammad Emtiyaz Khan

Our findings emphasize that Functional Regularization can be used as a drop-in replacement for Target Networks and result in performance improvement.

Q-Learning
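A hedged sketch of the general idea behind the entry above: bootstrap from the online network, but penalize its predictions for drifting from a lagging network's predictions on the visited states. The names, the regularization weight, and the exact form are assumptions; the paper's formulation may differ.

```python
import numpy as np

gamma, kappa, lr = 0.99, 1.0, 0.1      # kappa: assumed regularization weight
w = np.zeros((2, 4))                   # online linear Q-function
w_prior = w.copy()                     # lagging network, constrained in *function space*

def fr_update(s, a, r, s_next, done):
    """TD step where, instead of bootstrapping from a frozen target network,
    the online network is regularized toward the lagging network's predicted
    value on the visited state (functional regularization)."""
    bootstrap = r + gamma * (0.0 if done else (w @ s_next).max())
    td_error = bootstrap - w[a] @ s
    drift = w[a] @ s - w_prior[a] @ s          # difference in predicted values
    w[a] += lr * (td_error - kappa * drift) * s

def refresh_prior():
    """Periodically snapshot the online weights as the new lagging network."""
    global w_prior
    w_prior = w.copy()

# One illustrative transition followed by a prior refresh.
s, s_next = np.eye(4)[0], np.eye(4)[1]
fr_update(s, a=0, r=1.0, s_next=s_next, done=False)
refresh_prior()
```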

Beyond variance reduction: Understanding the true impact of baselines on policy optimization

no code implementations • 31 Aug 2020 • Wesley Chung, Valentin Thomas, Marlos C. Machado, Nicolas Le Roux

Traditionally, stochastic optimization theory predicts that learning dynamics are governed by the curvature of the loss function and the noise of the gradient estimates.

Reinforcement Learning (RL) • Stochastic Optimization
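For reference, the "traditional" picture the abstract alludes to can be stated informally as the standard one-step SGD descent inequality, in which progress is controlled only by the curvature constant L and the gradient-noise level σ²; the notation below is assumed background, not taken from the paper.

```latex
% One SGD step x_{t+1} = x_t - \eta g_t with \mathbb{E}[g_t] = \nabla f(x_t),
% \operatorname{Var}[g_t] \le \sigma^2, and f assumed L-smooth:
\mathbb{E}\!\left[f(x_{t+1})\right]
  \le f(x_t) - \eta\left(1 - \tfrac{L\eta}{2}\right)\|\nabla f(x_t)\|^2
      + \frac{L\eta^2\sigma^2}{2}
```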

On the interplay between noise and curvature and its effect on optimization and generalization

no code implementations • 18 Jun 2019 • Valentin Thomas, Fabian Pedregosa, Bart van Merriënboer, Pierre-Antoine Manzagol, Yoshua Bengio, Nicolas Le Roux

The speed at which one can minimize an expected loss using stochastic methods depends on two properties: the curvature of the loss and the variance of the gradients.
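A small illustrative sketch (hypothetical noisy quadratic, not from the paper) of the two quantities the abstract names, local curvature and gradient variance, measured at a single point:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical noisy quadratic: f(x) = 0.5 * x^T H x, observed through
# minibatch gradients g = H x + noise.
H = np.diag([10.0, 1.0, 0.1])          # curvature along each direction
noise_std = 0.5

def minibatch_grad(x):
    return H @ x + noise_std * rng.normal(size=x.shape)

x = np.array([1.0, 1.0, 1.0])
grads = np.stack([minibatch_grad(x) for _ in range(1000)])

curvature = np.linalg.eigvalsh(H)        # known here; estimated in practice
grad_variance = grads.var(axis=0).sum()  # total gradient noise at x

print("curvature spectrum:", curvature)
print("total gradient variance:", grad_variance)
# The larger either quantity (sharp directions, noisy gradients), the smaller
# the usable step size and the slower the expected progress.
```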

Probabilistic Planning with Sequential Monte Carlo methods

no code implementations • ICLR 2019 • Alexandre Piché, Valentin Thomas, Cyril Ibrahim, Yoshua Bengio, Chris Pal

In this work, we propose a novel formulation of planning which views it as a probabilistic inference problem over future optimal trajectories.

Continuous Control
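A minimal sequential-importance-resampling planner in the spirit of the formulation above, on a hypothetical 1-D toy problem; the exp(reward) weighting, dynamics, and all names are illustrative assumptions rather than the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(2)

def step(s, a):              # deterministic toy dynamics
    return s + a

def reward(s, a):            # "optimality" log-likelihood, as in control-as-inference
    return -s**2 - 0.1 * a**2

def smc_plan(s0, horizon=10, n_particles=256, action_std=0.5):
    """Plan by treating optimal trajectories as a posterior: propagate action
    particles, weight them by exp(reward), and resample at every step."""
    states = np.full(n_particles, s0, dtype=float)
    first_actions = None
    for t in range(horizon):
        actions = rng.normal(scale=action_std, size=n_particles)   # proposal
        if first_actions is None:
            first_actions = actions.copy()
        next_states = step(states, actions)
        logw = reward(states, actions)
        w = np.exp(logw - logw.max())
        w /= w.sum()
        idx = rng.choice(n_particles, size=n_particles, p=w)        # resample
        states, first_actions = next_states[idx], first_actions[idx]
    return first_actions.mean()   # executed action: average over surviving particles

print("chosen first action from s0=2.0:", smc_plan(2.0))
```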

Independently Controllable Factors

no code implementations • 3 Aug 2017 • Valentin Thomas, Jules Pondard, Emmanuel Bengio, Marc Sarfati, Philippe Beaudoin, Marie-Jean Meurs, Joelle Pineau, Doina Precup, Yoshua Bengio

It has been postulated that a good representation is one that disentangles the underlying explanatory factors of variation.

Open-Ended Question Answering

Independently Controllable Features

no code implementations • 22 Mar 2017 • Emmanuel Bengio, Valentin Thomas, Joelle Pineau, Doina Precup, Yoshua Bengio

Finding features that disentangle the different causes of variation in real data is a difficult task that has nonetheless received considerable attention in static domains like natural images.
