Search Results for author: Michel Ma

Found 3 papers, 2 papers with code

Do Transformer World Models Give Better Policy Gradients?

no code implementations • 7 Feb 2024 • Michel Ma, Tianwei Ni, Clement Gehring, Pierluca D'Oro, Pierre-Luc Bacon

We integrate such AWMs into a policy gradient framework that underscores the relationship between network architectures and the policy gradient updates they inherently represent.

Navigate

Bridging State and History Representations: Understanding Self-Predictive RL

1 code implementation • 17 Jan 2024 • Tianwei Ni, Benjamin Eysenbach, Erfan Seyedsalehi, Michel Ma, Clement Gehring, Aditya Mahajan, Pierre-Luc Bacon

These findings culminate in a set of preliminary guidelines for RL practitioners.

Reinforcement Learning (RL), Representation Learning

When Do Transformers Shine in RL? Decoupling Memory from Credit Assignment

2 code implementations • NeurIPS 2023 • Tianwei Ni, Michel Ma, Benjamin Eysenbach, Pierre-Luc Bacon

The Transformer architecture has been very successful at solving problems that involve long-term dependencies, including in the RL domain.

Reinforcement Learning (RL)
