Search Results for author: Corentin Tallec

Found 20 papers, 10 papers with code

Curiosity in Hindsight: Intrinsic Exploration in Stochastic Environments

no code implementations 18 Nov 2022 Daniel Jarrett, Corentin Tallec, Florent Altché, Thomas Mesnard, Rémi Munos, Michal Valko

In this work, we study a natural solution derived from structural causal models of the world: our key idea is to learn representations of the future that capture precisely the unpredictable aspects of each outcome, which we use as additional input for predictions, such that intrinsic rewards only reflect the predictable aspects of world dynamics.

Montezuma's Revenge

Emergent Communication: Generalization and Overfitting in Lewis Games

1 code implementation 30 Sep 2022 Mathieu Rita, Corentin Tallec, Paul Michel, Jean-Bastien Grill, Olivier Pietquin, Emmanuel Dupoux, Florian Strub

Lewis signaling games are a class of simple communication games for simulating the emergence of language.

Large-Scale Representation Learning on Graphs via Bootstrapping

4 code implementations ICLR 2022 Shantanu Thakoor, Corentin Tallec, Mohammad Gheshlaghi Azar, Mehdi Azabou, Eva L. Dyer, Rémi Munos, Petar Veličković, Michal Valko

To address these challenges, we introduce Bootstrapped Graph Latents (BGRL) - a graph representation learning method that learns by predicting alternative augmentations of the input.

Contrastive Learning, Graph Representation Learning

Learning Successor States and Goal-Dependent Values: A Mathematical Viewpoint

no code implementations 18 Jan 2021 Léonard Blier, Corentin Tallec, Yann Ollivier

In reinforcement learning, temporal difference-based algorithms can be sample-inefficient: for instance, with sparse rewards, no learning occurs until a reward is observed.

Making Deep Q-learning methods robust to time discretization

1 code implementation 28 Jan 2019 Corentin Tallec, Léonard Blier, Yann Ollivier

Despite remarkable successes, Deep Reinforcement Learning (DRL) is not robust to hyperparameterization, implementation details, or small environment changes (Henderson et al. 2017, Zhang et al. 2018).

Q-Learning

Mixed batches and symmetric discriminators for GAN training

no code implementations ICML 2018 Thomas Lucas, Corentin Tallec, Jakob Verbeek, Yann Ollivier

We propose to feed the discriminator with mixed batches of true and fake samples, and train it to predict the ratio of true samples in the batch.
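
As a rough illustration of the mixed-batch idea in this abstract, the sketch below builds one batch containing both true and fake samples, together with the fraction of true samples that the discriminator would be trained to predict. The function name `make_mixed_batch`, the uniform draw of the number of true samples, and the tagged tuples standing in for data are all assumptions made for illustration; the abstract does not specify how the ratio is chosen, and no discriminator or loss is shown here.

```python
import random


def make_mixed_batch(real_samples, fake_samples, batch_size, rng):
    """Return (batch, target_ratio) for one mixed batch.

    Hypothetical helper: the number of true samples is drawn uniformly
    from 0..batch_size, which is an assumption, not the paper's scheme.
    """
    n_real = rng.randint(0, batch_size)  # inclusive on both ends
    n_fake = batch_size - n_real
    # Sample without replacement from each pool, then mix the order.
    batch = rng.sample(real_samples, n_real) + rng.sample(fake_samples, n_fake)
    rng.shuffle(batch)
    # Regression target for the discriminator: fraction of true samples.
    target_ratio = n_real / batch_size
    return batch, target_ratio


rng = random.Random(0)
real_pool = [("real", i) for i in range(16)]  # placeholder true samples
fake_pool = [("fake", i) for i in range(16)]  # placeholder generated samples
batch, target_ratio = make_mixed_batch(real_pool, fake_pool, 8, rng)
```

In an actual training loop, the discriminator would take the whole batch as input and its single output would be regressed against `target_ratio`.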

Can recurrent neural networks warp time?

1 code implementation ICLR 2018 Corentin Tallec, Yann Ollivier

Successful recurrent models such as long short-term memories (LSTMs) and gated recurrent units (GRUs) use ad hoc gating mechanisms.

Unbiasing Truncated Backpropagation Through Time

no code implementations ICLR 2018 Corentin Tallec, Yann Ollivier

Truncated BPTT keeps the computational benefits of Backpropagation Through Time (BPTT) while relieving the need for a complete backtrack through the whole data sequence at every step.

Language Modelling

Unbiased Online Recurrent Optimization

1 code implementation ICLR 2018 Corentin Tallec, Yann Ollivier

The novel Unbiased Online Recurrent Optimization (UORO) algorithm allows for online learning of general recurrent computational graphs such as recurrent network models.

Training recurrent networks online without backtracking

no code implementations 28 Jul 2015 Yann Ollivier, Corentin Tallec, Guillaume Charpiat

The evolution of this search direction is partly stochastic and is constructed in such a way to provide, at every time, an unbiased random estimate of the gradient of the loss function with respect to the parameters.
