1 code implementation • ICLR 2021 • Jacob Buckman, Carles Gelada, Marc G. Bellemare
To avoid this, algorithms can follow the pessimism principle, which states that we should choose the policy that acts optimally in the worst possible world.
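As a rough illustration of the pessimism principle stated above, the sketch below picks the policy whose worst-case return is highest (a maximin choice). It is not the paper's algorithm: the finite set of candidate "worlds", the `returns` table, and the `pessimistic_policy` helper are all hypothetical names and numbers chosen for this example.

```python
# Hypothetical table: one row per candidate policy, one column per candidate
# "world" (an environment model assumed to be consistent with the data).
# Each entry is the expected return of that policy in that world.
returns = [
    [10.0, 1.0, 4.0],  # policy 0: great in world 0, poor in world 1
    [6.0, 5.0, 7.0],   # policy 1: moderate everywhere
    [8.0, 3.0, 9.0],   # policy 2: best on average, weak in world 1
]


def pessimistic_policy(returns):
    """Return (index of the maximin policy, worst-case return of each policy)."""
    # Worst possible world for each policy.
    worst_case = [min(row) for row in returns]
    # Choose the policy that does best in its own worst world.
    best = max(range(len(worst_case)), key=worst_case.__getitem__)
    return best, worst_case


best, worst_case = pessimistic_policy(returns)
print(best, worst_case)
```

In this made-up table, policy 2 has the highest average return, but the pessimistic choice is policy 1 because its worst case is the least bad.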
no code implementations • 6 Jun 2019 • Carles Gelada, Saurabh Kumar, Jacob Buckman, Ofir Nachum, Marc G. Bellemare
We show that the optimization of these objectives guarantees (1) the quality of the latent space as a representation of the state space and (2) the quality of the DeepMDP as a model of the environment.
13 code implementations • ICLR 2020 • Eleni Triantafillou, Tyler Zhu, Vincent Dumoulin, Pascal Lamblin, Utku Evci, Kelvin Xu, Ross Goroshin, Carles Gelada, Kevin Swersky, Pierre-Antoine Manzagol, Hugo Larochelle
Few-shot classification refers to learning a classifier for new classes given only a few examples.
Ranked #7 on Few-Shot Image Classification on Meta-Dataset Rank
1 code implementation • ICLR 2020 • William Fedus, Carles Gelada, Yoshua Bengio, Marc G. Bellemare, Hugo Larochelle
Reinforcement learning (RL) typically defines a discount factor as part of the Markov Decision Process.
no code implementations • 27 Jan 2019 • Carles Gelada, Marc G. Bellemare
We complement our analysis with an empirical evaluation of the two techniques in an off-policy setting on the Atari game Pong, where we find discounted COP-TD to be better behaved in practice than the soft normalization penalty.
12 code implementations • 14 Dec 2018 • Pablo Samuel Castro, Subhodeep Moitra, Carles Gelada, Saurabh Kumar, Marc G. Bellemare
Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.