1 code implementation • 14 Nov 2022 • Adam Khakhar, Jacob Buckman
In this work, we demonstrate that a major limitation of regression using a mean-squared error loss is its sensitivity to the scale of its targets.
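A minimal numpy sketch (my own illustration, not code from the paper) of that scale sensitivity: rescaling the regression targets by a constant rescales the mean-squared error roughly quadratically, and the gradient at a fixed initialization linearly, so the effective learning dynamics change with target scale.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 1))
y = x @ np.array([[2.0]]) + 0.1 * rng.normal(size=(100, 1))

def mse_and_grad(w, x, y):
    # Mean-squared error and its gradient with respect to the weights.
    err = x @ w - y
    return np.mean(err ** 2), 2 * x.T @ err / len(x)

for scale in [1.0, 10.0, 100.0]:
    # Same data, targets rescaled: loss grows ~quadratically, gradient ~linearly.
    loss, grad = mse_and_grad(np.zeros((1, 1)), x, scale * y)
    print(f"target scale {scale:6.0f}: loss {loss:12.3f}  |grad| {np.abs(grad).max():10.3f}")
```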
1 code implementation • 2 Jun 2022 • David Brandfonbrener, Alberto Bietti, Jacob Buckman, Romain Laroche, Joan Bruna
Several recent works have proposed a class of algorithms for the offline reinforcement learning (RL) problem that we will refer to as return-conditioned supervised learning (RCSL).
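A rough sketch of the RCSL recipe as summarized here (the class name and architecture below are placeholders of mine, not the paper's): condition a policy on the return-to-go observed in the offline dataset and fit it to the logged actions with an ordinary supervised loss.

```python
import torch
import torch.nn as nn

class ReturnConditionedPolicy(nn.Module):
    """Maps (state, return-to-go) to action logits."""
    def __init__(self, state_dim, n_actions, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state, return_to_go):
        return self.net(torch.cat([state, return_to_go], dim=-1))

def rcsl_loss(policy, states, actions, returns_to_go):
    # Plain supervised learning: cross-entropy on the dataset's actions,
    # conditioned on the return actually obtained from each state.
    logits = policy(states, returns_to_go)
    return nn.functional.cross_entropy(logits, actions)  # actions: long class indices
```

At evaluation time the same network is queried with a high target return rather than an observed one.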
no code implementations • 27 May 2022 • Romain Laroche, Remi Tachet des Combes, Jacob Buckman
A central object of study in Reinforcement Learning (RL) is the Markovian policy, in which an agent's actions are chosen from a memoryless probability distribution, conditioned only on its current state.
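A toy illustration (mine, not from the paper) of what "Markovian" means operationally: the action distribution is a function of the current state alone, with no dependence on the history of past states or actions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 4, 3

# A Markovian policy is just a per-state distribution over actions;
# it has no memory of how the agent reached the current state.
pi = rng.dirichlet(np.ones(n_actions), size=n_states)

def act(state):
    # The sampled action depends only on the current state.
    return int(rng.choice(n_actions, p=pi[state]))

print([act(s) for s in [0, 1, 2, 3, 0]])
```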
1 code implementation • ICLR 2021 • Jacob Buckman, Carles Gelada, Marc G. Bellemare
To avoid erroneous value overestimation, algorithms can follow the pessimism principle, which states that we should choose the policy which acts optimally in the worst possible world.
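One common way to instantiate pessimism is an ensemble lower bound; the sketch below is an assumption of mine for illustration, not necessarily the paper's algorithm. Each candidate action is scored by its worst-case value over an ensemble of estimates, and the agent acts greedily with respect to that lower bound.

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions, ensemble_size = 5, 8

# Q[i, a]: estimated value of action a under the i-th ensemble member
# (e.g. each trained on a bootstrapped subset of the fixed dataset).
Q = rng.normal(loc=1.0, scale=0.5, size=(ensemble_size, n_actions))

naive_action = Q.mean(axis=0).argmax()        # trust the mean estimate
pessimistic_action = Q.min(axis=0).argmax()   # act well in the worst case

print("naive choice:", naive_action, "pessimistic choice:", pessimistic_action)
```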
no code implementations • 6 Jun 2019 • Carles Gelada, Saurabh Kumar, Jacob Buckman, Ofir Nachum, Marc G. Bellemare
We show that the optimization of these objectives (prediction of rewards and prediction of the distribution over next latent states) guarantees (1) the quality of the latent space as a representation of the state space and (2) the quality of the DeepMDP as a model of the environment.
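A hedged sketch of those two latent-space losses, as I read them (module shapes and names are placeholders): predict the reward and the encoding of the next state from the current latent state and the action.

```python
import torch
import torch.nn as nn

state_dim, latent_dim, action_dim = 32, 16, 4
encoder = nn.Linear(state_dim, latent_dim)                         # state -> latent
reward_head = nn.Linear(latent_dim + action_dim, 1)                # latent reward model
transition_head = nn.Linear(latent_dim + action_dim, latent_dim)   # latent dynamics model

def deepmdp_losses(state, action, reward, next_state):
    z, z_next = encoder(state), encoder(next_state)
    za = torch.cat([z, action], dim=-1)
    # (1) predict the observed reward from the latent state and action
    reward_loss = nn.functional.mse_loss(reward_head(za).squeeze(-1), reward)
    # (2) predict the encoding of the next state (latent transition model)
    transition_loss = nn.functional.mse_loss(transition_head(za), z_next.detach())
    return reward_loss, transition_loss
```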
2 code implementations • NeurIPS 2018 • Jacob Buckman, Danijar Hafner, George Tucker, Eugene Brevdo, Honglak Lee
Integrating model-free and model-based approaches in reinforcement learning has the potential to achieve the high performance of model-free algorithms with low sample complexity.
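One standard way of combining the two, sketched here with made-up callables (the paper's estimator may differ in its details): roll a learned dynamics model forward for a few steps to accumulate short-horizon rewards, then bootstrap with a model-free value function at the tail.

```python
def model_based_value_expansion(state, model, reward_fn, policy, value_fn,
                                horizon=3, gamma=0.99):
    """H-step model-based value expansion target.

    `model`, `reward_fn`, `policy`, and `value_fn` are assumed to be learned
    components passed in as callables; this is a sketch, not the paper's exact method.
    """
    ret, discount, s = 0.0, 1.0, state
    for _ in range(horizon):
        a = policy(s)
        ret += discount * reward_fn(s, a)   # model-based short-term rewards
        s = model(s, a)                     # imagined next state
        discount *= gamma
    return ret + discount * value_fn(s)     # model-free bootstrap at the tail
```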
1 code implementation • TACL 2018 • Jacob Buckman, Graham Neubig
In this work, we propose a new language modeling paradigm that has the ability to perform both prediction and moderation of information flow at multiple granularities: neural lattice language models.
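A heavily simplified toy version of the lattice idea (the scorer below is a placeholder, not the paper's neural model): the probability of a sentence sums over all paths through a lattice whose edges emit one or more tokens at a time, computed with a forward recursion.

```python
import math

def logaddexp(a, b):
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def lattice_log_prob(tokens, edge_log_prob, max_span=3):
    # alpha[i] = log-probability of all lattice paths covering tokens[:i];
    # edge_log_prob(prefix, segment) is a stand-in for a neural scorer.
    n = len(tokens)
    alpha = [-math.inf] * (n + 1)
    alpha[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_span), end):
            segment = tuple(tokens[start:end])  # an edge covering one or more tokens
            alpha[end] = logaddexp(alpha[end],
                                   alpha[start] + edge_log_prob(tokens[:start], segment))
    return alpha[n]
```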
no code implementations • ICML 2018 • Augustus Odena, Jacob Buckman, Catherine Olsson, Tom B. Brown, Christopher Olah, Colin Raffel, Ian Goodfellow
Motivated by recent work suggesting that controlling the entire distribution of Jacobian singular values is an important design consideration in deep learning, we study the distribution of singular values of the Jacobian of the generator in Generative Adversarial Networks (GANs).
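A small sketch of the quantity in question for a toy generator (the architecture is a placeholder of mine, not the paper's setup): compute the Jacobian of the generator output with respect to the latent code and inspect its singular values.

```python
import torch

generator = torch.nn.Sequential(
    torch.nn.Linear(8, 32), torch.nn.ReLU(), torch.nn.Linear(32, 64),
)

z = torch.randn(8)  # latent code
# Jacobian of the generator output with respect to z: shape (64, 8).
J = torch.autograd.functional.jacobian(generator, z)
print(torch.linalg.svdvals(J))  # the spectrum whose distribution is studied
```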
no code implementations • ICLR 2018 • Jacob Buckman, Aurko Roy, Colin Raffel, Ian Goodfellow
It is well known that it is possible to construct "adversarial examples" for neural networks: inputs which are misclassified by the network yet indistinguishable from true data.
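For illustration, a minimal version of the standard fast gradient sign method (FGSM) construction of such examples; this is the generic attack, not the defense this paper proposes.

```python
import torch

def fgsm_example(model, x, label, eps=0.03):
    """Perturb a batched input x in the direction of the sign of the loss gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = torch.nn.functional.cross_entropy(model(x), label)
    loss.backward()
    # A small, sign-bounded perturbation: visually near-identical to x,
    # yet often enough to change the network's prediction.
    return (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()
```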