Data efficiency is a key challenge for deep reinforcement learning.
In this paper, we transform each view into a set of subviews and then decompose the original MI bound into a sum of bounds involving conditional MI between the subviews.
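The decomposition rests on the chain rule of mutual information: if one view $v_2$ is split into subviews $v_2^{(1)}, \dots, v_2^{(n)}$, the total MI factors exactly into a sum of conditional terms (this is the general identity; the paper's specific bound on each term is not reproduced here):

```latex
I\bigl(v_1; v_2\bigr)
  = I\bigl(v_1; v_2^{(1)}, \dots, v_2^{(n)}\bigr)
  = \sum_{i=1}^{n} I\bigl(v_1; v_2^{(i)} \,\big|\, v_2^{(1)}, \dots, v_2^{(i-1)}\bigr)
```

Each conditional term can then be lower-bounded separately, which is what allows the original MI bound to be written as a sum of bounds over subviews.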
DeepInfoMax (DIM) is a self-supervised method which leverages the internal structure of deep networks to construct such views, forming prediction tasks between local features which depend on small patches in an image and global features which depend on the whole image.
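A minimal sketch of the local-global scoring idea, assuming a Jensen-Shannon-style objective with dot-product scores (the shapes, encoder outputs, and negative-sampling scheme here are illustrative stand-ins, not DIM's exact implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: a 7x7 grid of local patch features and one global feature.
n_patches, dim = 49, 64
local_feats = rng.standard_normal((n_patches, dim))  # local features (per patch)
global_feat = rng.standard_normal(dim)               # global feature (whole image)
neg_global = rng.standard_normal(dim)                # global feature of another image

def softplus(x):
    return np.log1p(np.exp(x))

# Score matching (local, global) pairs high and mismatched pairs low:
pos_scores = local_feats @ global_feat  # same-image pairs
neg_scores = local_feats @ neg_global   # cross-image (negative) pairs
loss = softplus(-pos_scores).mean() + softplus(neg_scores).mean()
```

Minimizing this loss pushes local features to be predictive of their own image's global summary, which is the prediction task the abstract describes.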
We further improve performance by adding data augmentation to the future prediction loss, which forces the agent's representations to be consistent across multiple views of an observation.
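The consistency idea can be sketched as follows, with a stand-in shift-based augmentation, a stand-in linear encoder, and a cosine alignment loss (all three are assumptions for illustration; the paper's actual augmentations and encoder are not specified here):

```python
import numpy as np

rng = np.random.default_rng(1)

def augment(obs, rng):
    # Stand-in augmentation: random crop via padding and shifting.
    pad = np.pad(obs, 2, mode="edge")
    dy, dx = rng.integers(0, 5, size=2)
    return pad[dy:dy + obs.shape[0], dx:dx + obs.shape[1]]

def encode(obs):
    # Stand-in encoder: flatten and project with a fixed matrix.
    w = np.linspace(-1, 1, obs.size * 8).reshape(obs.size, 8)
    return obs.ravel() @ w

def cosine_consistency_loss(z1, z2):
    # Encourage the two augmented views' representations to align.
    z1 = z1 / np.linalg.norm(z1)
    z2 = z2 / np.linalg.norm(z2)
    return 1.0 - z1 @ z2

obs = rng.standard_normal((8, 8))
loss = cosine_consistency_loss(encode(augment(obs, rng)),
                               encode(augment(obs, rng)))
```

Driving this loss toward zero makes the representation invariant to the augmentation, i.e. consistent across multiple views of the same observation.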
Ranked #3 on Atari Games on Atari 100k
We begin with the hypothesis that a model-free agent whose representations are predictive of properties of future states (beyond expected rewards) will be more capable of solving and adapting to new RL problems.
Following our proposed approach, we develop a model which learns image representations that significantly outperform prior methods on the tasks we consider.
Ranked #18 on Image Classification on STL-10
While recent progress has spawned very powerful machine learning systems, these systems remain extremely specialized and fail to transfer the knowledge they gain to similar but unseen tasks.
Learning inter-domain mappings from unpaired data can improve performance in structured prediction tasks, such as image segmentation, by reducing the need for paired data.
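One common objective for learning such mappings without paired data is cycle consistency: mapping a sample to the other domain and back should reconstruct it. A toy sketch with linear "generators" (the linear maps and L1 penalty are assumptions for illustration, not the paper's exact formulation):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy stand-ins for the learned mappings G: X -> Y and F: Y -> X.
G = rng.standard_normal((4, 4))
F = np.linalg.inv(G)  # an exact inverse makes the cycle loss ~0

def cycle_consistency_loss(x, G, F):
    # ||F(G(x)) - x||_1: a round trip through both domains should
    # return each sample to itself.
    return np.abs(x @ G @ F - x).mean()

x = rng.standard_normal((10, 4))
loss = cycle_consistency_loss(x, G, F)  # near zero for inverse mappings
```

In practice the mappings are deep networks trained jointly, and the cycle term supplements an adversarial loss in each domain.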
In recent years, significant progress has been made in solving challenging problems across various domains using deep reinforcement learning (RL).
We develop an approach to training generative models based on unrolling a variational auto-encoder into a Markov chain, and shaping the chain's trajectories using a technique inspired by recent work in approximate Bayesian computation.
We propose a recurrent neural model that generates natural-language questions from documents, conditioned on answers.
In this paper, we propose to equip Generative Adversarial Networks with the ability to produce direct energy estimates for samples. Specifically, we propose a flexible adversarial training framework, and prove that this framework not only ensures the generator converges to the true data distribution, but also enables the discriminator to retain the density information at the global optimum.
Ranked #17 on Conditional Image Generation on CIFAR-10
We present an architecture which lets us train deep, directed generative models with many layers of latent variables.
We present NewsQA, a challenging machine comprehension dataset of over 100,000 human-generated question-answer pairs.
Natural language generation plays a critical role in spoken dialogue systems.
We propose a novel neural attention architecture to tackle machine comprehension tasks, such as answering Cloze-style queries with respect to a document.
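A minimal sketch of attention-based Cloze answering in the pointer-sum style (summing attention mass per candidate token, as in attention-sum readers); the shapes, random features, and pointer-sum readout are assumptions here, not the paper's exact architecture:

```python
import numpy as np

rng = np.random.default_rng(3)

doc_len, dim, vocab = 20, 16, 50
doc = rng.standard_normal((doc_len, dim))    # document token representations
query = rng.standard_normal(dim)             # representation of the Cloze query
doc_token_ids = rng.integers(0, vocab, size=doc_len)  # token id at each position

# Softmax attention over document positions, conditioned on the query.
scores = doc @ query
weights = np.exp(scores - scores.max())
weights /= weights.sum()

# Pointer-sum readout: accumulate attention mass per token id to get an
# answer distribution over the vocabulary.
answer_probs = np.zeros(vocab)
np.add.at(answer_probs, doc_token_ids, weights)
prediction = answer_probs.argmax()  # predicted answer token id
```

Tokens that appear at several well-attended positions accumulate probability, so the model favors answers supported by multiple parts of the document.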
Ranked #4 on Question Answering on Children's Book Test (Accuracy-NE metric)
We formalize the notion of a pseudo-ensemble, a (possibly infinite) collection of child models spawned from a parent model by perturbing it according to some noise process.
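A pseudo-ensemble can be sketched by perturbing a parent model's parameters with noise and averaging the children's predictions; the Gaussian weight noise and linear classifier below are illustrative stand-ins (the formalism covers arbitrary noise processes, including dropout):

```python
import numpy as np

rng = np.random.default_rng(4)

# Parent model: a fixed linear classifier (hypothetical stand-in).
parent_w = rng.standard_normal((5, 3))

def child_predict(x, rng, noise_scale=0.1):
    # Spawn a child model by perturbing the parent's parameters with
    # Gaussian noise, then predict with the perturbed weights.
    child_w = parent_w + noise_scale * rng.standard_normal(parent_w.shape)
    logits = x @ child_w
    e = np.exp(logits - logits.max())
    return e / e.sum()

x = rng.standard_normal(5)
# Pseudo-ensemble prediction: average over many noisy children of the parent.
probs = np.mean([child_predict(x, rng) for _ in range(100)], axis=0)
```

Training the parent so that children agree with each other (or with the clean parent) yields the regularization effect the abstract formalizes.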