CURL extracts high-level features from raw pixels using a contrastive learning objective and performs off-policy control on top of the extracted features.
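As a concrete illustration of the contrastive objective, here is a minimal InfoNCE sketch with the bilinear similarity CURL describes; the random tensors stand in for the outputs of its query/key encoders, and nothing here is the paper's released code:

```python
import torch
import torch.nn.functional as F

def infonce_loss(z_q, z_k, W):
    """InfoNCE with bilinear similarity: the matching rows of z_q and z_k
    (two augmented views of the same frames) are the positive pairs."""
    logits = z_q @ W @ z_k.t()                                # (B, B) similarities
    logits = logits - logits.max(dim=1, keepdim=True).values  # numerical stability
    labels = torch.arange(z_q.size(0))                        # positives on diagonal
    return F.cross_entropy(logits, labels)

# Toy usage: random "encodings" standing in for the conv encoder outputs.
B, D = 8, 50
loss = infonce_loss(torch.randn(B, D), torch.randn(B, D), torch.randn(D, D))
```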
In particular, we present Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling; a toy sketch of the token layout follows the ranking line below.
Ranked #43 on Atari Games on Atari 2600 Pong (using extra training data)
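The sketch referenced above: trajectories are flattened into (return-to-go, state, action) tokens and a causal transformer is trained to predict actions. Module sizes and names here are illustrative assumptions, not the released implementation:

```python
import torch
import torch.nn as nn

class TinyDecisionTransformer(nn.Module):
    """Toy version: embed (return-to-go, state, action) per timestep,
    interleave the tokens, and predict actions with a causal transformer."""

    def __init__(self, state_dim, act_dim, d_model=64, n_layer=2):
        super().__init__()
        self.embed_rtg = nn.Linear(1, d_model)
        self.embed_state = nn.Linear(state_dim, d_model)
        self.embed_act = nn.Linear(act_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, n_layer)
        self.head = nn.Linear(d_model, act_dim)

    def forward(self, rtg, states, actions):
        # rtg: (B, T, 1), states: (B, T, S), actions: (B, T, A)
        toks = torch.stack(
            [self.embed_rtg(rtg), self.embed_state(states), self.embed_act(actions)],
            dim=2,
        ).flatten(1, 2)                              # (B, 3T, d_model)
        n = toks.size(1)
        causal = torch.triu(torch.ones(n, n, dtype=torch.bool), diagonal=1)
        h = self.backbone(toks, mask=causal)         # no peeking at the future
        return self.head(h[:, 1::3])                 # predict action from state token

model = TinyDecisionTransformer(state_dim=4, act_dim=2)
out = model(torch.zeros(1, 10, 1), torch.zeros(1, 10, 4), torch.zeros(1, 10, 2))
```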
We present VideoGPT: a conceptually simple architecture for scaling likelihood-based generative modeling to natural videos.
Self-attention models have recently been shown to offer encouraging improvements in accuracy-parameter trade-offs compared to baseline convolutional models such as ResNet-50.
Ranked #200 on Image Classification on ImageNet
Using improved training and scaling strategies, we design a family of ResNet architectures, ResNet-RS, which are 1.7x - 2.7x faster than EfficientNets on TPUs, while achieving similar accuracies on ImageNet.
Ranked #1 on Document Image Classification on AIP
Recent advances in off-policy deep reinforcement learning (RL) have led to impressive success in complex tasks from visual observations.
Ranked #33 on Atari Games on Atari 2600 Amidar
Finally, we present a simple adaptation of the BoTNet design for image classification, resulting in models that achieve a strong performance of 84.7% top-1 accuracy on the ImageNet benchmark while being up to 1.64x faster in compute time than the popular EfficientNet models on TPU-v3 hardware; a minimal sketch of the attention block follows the ranking line below.
Ranked #50 on Instance Segmentation on COCO minival
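The sketch referenced above: the core BoTNet move is to replace the 3x3 spatial convolution in a ResNet bottleneck with multi-head self-attention over the feature map. This toy block omits the paper's relative position encodings and stride handling:

```python
import torch
import torch.nn as nn

class BottleneckSelfAttention(nn.Module):
    """BoTNet-style bottleneck: 1x1 conv -> self-attention over the spatial
    grid (in place of the usual 3x3 conv) -> 1x1 conv, with a residual."""

    def __init__(self, channels, inner=64, heads=4):
        super().__init__()
        self.reduce = nn.Conv2d(channels, inner, 1)
        self.attn = nn.MultiheadAttention(inner, heads, batch_first=True)
        self.expand = nn.Conv2d(inner, channels, 1)

    def forward(self, x):                       # x: (B, C, H, W)
        b, _, h, w = x.shape
        z = self.reduce(x)                       # (B, inner, H, W)
        seq = z.flatten(2).transpose(1, 2)       # (B, H*W, inner) token sequence
        out, _ = self.attn(seq, seq, seq)        # all-to-all spatial attention
        z = out.transpose(1, 2).reshape(b, -1, h, w)
        return x + self.expand(z)                # residual connection
```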
Temporal information is essential to learning effective policies with Reinforcement Learning (RL).
Furthermore, since our weighted Bellman backups rely on maintaining an ensemble, we investigate how weighted Bellman backups interact with other benefits previously derived from ensembles: (a) Bootstrap; (b) UCB Exploration.
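As one concrete reading of the UCB exploration ingredient, an ensemble of Q-functions can act by mean plus scaled disagreement; `lam` is an assumed exploration coefficient, and the linear Q-functions are toy stand-ins:

```python
import torch
import torch.nn as nn

def ucb_action(q_ensemble, state, lam=1.0):
    """Select the action maximizing mean(Q) + lam * std(Q) over the ensemble.

    q_ensemble: list of modules mapping a state batch to (B, num_actions)."""
    qs = torch.stack([q(state) for q in q_ensemble])   # (K, B, A)
    score = qs.mean(dim=0) + lam * qs.std(dim=0)       # optimism from disagreement
    return score.argmax(dim=-1)

# Toy usage with linear Q-functions over a 4-dim state and 3 actions.
ensemble = [nn.Linear(4, 3) for _ in range(5)]
actions = ucb_action(ensemble, torch.randn(2, 4), lam=1.0)
```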
Attention mechanisms are generic inductive biases that have played a critical role in improving the state-of-the-art in supervised learning, unsupervised pre-training and generative modeling for multiple domains including vision, language and speech.
In this paper, we present Latent Vector Experience Replay (LeVER), a simple modification of existing off-policy RL methods, to address these computational and memory requirements without sacrificing the performance of RL agents.
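A minimal sketch of the memory-saving idea, assuming (as LeVER does) that a frozen encoder lets the replay buffer store compact latents instead of raw frames; the class and field names are illustrative:

```python
import numpy as np

class LatentReplayBuffer:
    """Stores encoder outputs instead of raw frames: once the encoder is
    frozen, each observation is encoded on insertion, so the buffer holds
    small latent vectors rather than image stacks."""

    def __init__(self, capacity, latent_dim, encoder):
        self.latents = np.zeros((capacity, latent_dim), dtype=np.float32)
        self.ptr, self.full, self.capacity = 0, False, capacity
        self.encoder = encoder          # assumed frozen feature extractor

    def add(self, obs):
        self.latents[self.ptr] = self.encoder(obs)
        self.ptr = (self.ptr + 1) % self.capacity
        self.full = self.full or self.ptr == 0

    def sample(self, batch_size):
        n = self.capacity if self.full else self.ptr
        idx = np.random.randint(0, n, size=batch_size)
        return self.latents[idx]
```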
Our baseline model outperforms the LVIS 2020 Challenge winning entry by +3.6 mask AP on rare categories.
Ranked #1 on Object Detection on PASCAL VOC 2007
While improvements in deep learning architectures have played a crucial role in advancing the state of the art in supervised and unsupervised learning in computer vision and natural language processing, neural network architecture choices for reinforcement learning remain relatively under-explored.
A common practice in unsupervised representation learning is to use labeled data to evaluate the quality of the learned representations.
Off-policy deep reinforcement learning (RL) has been successful in a range of challenging domains.
To this end, we present Reinforcement Learning with Augmented Data (RAD), a simple plug-and-play module that can enhance most RL algorithms.
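A minimal sketch of one augmentation RAD plugs in, random crop, applied to a batch of observations before the RL update; the array shapes follow the common pixel-based DMControl setup (100x100 frames cropped to 84x84), which is an assumption here rather than a quotation of the code:

```python
import numpy as np

def random_crop(obs_batch, out_size):
    """Random crop augmentation: each image in the batch gets its own crop.

    obs_batch: (B, C, H, W) array of stacked observation frames."""
    b, c, h, w = obs_batch.shape
    tops = np.random.randint(0, h - out_size + 1, size=b)
    lefts = np.random.randint(0, w - out_size + 1, size=b)
    out = np.empty((b, c, out_size, out_size), dtype=obs_batch.dtype)
    for i, (t, l) in enumerate(zip(tops, lefts)):
        out[i] = obs_batch[i, :, t:t + out_size, l:l + out_size]
    return out

# Augment observations before they reach the actor/critic update.
obs = np.random.randint(0, 256, size=(32, 9, 100, 100), dtype=np.uint8)
cropped = random_crop(obs, out_size=84)
```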
On the DeepMind Control Suite, CURL is the first image-based algorithm to nearly match the sample-efficiency of methods that use state-based features.
Ranked #1 on Continuous Control on Finger, spin (DMControl500k)
In this paper, we propose a neural architecture for self-supervised representation learning on raw images called the PatchFormer, which learns to model spatial dependencies across patches in a raw image.
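A hedged sketch of patch-level self-attention of the kind the sentence describes: the image is split into patches, each patch is embedded, and attention models cross-patch spatial dependencies. All sizes and the module name are illustrative, not PatchFormer's actual architecture:

```python
import torch
import torch.nn as nn

class PatchAttention(nn.Module):
    """Minimal patch-level self-attention: patchify with a strided conv,
    then let attention relate every patch to every other patch."""

    def __init__(self, patch=8, in_ch=3, dim=64, heads=4):
        super().__init__()
        self.embed = nn.Conv2d(in_ch, dim, kernel_size=patch, stride=patch)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, img):                              # img: (B, 3, H, W)
        tok = self.embed(img).flatten(2).transpose(1, 2) # (B, N_patches, dim)
        out, _ = self.attn(tok, tok, tok)
        return out                                       # one feature per patch
```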
Human observers can learn to recognize new categories of images from a handful of examples, yet doing so with artificial systems remains an open challenge.
Ranked #6 on Contrastive Learning on imagenet-1k
Flow-based generative models are powerful exact-likelihood models with efficient sampling and inference; the change-of-variables identity behind the exact likelihood is given after the ranking line below.
Ranked #13 on Image Generation on ImageNet 32x32 (bpd metric)
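The identity referenced above: for an invertible flow f mapping data x to latent z = f(x) with a simple base density p_Z, exact likelihood follows from the change of variables, and the bits-per-dim (bpd) metric in the ranking line is the negative base-2 log-likelihood normalized by the data dimensionality D:

```latex
\log p_X(x) = \log p_Z\bigl(f(x)\bigr)
  + \log\left|\det \frac{\partial f(x)}{\partial x}\right|,
\qquad
\mathrm{bpd}(x) = -\frac{\log_2 p_X(x)}{D}
```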
A key challenge in complex visuomotor control is learning abstract representations that are effective for specifying goals, planning, and generalization.
We find that the representations learned are not only effective for goal-directed visual imitation via gradient-based trajectory optimization, but can also provide a metric for specifying goals using images.
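A minimal sketch of using such a representation as a goal metric: the cost is the latent-space distance between the current observation and a goal image, which a gradient-based trajectory optimizer can minimize. The encoder here is an assumed learned feature extractor, not the paper's model:

```python
import torch

def latent_goal_cost(encoder, obs, goal_img):
    """Distance between the encodings of the current observation and a goal
    image; differentiable, so it can drive gradient-based planning."""
    return torch.norm(encoder(obs) - encoder(goal_img), dim=-1)

# A planner can then minimize, e.g.,
# cost = latent_goal_cost(enc, dynamics_model(obs, planned_actions), goal).
```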
Reinforcement Learning algorithms can learn complex behavioral patterns for sequential decision making tasks wherein an agent interacts with an environment and acquires feedback in the form of rewards sampled from it.
This paper introduces an automated skill acquisition framework in reinforcement learning which involves identifying a hierarchical description of the given task in terms of abstract states and extended actions between abstract states.
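One generic way to represent the extended actions between abstract states is the options formalism; the container below is an assumed illustration of that structure, not the paper's implementation:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Option:
    """An extended action between abstract states: start in `source`, run
    `policy` at the low level, terminate on reaching `target`."""
    source: int                      # abstract state where the skill starts
    target: int                      # abstract state the skill should reach
    policy: Callable[[object], int]  # low-level action selector

    def terminates(self, abstract_state: int) -> bool:
        return abstract_state == self.target
```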
Deep Reinforcement Learning methods have achieved state-of-the-art performance in learning control policies for the games in the Atari 2600 domain.
Second, the agent should be able to selectively transfer, which is the ability to select and transfer from different and multiple source tasks for different parts of the state space of the target task.