1 code implementation • ICML 2020 • Michael Laskin, Pieter Abbeel, Aravind Srinivas
CURL extracts high-level features from raw pixels using a contrastive learning objective and performs off-policy control on top of the extracted features.
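The contrastive objective is an InfoNCE-style loss over two augmented crops of the same observation, scored with a bilinear similarity. A minimal PyTorch sketch under those assumptions (names are illustrative, not the paper's code):

```python
import torch
import torch.nn.functional as F

def curl_infonce_loss(z_anchor, z_positive, W):
    """Hypothetical sketch of a CURL-style contrastive loss.
    z_anchor, z_positive: (B, D) encodings of two crops of the same
    observations (the paper uses a momentum encoder for the positives).
    W: (D, D) learned bilinear matrix. Each anchor's positive is the
    matching row; all other rows in the batch act as negatives."""
    logits = z_anchor @ W @ z_positive.t()                     # (B, B) similarities
    logits = logits - logits.max(dim=1, keepdim=True).values   # numerical stability
    labels = torch.arange(z_anchor.size(0), device=z_anchor.device)
    return F.cross_entropy(logits, labels)
```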
16 code implementations • NeurIPS 2021 • Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas, Igor Mordatch
In particular, we present Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling.
Ranked #43 on Atari Games on Atari 2600 Pong (using extra training data)
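As a rough illustration of the sequence-modeling view in the entry above: trajectories are flattened into (return-to-go, state, action) tokens and a causal transformer predicts actions. This is a hypothetical sketch of the idea, not the authors' implementation:

```python
import torch
import torch.nn as nn

class TinyDecisionTransformer(nn.Module):
    """Minimal sketch: condition on returns-to-go and past tokens,
    predict the next action autoregressively."""
    def __init__(self, state_dim, act_dim, d_model=128, n_layers=2, n_heads=4):
        super().__init__()
        self.embed_rtg = nn.Linear(1, d_model)       # return-to-go token
        self.embed_state = nn.Linear(state_dim, d_model)
        self.embed_action = nn.Linear(act_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.predict_action = nn.Linear(d_model, act_dim)

    def forward(self, rtg, states, actions):
        # rtg: (B, T, 1), states: (B, T, state_dim), actions: (B, T, act_dim)
        B, T, _ = states.shape
        # Interleave tokens per timestep: (R_1, s_1, a_1, R_2, s_2, a_2, ...)
        tokens = torch.stack(
            [self.embed_rtg(rtg), self.embed_state(states), self.embed_action(actions)],
            dim=2,
        ).reshape(B, 3 * T, -1)
        causal = torch.triu(torch.full((3 * T, 3 * T), float("-inf")), diagonal=1)
        h = self.encoder(tokens, mask=causal)
        # Read action predictions off the state-token positions (index 1 of each triple)
        return self.predict_action(h[:, 1::3])
```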
1 code implementation • 20 Apr 2021 • Wilson Yan, Yunzhi Zhang, Pieter Abbeel, Aravind Srinivas
We present VideoGPT: a conceptually simple architecture for scaling likelihood-based generative modeling to natural videos.
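VideoGPT follows a two-stage recipe: a VQ-VAE compresses video into discrete latent codes, and an autoregressive transformer then models the code sequence. A minimal sketch of the vector-quantization step, with illustrative names (not the authors' code):

```python
import torch

def vector_quantize(z, codebook):
    """Hypothetical sketch of the VQ step in a VideoGPT-style pipeline.
    z: (N, D) continuous latents, codebook: (K, D) learned embeddings.
    Each latent is snapped to its nearest codebook entry; the discrete
    indices are what the transformer prior is trained on."""
    d = torch.cdist(z, codebook)     # (N, K) pairwise distances
    codes = d.argmin(dim=1)          # discrete indices for the prior
    z_q = codebook[codes]            # quantized latents for the decoder
    # Straight-through estimator: copy gradients from z_q back to z
    z_q = z + (z_q - z).detach()
    return z_q, codes
```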
7 code implementations • CVPR 2021 • Ashish Vaswani, Prajit Ramachandran, Aravind Srinivas, Niki Parmar, Blake Hechtman, Jonathon Shlens
Self-attention models have recently been shown to offer encouraging improvements in accuracy-parameter trade-offs compared to baseline convolutional models such as ResNet-50.
Ranked #200 on Image Classification on ImageNet
3 code implementations • NeurIPS 2021 • Irwan Bello, William Fedus, Xianzhi Du, Ekin D. Cubuk, Aravind Srinivas, Tsung-Yi Lin, Jonathon Shlens, Barret Zoph
Using improved training and scaling strategies, we design a family of ResNet architectures, ResNet-RS, which are 1.7x-2.7x faster than EfficientNets on TPUs, while achieving similar accuracies on ImageNet.
Ranked #1 on Document Image Classification on AIP
1 code implementation • NeurIPS 2021 • Lili Chen, Kimin Lee, Aravind Srinivas, Pieter Abbeel
Recent advances in off-policy deep reinforcement learning (RL) have led to impressive success in complex tasks from visual observations.
Ranked #33 on Atari Games on Atari 2600 Amidar
13 code implementations • CVPR 2021 • Aravind Srinivas, Tsung-Yi Lin, Niki Parmar, Jonathon Shlens, Pieter Abbeel, Ashish Vaswani
Finally, we present a simple adaptation of the BoTNet design for image classification, resulting in models that achieve a strong performance of 84.7% top-1 accuracy on the ImageNet benchmark while being up to 1.64x faster in compute time than the popular EfficientNet models on TPU-v3 hardware.
Ranked #50 on Instance Segmentation on COCO minival
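BoTNet's core change is replacing the 3x3 spatial convolution in the final ResNet bottleneck blocks with global multi-head self-attention. A rough sketch of such a block, with simplified dimensions and omitting the paper's relative position encodings:

```python
import torch.nn as nn

class BottleneckSelfAttentionBlock(nn.Module):
    """Hypothetical sketch of the BoTNet idea: a ResNet bottleneck whose
    3x3 spatial conv is swapped for self-attention over the feature map.
    Assumes channels // 4 is divisible by the number of heads."""
    def __init__(self, channels, heads=4):
        super().__init__()
        self.reduce = nn.Conv2d(channels, channels // 4, 1)
        self.attn = nn.MultiheadAttention(channels // 4, heads, batch_first=True)
        self.expand = nn.Conv2d(channels // 4, channels, 1)

    def forward(self, x):
        h = self.reduce(x)
        b, c, H, W = h.shape
        seq = h.flatten(2).transpose(1, 2)   # (B, H*W, C): pixels as tokens
        seq, _ = self.attn(seq, seq, seq)    # global self-attention
        h = seq.transpose(1, 2).reshape(b, c, H, W)
        return x + self.expand(h)            # residual connection
```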
2 code implementations • NeurIPS 2021 • Wenling Shang, Xiaofei Wang, Aravind Srinivas, Aravind Rajeswaran, Yang Gao, Pieter Abbeel, Michael Laskin
Temporal information is essential to learning effective policies with Reinforcement Learning (RL).
no code implementations • 1 Jan 2021 • Kimin Lee, Michael Laskin, Aravind Srinivas, Pieter Abbeel
Furthermore, since our weighted Bellman backups rely on maintaining an ensemble, we investigate how weighted Bellman backups interact with other benefits previously derived from ensembles: (a) Bootstrap; (b) UCB Exploration.
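As a rough illustration of a weighted Bellman backup, one can down-weight TD targets on which the ensemble of target Q-networks disagrees; the sigmoid weighting below follows the general form used in this line of work, but the names and constants are illustrative:

```python
import torch

def weighted_bellman_loss(q_pred, reward, q_targets_ensemble,
                          gamma=0.99, temperature=10.0):
    """Hypothetical sketch: targets with high ensemble std (disagreement)
    contribute less to the TD error.
    q_targets_ensemble: (E, B) target Q-values from E ensemble members."""
    target_mean = q_targets_ensemble.mean(dim=0)
    target_std = q_targets_ensemble.std(dim=0)
    backup = reward + gamma * target_mean
    # Confidence weight in (0.5, 1.0]: shrinks toward 0.5 as disagreement grows
    w = torch.sigmoid(-target_std * temperature) + 0.5
    return (w.detach() * (q_pred - backup.detach()) ** 2).mean()
```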
no code implementations • 1 Jan 2021 • Yunzhi Zhang, Wilson Yan, Pieter Abbeel, Aravind Srinivas
We present VideoGen: a conceptually simple architecture for scaling likelihood-based generative modeling to natural videos.
no code implementations • 1 Jan 2021 • Mandi Zhao, Qiyang Li, Aravind Srinivas, Ignasi Clavera, Kimin Lee, Pieter Abbeel
Attention mechanisms are generic inductive biases that have played a critical role in improving the state of the art in supervised learning, unsupervised pre-training, and generative modeling across multiple domains, including vision, language, and speech.
no code implementations • 1 Jan 2021 • Lili Chen, Kimin Lee, Aravind Srinivas, Pieter Abbeel
In this paper, we present Latent Vector Experience Replay (LeVER), a simple modification of existing off-policy RL methods that addresses the computational and memory requirements of image-based RL without sacrificing the performance of RL agents.
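The idea is to freeze the encoder after an initial learning phase and store low-dimensional latent vectors in the replay buffer instead of raw frames. A minimal sketch under that assumption (a frozen `encoder` returning a NumPy vector; names are hypothetical):

```python
import numpy as np

class LatentReplayBuffer:
    """Hypothetical sketch of a LeVER-style buffer: encode each observation
    once and store only its latent, cutting memory and repeated encoding."""
    def __init__(self, capacity, latent_dim):
        self.z = np.zeros((capacity, latent_dim), dtype=np.float32)
        self.z_next = np.zeros((capacity, latent_dim), dtype=np.float32)
        self.actions = np.zeros(capacity, dtype=np.int64)
        self.rewards = np.zeros(capacity, dtype=np.float32)
        self.idx, self.full, self.capacity = 0, False, capacity

    def add(self, encoder, obs, action, reward, next_obs):
        self.z[self.idx] = encoder(obs)          # encode once, store latent
        self.z_next[self.idx] = encoder(next_obs)
        self.actions[self.idx] = action
        self.rewards[self.idx] = reward
        self.idx = (self.idx + 1) % self.capacity
        self.full = self.full or self.idx == 0

    def sample(self, batch_size):
        n = self.capacity if self.full else self.idx
        i = np.random.randint(0, n, size=batch_size)
        return self.z[i], self.actions[i], self.rewards[i], self.z_next[i]
```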
5 code implementations • CVPR 2021 • Golnaz Ghiasi, Yin Cui, Aravind Srinivas, Rui Qian, Tsung-Yi Lin, Ekin D. Cubuk, Quoc V. Le, Barret Zoph
Our baseline model outperforms the LVIS 2020 Challenge winning entry by +3.6 mask AP on rare categories.
Ranked #1 on Object Detection on PASCAL VOC 2007
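The augmentation in the entry above is deliberately simple: paste object instances from one training image onto another using their instance masks. A minimal NumPy sketch, assuming boolean masks (in practice the corresponding labels and boxes are composited alongside):

```python
import numpy as np

def copy_paste(src_img, src_masks, dst_img):
    """Hypothetical sketch of Copy-Paste augmentation, not the authors'
    pipeline. src_img, dst_img: (H, W, 3) arrays of the same size;
    src_masks: list of (H, W) boolean instance masks."""
    out = dst_img.copy()
    for mask in src_masks:
        out[mask] = src_img[mask]  # paste the instance's pixels
    return out
```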
4 code implementations • 19 Oct 2020 • Samarth Sinha, Homanga Bharadhwaj, Aravind Srinivas, Animesh Garg
While improvements in deep learning architectures have played a crucial role in improving the state of supervised and unsupervised learning in computer vision and natural language processing, neural network architecture choices for reinforcement learning remain relatively under-explored.
1 code implementation • CVPR 2021 • Colorado J Reed, Sean Metzger, Aravind Srinivas, Trevor Darrell, Kurt Keutzer
A common practice in unsupervised representation learning is to use labeled data to evaluate the quality of the learned representations.
1 code implementation • 9 Jul 2020 • Kimin Lee, Michael Laskin, Aravind Srinivas, Pieter Abbeel
Off-policy deep reinforcement learning (RL) has been successful in a range of challenging domains.
2 code implementations • NeurIPS 2020 • Michael Laskin, Kimin Lee, Adam Stooke, Lerrel Pinto, Pieter Abbeel, Aravind Srinivas
To this end, we present Reinforcement Learning with Augmented Data (RAD), a simple plug-and-play module that can enhance most RL algorithms.
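RAD's premise is that augmenting observations before the RL update is enough to improve data-efficiency and generalization. A minimal sketch of one augmentation studied in the paper, random crop, with illustrative names:

```python
import numpy as np

def random_crop(obs_batch, out_size=84):
    """Hypothetical sketch of a RAD-style augmentation: independently
    random-crop each observation in a batch before the RL update.
    obs_batch: (B, C, H, W) array with H, W >= out_size."""
    b, c, h, w = obs_batch.shape
    cropped = np.empty((b, c, out_size, out_size), dtype=obs_batch.dtype)
    for i in range(b):
        top = np.random.randint(0, h - out_size + 1)
        left = np.random.randint(0, w - out_size + 1)
        cropped[i] = obs_batch[i, :, top:top + out_size, left:left + out_size]
    return cropped
```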
7 code implementations • 8 Apr 2020 • Aravind Srinivas, Michael Laskin, Pieter Abbeel
On the DeepMind Control Suite, CURL is the first image-based algorithm to nearly match the sample-efficiency of methods that use state-based features.
Ranked #1 on Continuous Control on Finger, spin (DMControl500k)
no code implementations • 25 Sep 2019 • Aravind Srinivas, Pieter Abbeel
In this paper, we propose PatchFormer, a neural architecture for self-supervised representation learning that learns to model spatial dependencies across patches of a raw image.
4 code implementations • ICML 2020 • Olivier J. Hénaff, Aravind Srinivas, Jeffrey De Fauw, Ali Razavi, Carl Doersch, S. M. Ali Eslami, Aaron van den Oord
Human observers can learn to recognize new categories of images from a handful of examples, yet doing so with artificial ones remains an open challenge.
Ranked #6 on Contrastive Learning on imagenet-1k
4 code implementations • ICLR 2019 • Jonathan Ho, Xi Chen, Aravind Srinivas, Yan Duan, Pieter Abbeel
Flow-based generative models are powerful exact likelihood models with efficient sampling and inference.
Ranked #13 on Image Generation on ImageNet 32x32 (bpd metric)
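Exact likelihood in a flow model comes from the change-of-variables formula: log p(x) = log p(z) + sum of log|det J| over the invertible transforms. A schematic sketch, assuming each flow returns its output and log-determinant:

```python
import torch

def flow_log_likelihood(x, flows, base_dist):
    """Hypothetical sketch of exact likelihood evaluation in a flow model.
    flows: list of invertible transforms, each returning (output, log|det J|).
    base_dist: e.g. torch.distributions.Normal(0., 1.)."""
    z = x
    log_det = torch.zeros(x.shape[0], device=x.device)
    for f in flows:
        z, ld = f(z)               # push forward, accumulate log-determinant
        log_det = log_det + ld
    return base_dist.log_prob(z).sum(dim=-1) + log_det
```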
1 code implementation • ICML 2018 • Aravind Srinivas, Allan Jabri, Pieter Abbeel, Sergey Levine, Chelsea Finn
A key challenge in complex visuomotor control is learning abstract representations that are effective for specifying goals, planning, and generalization.
1 code implementation • 2 Apr 2018 • Aravind Srinivas, Allan Jabri, Pieter Abbeel, Sergey Levine, Chelsea Finn
We find that the representations learned are not only effective for goal-directed visual imitation via gradient-based trajectory optimization, but can also provide a metric for specifying goals using images.
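Goal-directed visual imitation via gradient-based trajectory optimization can be sketched as: encode the current and goal images, roll a learned latent dynamics model forward, and optimize the action sequence by gradient descent on the distance to the goal latent. The names (`encode`, `dynamics`) are placeholders, not the paper's API:

```python
import torch

def plan_actions(dynamics, encode, obs, goal_img, action_dim,
                 horizon=10, steps=100, lr=0.1):
    """Hypothetical sketch of gradient-based planning in a learned latent
    space; assumes `encode` and `dynamics` are differentiable modules."""
    z, z_goal = encode(obs), encode(goal_img)
    actions = torch.zeros(horizon, action_dim, requires_grad=True)
    opt = torch.optim.Adam([actions], lr=lr)
    for _ in range(steps):
        z_t = z
        for a in actions:
            z_t = dynamics(z_t, a)          # roll latent dynamics forward
        loss = ((z_t - z_goal) ** 2).sum()  # match the encoded goal
        opt.zero_grad()
        loss.backward()
        opt.step()
    return actions.detach()
```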
no code implementations • 20 Feb 2017 • Sahil Sharma, Aravind Srinivas, Balaraman Ravindran
Reinforcement Learning algorithms can learn complex behavioral patterns for sequential decision-making tasks wherein an agent interacts with an environment and acquires feedback in the form of rewards sampled from it.
no code implementations • 17 May 2016 • Aravind Srinivas, Ramnandan Krishnamurthy, Peeyush Kumar, Balaraman Ravindran
This paper introduces an automated skill acquisition framework in reinforcement learning which involves identifying a hierarchical description of the given task in terms of abstract states and extended actions between abstract states.
no code implementations • 17 May 2016 • Aravind Srinivas, Sahil Sharma, Balaraman Ravindran
Deep Reinforcement Learning methods have achieved state-of-the-art performance in learning control policies for the games in the Atari 2600 domain.
2 code implementations • 10 Oct 2015 • Janarthanan Rajendran, Aravind Srinivas, Mitesh M. Khapra, P. Prasanna, Balaraman Ravindran
Second, the agent should be able to transfer selectively, i.e., to select and transfer from multiple different source tasks for different parts of the state space of the target task.
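Selective transfer can be sketched as a learned, state-dependent attention over the source policies plus a base network trained from scratch, so different regions of the state space can rely on different sources. An illustrative sketch (names are hypothetical):

```python
import torch
import torch.nn.functional as F

def selective_transfer_policy(state_feat, source_logits, base_logits, attn_net):
    """Hypothetical sketch of selective transfer: mix K frozen source
    policies and one base policy with per-state attention weights.
    source_logits: (K, A) action logits from K source policies.
    base_logits:   (A,) logits from the target task's own network.
    attn_net:      maps state features to (K+1,) attention logits."""
    all_logits = torch.cat([source_logits, base_logits.unsqueeze(0)], dim=0)
    attn = F.softmax(attn_net(state_feat), dim=-1)   # (K+1,) per-state weights
    return attn @ all_logits                         # mixed action logits
```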