Search Results for author: Agrim Gupta

Found 9 papers, 5 papers with code

VIMA: General Robot Manipulation with Multimodal Prompts

no code implementations 6 Oct 2022 Yunfan Jiang, Agrim Gupta, Zichen Zhang, Guanzhi Wang, Yongqiang Dou, Yanjun Chen, Li Fei-Fei, Anima Anandkumar, Yuke Zhu, Linxi Fan

This work shows that we can express a wide spectrum of robot manipulation tasks with multimodal prompts, interleaving textual and visual tokens.

Imitation Learning, Language Modelling, +1

MaskViT: Masked Visual Pre-Training for Video Prediction

no code implementations 23 Jun 2022 Agrim Gupta, Stephen Tian, Yunzhi Zhang, Jiajun Wu, Roberto Martín-Martín, Li Fei-Fei

This work shows that we can create good video prediction models by pre-training transformers via masked visual modeling.

Video Prediction

MetaMorph: Learning Universal Controllers with Transformers

1 code implementation ICLR 2022 Agrim Gupta, Linxi Fan, Surya Ganguli, Li Fei-Fei

Multiple domains like vision, natural language, and audio are witnessing tremendous progress by leveraging Transformers for large scale pre-training followed by task specific fine tuning.

Embodied Intelligence via Learning and Evolution

1 code implementation 3 Feb 2021 Agrim Gupta, Silvio Savarese, Surya Ganguli, Li Fei-Fei

However, the principles governing relations between environmental complexity, evolved morphology, and the learnability of intelligent control, remain elusive, partially due to the substantial challenge of performing large-scale in silico experiments on evolution and learning.

LVIS: A Dataset for Large Vocabulary Instance Segmentation

3 code implementations CVPR 2019 Agrim Gupta, Piotr Dollár, Ross Girshick

We plan to collect ~2 million high-quality instance segmentation masks for over 1000 entry-level object categories in 164k images.

Instance Segmentation, object-detection, +2

Image Generation from Scene Graphs

4 code implementations CVPR 2018 Justin Johnson, Agrim Gupta, Li Fei-Fei

To overcome this limitation we propose a method for generating images from scene graphs, enabling explicitly reasoning about objects and their relationships.

Image Generation from Scene Graphs, Layout-to-Image Generation

Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks

7 code implementations CVPR 2018 Agrim Gupta, Justin Johnson, Li Fei-Fei, Silvio Savarese, Alexandre Alahi

Understanding human motion behavior is critical for autonomous moving platforms (like self-driving cars and social robots) if they are to navigate human-centric environments.

Motion Forecasting, Multi-future Trajectory Prediction, +3
