2 code implementations • 29 Oct 2019 • Yilin Wu, Wilson Yan, Thanard Kurutach, Lerrel Pinto, Pieter Abbeel
Second, instead of jointly learning both the pick and the place locations, we only explicitly learn the placing policy conditioned on random pick points.
no code implementations • 25 Nov 2019 • Wilson Yan, Jonathan Ho, Pieter Abbeel
Deep autoregressive models are among the most powerful generative models available today, achieving state-of-the-art bits per dim.
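Bits per dim is the standard likelihood metric referenced here: the model's negative log-likelihood, converted from nats to bits and averaged over the data dimensions (e.g. pixels × channels). A minimal sketch of that conversion, with the example image size being an illustrative assumption rather than anything from this paper:

```python
import math

def bits_per_dim(nll_nats: float, num_dims: int) -> float:
    """Convert a total negative log-likelihood in nats to bits per dimension.

    nll_nats: total NLL of one example, in nats.
    num_dims: number of data dimensions (e.g. H * W * C for an image).
    """
    return nll_nats / (num_dims * math.log(2))

# Illustrative only: a 32x32 RGB image has 32 * 32 * 3 = 3072 dimensions.
print(bits_per_dim(6600.0, 32 * 32 * 3))
```

Lower is better: a model assigning NLL of exactly `ln(2)` nats per dimension scores 1.0 bits per dim.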
1 code implementation • 11 Mar 2020 • Wilson Yan, Ashwin Vangipuram, Pieter Abbeel, Lerrel Pinto
Using visual model-based learning for deformable object manipulation is challenging due to difficulties in learning plannable visual representations along with complex dynamic models.
no code implementations • 1 Jan 2021 • Yunzhi Zhang, Wilson Yan, Pieter Abbeel, Aravind Srinivas
We present VideoGen: a conceptually simple architecture for scaling likelihood-based generative modeling to natural videos.
3 code implementations • 20 Apr 2021 • Wilson Yan, Yunzhi Zhang, Pieter Abbeel, Aravind Srinivas
We present VideoGPT: a conceptually simple architecture for scaling likelihood-based generative modeling to natural videos.
1 code implementation • 8 Jun 2022 • Wilson Yan, Ryo Okumura, Stephen James, Pieter Abbeel
In this work, we present Patch-based Object-centric Video Transformer (POVT), a novel region-based video generation architecture that leverages object-centric information to efficiently model temporal dynamics in videos.
1 code implementation • 5 Oct 2022 • Wilson Yan, Danijar Hafner, Stephen James, Pieter Abbeel
To generate accurate videos, algorithms have to understand the spatial and temporal dependencies in the world.
1 code implementation • NeurIPS 2023 • Hao Liu, Wilson Yan, Pieter Abbeel
Recent progress in scaling up large language models has shown impressive capabilities in performing few-shot learning across a wide range of text-based tasks.
no code implementations • 16 Jun 2023 • Xinran Liang, Anthony Han, Wilson Yan, Aditi Raghunathan, Pieter Abbeel
In addition, we show that by training on actively collected data more relevant to the environment and task, our method generalizes more robustly to downstream tasks compared to models pre-trained on fixed datasets such as ImageNet.
no code implementations • 30 Nov 2023 • Wilson Yan, Andrew Brown, Pieter Abbeel, Rohit Girdhar, Samaneh Azadi
We introduce MoCA, a Motion-Conditioned Image Animation approach for video editing.
1 code implementation • 13 Feb 2024 • Hao Liu, Wilson Yan, Matei Zaharia, Pieter Abbeel
To address these challenges, we curate a large dataset of diverse videos and books, utilize the Blockwise RingAttention technique to scalably train on long sequences, and gradually increase context size from 4K to 1M tokens.
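The snippet describes growing the training context from 4K to 1M tokens in stages. The exact schedule is not given here; a hypothetical doubling schedule, purely as a sketch of what "gradually increase context size" could look like:

```python
def context_schedule(start: int = 4096, end: int = 1 << 20) -> list[int]:
    """Hypothetical context-length curriculum: double the context each stage.

    Walks from `start` (4K tokens) up to `end` (1M tokens); the doubling
    step is an assumption, not the schedule used in the paper.
    """
    sizes = []
    size = start
    while size <= end:
        sizes.append(size)
        size *= 2
    return sizes

print(context_schedule())  # 4096, 8192, ..., 1048576
```

Doubling is a common choice because each stage roughly doubles attention cost, keeping the compute jump per stage bounded; the paper's actual stage boundaries may differ.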