Search Results for author: Austin Stone

Found 14 papers, 8 papers with code

Teaching Compositionality to CNNs

no code implementations • CVPR 2017 • Austin Stone, Huayan Wang, Michael Stark, Yi Liu, D. Scott Phoenix, Dileep George

Convolutional neural networks (CNNs) have shown great success in computer vision, approaching human-level performance when trained for specific tasks via application-specific loss functions.

Object Recognition

Paper
Add Code

Towards Object Detection from Motion

no code implementations • 17 Sep 2019 • Rico Jonschkowski, Austin Stone

We present a novel approach to weakly supervised object detection.

Object object-detection +1

Paper
Add Code

What Matters in Unsupervised Optical Flow

5 code implementations • ECCV 2020 • Rico Jonschkowski, Austin Stone, Jonathan T. Barron, Ariel Gordon, Kurt Konolige, Anelia Angelova

We systematically compare and analyze a set of key components in unsupervised optical flow to identify which photometric loss, occlusion handling, and smoothness regularization is most effective.

Ranked #5 on Optical Flow Estimation on Sintel Clean unsupervised

Occlusion Handling Optical Flow Estimation

32,743

Paper
Code

The Distracting Control Suite -- A Challenging Benchmark for Reinforcement Learning from Pixels

3 code implementations • 7 Jan 2021 • Austin Stone, Oscar Ramirez, Kurt Konolige, Rico Jonschkowski

Our experiments show that current RL methods for vision-based control perform poorly under distractions, and that their performance decreases with increasing distraction complexity, showing that new methods are needed to cope with the visual complexities of the real world.

reinforcement-learning Reinforcement Learning (RL)

32,736

Paper
Code

SMURF: Self-Teaching Multi-Frame Unsupervised RAFT with Full-Image Warping

2 code implementations • CVPR 2021 • Austin Stone, Daniel Maurer, Alper Ayvaci, Anelia Angelova, Rico Jonschkowski

We present SMURF, a method for unsupervised learning of optical flow that improves state of the art on all benchmarks by $36\%$ to $40\%$ (over the prior best method UFlow) and even outperforms several supervised approaches such as PWC-Net and FlowNet2.

Optical Flow Estimation

32,736

Paper
Code

Conditional Object-Centric Learning from Video

3 code implementations • ICLR 2022 • Thomas Kipf, Gamaleldin F. Elsayed, Aravindh Mahendran, Austin Stone, Sara Sabour, Georg Heigold, Rico Jonschkowski, Alexey Dosovitskiy, Klaus Greff

Object-centric representations are a promising path toward more systematic generalization by providing flexible abstractions upon which compositional world models can be built.

Instance Segmentation Object +3

138

Paper
Code

Kubric: A scalable dataset generator

1 code implementation • CVPR 2022 • Klaus Greff, Francois Belletti, Lucas Beyer, Carl Doersch, Yilun Du, Daniel Duckworth, David J. Fleet, Dan Gnanapragasam, Florian Golemo, Charles Herrmann, Thomas Kipf, Abhijit Kundu, Dmitry Lagun, Issam Laradji, Hsueh-Ti, Liu, Henning Meyer, Yishu Miao, Derek Nowrouzezahrai, Cengiz Oztireli, Etienne Pot, Noha Radwan, Daniel Rebain, Sara Sabour, Mehdi S. M. Sajjadi, Matan Sela, Vincent Sitzmann, Austin Stone, Deqing Sun, Suhani Vora, Ziyu Wang, Tianhao Wu, Kwang Moo Yi, Fangcheng Zhong, Andrea Tagliasacchi

Data is the driving force of machine learning, with the amount and quality of training data often being more important for the performance of a system than architecture and training details.

Fairness Optical Flow Estimation

2,169

Paper
Code

Simple Open-Vocabulary Object Detection with Vision Transformers

2 code implementations • 12 May 2022 • Matthias Minderer, Alexey Gritsenko, Austin Stone, Maxim Neumann, Dirk Weissenborn, Alexey Dosovitskiy, Aravindh Mahendran, Anurag Arnab, Mostafa Dehghani, Zhuoran Shen, Xiao Wang, Xiaohua Zhai, Thomas Kipf, Neil Houlsby

Combining simple architectures with large-scale pre-training has led to massive improvements in image classification.

Ranked #1 on One-Shot Object Detection on MS COCO

Described Object Detection Image Classification +3

2,982

Paper
Code

Open-vocabulary Queryable Scene Representations for Real World Planning

no code implementations • 20 Sep 2022 • Boyuan Chen, Fei Xia, Brian Ichter, Kanishka Rao, Keerthana Gopalakrishnan, Michael S. Ryoo, Austin Stone, Daniel Kappler

Large language models (LLMs) have unlocked new capabilities of task planning from human instructions.

Paper
Add Code

Token Turing Machines

1 code implementation • CVPR 2023 • Michael S. Ryoo, Keerthana Gopalakrishnan, Kumara Kahatapitiya, Ted Xiao, Kanishka Rao, Austin Stone, Yao Lu, Julian Ibarz, Anurag Arnab

The model's memory module ensures that a new observation will only be processed with the contents of the memory (and not the entire history), meaning that it can efficiently process long sequences with a bounded computational cost at each step.

Ranked #1 on Action Detection on Charades

Action Detection Activity Detection

2,983

Paper
Code

Self-supervised AutoFlow

no code implementations • CVPR 2023 • Hsin-Ping Huang, Charles Herrmann, Junhwa Hur, Erika Lu, Kyle Sargent, Austin Stone, Ming-Hsuan Yang, Deqing Sun

Recently, AutoFlow has shown promising results on learning a training set for optical flow, but requires ground truth labels in the target domain to compute its search metric.

Optical Flow Estimation

Paper
Add Code

RT-1: Robotics Transformer for Real-World Control at Scale

1 code implementation • 13 Dec 2022 • Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Joseph Dabis, Chelsea Finn, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Tomas Jackson, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Isabel Leal, Kuang-Huei Lee, Sergey Levine, Yao Lu, Utsav Malla, Deeksha Manjunath, Igor Mordatch, Ofir Nachum, Carolina Parada, Jodilyn Peralta, Emily Perez, Karl Pertsch, Jornell Quiambao, Kanishka Rao, Michael Ryoo, Grecia Salazar, Pannag Sanketi, Kevin Sayed, Jaspiar Singh, Sumedh Sontakke, Austin Stone, Clayton Tan, Huong Tran, Vincent Vanhoucke, Steve Vega, Quan Vuong, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Tianhe Yu, Brianna Zitkovich

By transferring knowledge from large, diverse, task-agnostic datasets, modern machine learning models can solve specific downstream tasks either zero-shot or with small task-specific datasets to a high level of performance.

1,177

Paper
Code

Scaling Robot Learning with Semantically Imagined Experience

no code implementations • 22 Feb 2023 • Tianhe Yu, Ted Xiao, Austin Stone, Jonathan Tompson, Anthony Brohan, Su Wang, Jaspiar Singh, Clayton Tan, Dee M, Jodilyn Peralta, Brian Ichter, Karol Hausman, Fei Xia

Specifically, we make use of the state of the art text-to-image diffusion models and perform aggressive data augmentation on top of our existing robotic manipulation datasets via inpainting various unseen objects for manipulation, backgrounds, and distractors with text guidance.

Data Augmentation

Paper
Add Code

Open-World Object Manipulation using Pre-trained Vision-Language Models

no code implementations • 2 Mar 2023 • Austin Stone, Ted Xiao, Yao Lu, Keerthana Gopalakrishnan, Kuang-Huei Lee, Quan Vuong, Paul Wohlhart, Sean Kirmani, Brianna Zitkovich, Fei Xia, Chelsea Finn, Karol Hausman

This brings up a notably difficult challenge for robots: while robot learning approaches allow robots to learn many different behaviors from first-hand experience, it is impractical for robots to have first-hand experiences that span all of this semantic information.

Language Modelling Object

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.