Search Results for author: Jonathan Tompson

Found 39 papers, 16 papers with code

FlexCap: Generating Rich, Localized, and Flexible Captions in Images

no code implementations 18 Mar 2024 Debidatta Dwibedi, Vidhi Jain, Jonathan Tompson, Andrew Zisserman, Yusuf Aytar

The model, FlexCap, is trained to produce length-conditioned captions for input bounding boxes, and this allows control over the information density of its output, with descriptions ranging from concise object labels to detailed captions.

Attribute Dense Captioning +8

RT-H: Action Hierarchies Using Language

no code implementations 4 Mar 2024 Suneel Belkhale, Tianli Ding, Ted Xiao, Pierre Sermanet, Quon Vuong, Jonathan Tompson, Yevgen Chebotar, Debidatta Dwibedi, Dorsa Sadigh

Predicting these language motions as an intermediate step between tasks and actions forces the policy to learn the shared structure of low-level motions across seemingly disparate tasks.

Imitation Learning

Video Language Planning

no code implementations 16 Oct 2023 Yilun Du, Mengjiao Yang, Pete Florence, Fei Xia, Ayzaan Wahid, Brian Ichter, Pierre Sermanet, Tianhe Yu, Pieter Abbeel, Joshua B. Tenenbaum, Leslie Kaelbling, Andy Zeng, Jonathan Tompson

We are interested in enabling visual planning for complex long-horizon tasks in the space of generated videos and language, leveraging recent advances in large generative models pretrained on Internet-scale data.

Learning Interactive Real-World Simulators

no code implementations 9 Oct 2023 Mengjiao Yang, Yilun Du, Kamyar Ghasemipour, Jonathan Tompson, Leslie Kaelbling, Dale Schuurmans, Pieter Abbeel

Applications of a real-world simulator range from controllable content creation in games and movies, to training embodied agents purely in simulation that can be directly deployed in the real world.

Video Captioning

Scaling Robot Learning with Semantically Imagined Experience

no code implementations 22 Feb 2023 Tianhe Yu, Ted Xiao, Austin Stone, Jonathan Tompson, Anthony Brohan, Su Wang, Jaspiar Singh, Clayton Tan, Dee M, Jodilyn Peralta, Brian Ichter, Karol Hausman, Fei Xia

Specifically, we make use of state-of-the-art text-to-image diffusion models and perform aggressive data augmentation on top of our existing robotic manipulation datasets via inpainting various unseen objects for manipulation, backgrounds, and distractors with text guidance.

Data Augmentation

Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models

no code implementations 21 Nov 2022 Ted Xiao, Harris Chan, Pierre Sermanet, Ayzaan Wahid, Anthony Brohan, Karol Hausman, Sergey Levine, Jonathan Tompson

To accomplish this, we introduce Data-driven Instruction Augmentation for Language-conditioned control (DIAL): we utilize semi-supervised language labels leveraging the semantic understanding of CLIP to propagate knowledge onto large datasets of unlabelled demonstration data and then train language-conditioned policies on the augmented datasets.

Imitation Learning

Contrastive Value Learning: Implicit Models for Simple Offline RL

no code implementations 3 Nov 2022 Bogdan Mazoure, Benjamin Eysenbach, Ofir Nachum, Jonathan Tompson

In this paper, we propose Contrastive Value Learning (CVL), which learns an implicit, multi-step model of the environment dynamics.

Continuous Control Model-based Reinforcement Learning +2

Interactive Language: Talking to Robots in Real Time

1 code implementation 12 Oct 2022 Corey Lynch, Ayzaan Wahid, Jonathan Tompson, Tianli Ding, James Betker, Robert Baruch, Travis Armstrong, Pete Florence

We present a framework for building interactive, real-time, natural language-instructable robots in the real world, and we open source related assets (dataset, environment, benchmark, and policies).

Visuomotor Control in Multi-Object Scenes Using Object-Aware Representations

no code implementations 12 May 2022 Negin Heravi, Ayzaan Wahid, Corey Lynch, Pete Florence, Travis Armstrong, Jonathan Tompson, Pierre Sermanet, Jeannette Bohg, Debidatta Dwibedi

Our self-supervised representations are learned by observing the agent freely interacting with different parts of the environment and are queried in two different settings: (i) policy learning and (ii) object location prediction.

Object Object Localization +2

Improving Zero-shot Generalization in Offline Reinforcement Learning using Generalized Similarity Functions

no code implementations 29 Nov 2021 Bogdan Mazoure, Ilya Kostrikov, Ofir Nachum, Jonathan Tompson

We show that performance of online algorithms for generalization in RL can be hindered in the offline setting due to poor estimation of similarity between observations.

Contrastive Learning Decision Making +5

Implicit Behavioral Cloning

4 code implementations 1 Sep 2021 Pete Florence, Corey Lynch, Andy Zeng, Oscar Ramirez, Ayzaan Wahid, Laura Downs, Adrian Wong, Johnny Lee, Igor Mordatch, Jonathan Tompson

We find that across a wide range of robot policy learning scenarios, treating supervised policy learning with an implicit model generally performs better, on average, than commonly used explicit models.

D4RL

XIRL: Cross-embodiment Inverse Reinforcement Learning

1 code implementation 7 Jun 2021 Kevin Zakka, Andy Zeng, Pete Florence, Jonathan Tompson, Jeannette Bohg, Debidatta Dwibedi

We investigate the visual cross-embodiment imitation setting, in which agents learn policies from videos of other agents (such as humans) demonstrating the same task, but with stark differences in their embodiments -- shape, actions, end-effector dynamics, etc.

reinforcement-learning Reinforcement Learning (RL)

Offline Reinforcement Learning with Fisher Divergence Critic Regularization

2 code implementations 14 Mar 2021 Ilya Kostrikov, Jonathan Tompson, Rob Fergus, Ofir Nachum

Many modern approaches to offline Reinforcement Learning (RL) utilize behavior regularization, typically augmenting a model-free actor critic algorithm with a penalty measuring divergence of the policy from the offline data.

Offline RL reinforcement-learning +1

Learning to Rearrange Deformable Cables, Fabrics, and Bags with Goal-Conditioned Transporter Networks

no code implementations 6 Dec 2020 Daniel Seita, Pete Florence, Jonathan Tompson, Erwin Coumans, Vikas Sindhwani, Ken Goldberg, Andy Zeng

Goals cannot be as easily specified as rigid object poses, and may involve complex relative spatial relations such as "place the item inside the bag".

ADAIL: Adaptive Adversarial Imitation Learning

no code implementations 23 Aug 2020 Yiren Lu, Jonathan Tompson

We present the ADaptive Adversarial Imitation Learning (ADAIL) algorithm for learning adaptive policies that can be transferred between environments of varying dynamics, by imitating a small number of demonstrations collected from a single source domain.

Imitation Learning

An Analysis of Object Representations in Deep Visual Trackers

no code implementations 8 Jan 2020 Ross Goroshin, Jonathan Tompson, Debidatta Dwibedi

Despite these strong priors, we show that deep trackers often default to tracking by saliency detection - without relying on the object instance representation.

Object Saliency Detection +1

Imitation Learning via Off-Policy Distribution Matching

3 code implementations ICLR 2020 Ilya Kostrikov, Ofir Nachum, Jonathan Tompson

In this work, we show how the original distribution ratio estimation objective may be transformed in a principled manner to yield a completely off-policy objective.

Imitation Learning Reinforcement Learning (RL)

Adaptive Adversarial Imitation Learning

no code implementations 25 Sep 2019 Yiren Lu, Jonathan Tompson, Sergey Levine

We present the ADaptive Adversarial Imitation Learning (ADAIL) algorithm for learning adaptive policies that can be transferred between environments of varying dynamics, by imitating a small number of demonstrations collected from a single source domain.

Imitation Learning

Inducing Stronger Object Representations in Deep Visual Trackers

no code implementations 25 Sep 2019 Ross Goroshin, Jonathan Tompson, Debidatta Dwibedi

Fully convolutional deep correlation networks are integral components of state-of-the-art approaches to single object visual tracking.

Object Saliency Detection +1

Beyond Photo Realism for Domain Adaptation from Synthetic Data

no code implementations 4 Sep 2019 Kristofer Schlachter, Connor DeFanti, Sebastian Herscher, Ken Perlin, Jonathan Tompson

As synthetic imagery is used more frequently in training deep models, it is important to understand how different synthesis techniques impact the performance of such models.

Denoising Domain Adaptation +1

Learning Latent Plans from Play

1 code implementation 5 Mar 2019 Corey Lynch, Mohi Khansari, Ted Xiao, Vikash Kumar, Jonathan Tompson, Sergey Levine, Pierre Sermanet

Learning from play (LfP) offers three main advantages: 1) It is cheap.

Robotics

Learning Actionable Representations from Visual Observations

no code implementations 2 Aug 2018 Debidatta Dwibedi, Jonathan Tompson, Corey Lynch, Pierre Sermanet

In this work we explore a new approach for robots to teach themselves about the world simply by observing it.

Continuous Control

Discovery of Latent 3D Keypoints via End-to-end Geometric Reasoning

1 code implementation NeurIPS 2018 Supasorn Suwajanakorn, Noah Snavely, Jonathan Tompson, Mohammad Norouzi

We demonstrate this framework on 3D pose estimation by proposing a differentiable objective that seeks the optimal set of keypoints for recovering the relative pose between two views of an object.

3D Pose Estimation

Towards Accurate Multi-person Pose Estimation in the Wild

no code implementations CVPR 2017 George Papandreou, Tyler Zhu, Nori Kanazawa, Alexander Toshev, Jonathan Tompson, Chris Bregler, Kevin Murphy

Trained on COCO data alone, our final system achieves average precision of 0.649 on the COCO test-dev set and 0.643 on the test-standard set, outperforming the winner of the 2016 COCO keypoints challenge and other recent state-of-the-art methods.

Human Detection Keypoint Detection +1

Accelerating Eulerian Fluid Simulation With Convolutional Networks

1 code implementation ICML 2017 Jonathan Tompson, Kristofer Schlachter, Pablo Sprechmann, Ken Perlin

Efficient simulation of the Navier-Stokes equations for fluid flow is a long-standing problem in applied mathematics, for which state-of-the-art methods require large compute resources.

Efficient ConvNet-Based Marker-Less Motion Capture in General Scenes With a Low Number of Cameras

no code implementations CVPR 2015 Ahmed Elhayek, Edilson de Aguiar, Arjun Jain, Jonathan Tompson, Leonid Pishchulin, Micha Andriluka, Chris Bregler, Bernt Schiele, Christian Theobalt

Our approach unites a discriminative image-based joint detection method with a model-based generative motion tracking algorithm through a combined pose optimization energy.

Pose Estimation

MoDeep: A Deep Learning Framework Using Motion Features for Human Pose Estimation

no code implementations 28 Sep 2014 Arjun Jain, Jonathan Tompson, Yann Lecun, Christoph Bregler

In this work, we propose a novel and efficient method for articulated human pose estimation in videos using a convolutional network architecture, which incorporates both color and motion features.

2D Human Pose Estimation Pose Estimation

Learning Human Pose Estimation Features with Convolutional Networks

1 code implementation 27 Dec 2013 Arjun Jain, Jonathan Tompson, Mykhaylo Andriluka, Graham W. Taylor, Christoph Bregler

This paper introduces a new architecture for human pose estimation using a multi-layer convolutional network architecture and a modified learning technique that learns low-level features and higher-level weak spatial models.

Object Recognition Pose Estimation +2
