Search Results for author: Pierre Sermanet

Found 30 papers, 11 papers with code

RT-H: Action Hierarchies Using Language

no code implementations • 4 Mar 2024 • Suneel Belkhale, Tianli Ding, Ted Xiao, Pierre Sermanet, Quon Vuong, Jonathan Tompson, Yevgen Chebotar, Debidatta Dwibedi, Dorsa Sadigh

Predicting these language motions as an intermediate step between tasks and actions forces the policy to learn the shared structure of low-level motions across seemingly disparate tasks.

Imitation Learning

Video Language Planning

no code implementations • 16 Oct 2023 • Yilun Du, Mengjiao Yang, Pete Florence, Fei Xia, Ayzaan Wahid, Brian Ichter, Pierre Sermanet, Tianhe Yu, Pieter Abbeel, Joshua B. Tenenbaum, Leslie Kaelbling, Andy Zeng, Jonathan Tompson

We are interested in enabling visual planning for complex long-horizon tasks in the space of generated videos and language, leveraging recent advances in large generative models pretrained on Internet-scale data.

Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models

no code implementations • 21 Nov 2022 • Ted Xiao, Harris Chan, Pierre Sermanet, Ayzaan Wahid, Anthony Brohan, Karol Hausman, Sergey Levine, Jonathan Tompson

To accomplish this, we introduce Data-driven Instruction Augmentation for Language-conditioned control (DIAL): we utilize semi-supervised language labels leveraging the semantic understanding of CLIP to propagate knowledge onto large datasets of unlabelled demonstration data and then train language-conditioned policies on the augmented datasets.

Imitation Learning

Visuomotor Control in Multi-Object Scenes Using Object-Aware Representations

no code implementations • 12 May 2022 • Negin Heravi, Ayzaan Wahid, Corey Lynch, Pete Florence, Travis Armstrong, Jonathan Tompson, Pierre Sermanet, Jeannette Bohg, Debidatta Dwibedi

Our self-supervised representations are learned by observing the agent freely interacting with different parts of the environment and are queried in two different settings: (i) policy learning and (ii) object location prediction.

Object, Object Localization, +2

Broadly-Exploring, Local-Policy Trees for Long-Horizon Task Planning

no code implementations • 13 Oct 2020 • Brian Ichter, Pierre Sermanet, Corey Lynch

This task space can be quite general and abstract; its only requirements are to be sampleable and to well-cover the space of useful tasks.

Motion Planning

Learning to Play by Imitating Humans

no code implementations • 11 Jun 2020 • Rostam Dinyari, Pierre Sermanet, Corey Lynch

Acquiring multiple skills has commonly involved collecting a large number of expert demonstrations per task or engineering custom reward functions.

Motion2Vec: Semi-Supervised Representation Learning from Surgical Videos

no code implementations • 31 May 2020 • Ajay Kumar Tanwani, Pierre Sermanet, Andy Yan, Raghav Anand, Mariano Phielipp, Ken Goldberg

We demonstrate the use of this representation to imitate surgical suturing motions from publicly available videos of the JIGSAWS dataset.

Action Segmentation, Metric Learning, +1

Language Conditioned Imitation Learning over Unstructured Data

no code implementations • 15 May 2020 • Corey Lynch, Pierre Sermanet

Prior work in imitation learning typically requires each task be specified with a task id or goal image -- something that is often impractical in open-world environments.

Continuous Control, Imitation Learning, +2

Online Object Representations with Contrastive Learning

no code implementations • 10 Jun 2019 • Sören Pirk, Mohi Khansari, Yunfei Bai, Corey Lynch, Pierre Sermanet

We propose a self-supervised approach for learning representations of objects from monocular videos and demonstrate it is particularly useful in situated settings such as robotics.

Contrastive Learning, Object

Wasserstein Dependency Measure for Representation Learning

no code implementations • NeurIPS 2019 • Sherjil Ozair, Corey Lynch, Yoshua Bengio, Aaron van den Oord, Sergey Levine, Pierre Sermanet

Mutual information maximization has emerged as a powerful learning objective for unsupervised representation learning obtaining state-of-the-art performance in applications such as object recognition, speech recognition, and reinforcement learning.

Object Recognition, reinforcement-learning, +5

Learning Latent Plans from Play

1 code implementation • 5 Mar 2019 • Corey Lynch, Mohi Khansari, Ted Xiao, Vikash Kumar, Jonathan Tompson, Sergey Levine, Pierre Sermanet

Learning from play (LfP) offers three main advantages: 1) It is cheap.

Robotics

Learning Actionable Representations from Visual Observations

no code implementations • 2 Aug 2018 • Debidatta Dwibedi, Jonathan Tompson, Corey Lynch, Pierre Sermanet

In this work we explore a new approach for robots to teach themselves about the world simply by observing it.

Continuous Control

Time-Contrastive Networks: Self-Supervised Learning from Video

7 code implementations • 23 Apr 2017 • Pierre Sermanet, Corey Lynch, Yevgen Chebotar, Jasmine Hsu, Eric Jang, Stefan Schaal, Sergey Levine

While representations are learned from an unlabeled collection of task-related videos, robot behaviors such as pouring are learned by watching a single 3rd-person demonstration by a human.

Metric Learning, reinforcement-learning, +3

Unsupervised Perceptual Rewards for Imitation Learning

no code implementations • 20 Dec 2016 • Pierre Sermanet, Kelvin Xu, Sergey Levine

We present a method that is able to identify key intermediate steps of a task from only a handful of demonstration sequences, and automatically identify the most discriminative features for identifying these steps.

Imitation Learning, Reinforcement Learning (RL)

Attention for Fine-Grained Categorization

no code implementations • 22 Dec 2014 • Pierre Sermanet, Andrea Frome, Esteban Real

This paper presents experiments extending the work of Ba et al. (2014) on recurrent neural models for attention into less constrained visual environments, specifically fine-grained categorization on the Stanford Dogs data set.

Going Deeper with Convolutions

80 code implementations • CVPR 2015 • Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich

We propose a deep convolutional neural network architecture codenamed "Inception", which was responsible for setting the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC 2014).

General Classification, Image Classification, +2

OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks

4 code implementations • 21 Dec 2013 • Pierre Sermanet, David Eigen, Xiang Zhang, Michael Mathieu, Rob Fergus, Yann LeCun

This integrated framework is the winner of the localization task of the ImageNet Large Scale Visual Recognition Challenge 2013 (ILSVRC2013) and obtained very competitive results for the detection and classifications tasks.

General Classification, Image Classification, +2
