Search Results for author: Debidatta Dwibedi

Found 16 papers, 7 papers with code

FlexCap: Generating Rich, Localized, and Flexible Captions in Images

no code implementations · 18 Mar 2024 · Debidatta Dwibedi, Vidhi Jain, Jonathan Tompson, Andrew Zisserman, Yusuf Aytar

The model, FlexCap, is trained to produce length-conditioned captions for input bounding boxes, and this allows control over the information density of its output, with descriptions ranging from concise object labels to detailed captions.

Attribute · Dense Captioning +8

RT-H: Action Hierarchies Using Language

no code implementations · 4 Mar 2024 · Suneel Belkhale, Tianli Ding, Ted Xiao, Pierre Sermanet, Quan Vuong, Jonathan Tompson, Yevgen Chebotar, Debidatta Dwibedi, Dorsa Sadigh

Predicting these language motions as an intermediate step between tasks and actions forces the policy to learn the shared structure of low-level motions across seemingly disparate tasks.

Imitation Learning

Q-Match: Self-Supervised Learning by Matching Distributions Induced by a Queue

1 code implementation · 10 Feb 2023 · Thomas Mulc, Debidatta Dwibedi

In semi-supervised learning, student-teacher distribution matching has been successful in improving performance of models using unlabeled data in conjunction with few labeled samples.
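The student-teacher distribution matching mentioned above can be illustrated with a minimal sketch: a sharpened teacher distribution serves as the target for a cross-entropy loss on the student's predicted distribution. This is a generic illustration of the idea, with made-up temperature values; it is not the exact Q-Match objective.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def distribution_matching_loss(student_logits, teacher_logits,
                               tau_student=0.1, tau_teacher=0.04):
    """Cross-entropy between teacher and student soft distributions.

    A hypothetical sketch of student-teacher distribution matching in
    general; the temperatures and exact form are assumptions, not the
    paper's specification. In practice the teacher branch would be
    detached from the gradient.
    """
    p_teacher = softmax(teacher_logits / tau_teacher)          # sharpened target
    log_p_student = np.log(softmax(student_logits / tau_student) + 1e-12)
    return -(p_teacher * log_p_student).sum(axis=-1).mean()
```

A student whose distribution agrees with the teacher's incurs a lower loss than one concentrated on a different class, which is the signal that lets unlabeled data shape the representation.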

Self-Supervised Learning

Visuomotor Control in Multi-Object Scenes Using Object-Aware Representations

no code implementations · 12 May 2022 · Negin Heravi, Ayzaan Wahid, Corey Lynch, Pete Florence, Travis Armstrong, Jonathan Tompson, Pierre Sermanet, Jeannette Bohg, Debidatta Dwibedi

Our self-supervised representations are learned by observing the agent freely interacting with different parts of the environment and are queried in two different settings: (i) policy learning and (ii) object location prediction.

Object · Object Localization +2

XIRL: Cross-embodiment Inverse Reinforcement Learning

1 code implementation · 7 Jun 2021 · Kevin Zakka, Andy Zeng, Pete Florence, Jonathan Tompson, Jeannette Bohg, Debidatta Dwibedi

We investigate the visual cross-embodiment imitation setting, in which agents learn policies from videos of other agents (such as humans) demonstrating the same task, but with stark differences in their embodiments -- shape, actions, end-effector dynamics, etc.
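One way to make rewards transfer across embodiments, as in this line of work, is to define them in a learned, embodiment-invariant embedding space rather than in pixel or state space. The sketch below assumes such an embedding already exists and simply scores a frame by its distance to a goal embedding; it is a simplified illustration, not XIRL's full training pipeline.

```python
import numpy as np

def embedding_distance_reward(frame_embedding, goal_embedding, scale=1.0):
    """Reward a frame by how close its embedding is to the goal's.

    Hypothetical sketch: `frame_embedding` and `goal_embedding` are
    assumed to come from a pre-trained, embodiment-invariant encoder.
    Reward is the negative Euclidean distance, so it is 0 at the goal
    and increasingly negative farther away.
    """
    return -scale * float(np.linalg.norm(frame_embedding - goal_embedding))
```

Because the reward depends only on embedding distances, the same function can score frames of a human demonstrator and a robot, which is what makes cross-embodiment imitation feasible in this setting.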

Reinforcement Learning (RL)

An Analysis of Object Representations in Deep Visual Trackers

no code implementations · 8 Jan 2020 · Ross Goroshin, Jonathan Tompson, Debidatta Dwibedi

Despite these strong priors, we show that deep trackers often default to tracking by saliency detection - without relying on the object instance representation.

Object · Saliency Detection +1

Inducing Stronger Object Representations in Deep Visual Trackers

no code implementations · 25 Sep 2019 · Ross Goroshin, Jonathan Tompson, Debidatta Dwibedi

Fully convolutional deep correlation networks are integral components of state-of-the-art approaches to single object visual tracking.

Object · Saliency Detection +1

Learning Actionable Representations from Visual Observations

no code implementations · 2 Aug 2018 · Debidatta Dwibedi, Jonathan Tompson, Corey Lynch, Pierre Sermanet

In this work we explore a new approach for robots to teach themselves about the world simply by observing it.

Continuous Control

Deep Cuboid Detection: Beyond 2D Bounding Boxes

1 code implementation · 30 Nov 2016 · Debidatta Dwibedi, Tomasz Malisiewicz, Vijay Badrinarayanan, Andrew Rabinovich

We present a Deep Cuboid Detector which takes a consumer-quality RGB image of a cluttered scene and localizes all 3D cuboids (box-like objects).
