Search Results for author: Trevor Darrell

Found 310 papers, 170 papers with code

Unsupervised Learning of Visual Sense Models for Polysemous Words

no code implementations • NeurIPS 2008 • Kate Saenko, Trevor Darrell

Polysemy is a problem for methods that exploit image search engines to build object category models.

An Additive Latent Feature Model for Transparent Object Recognition

no code implementations • NeurIPS 2009 • Mario Fritz, Gary Bradski, Sergey Karayev, Trevor Darrell, Michael J. Black

The appearance of a transparent patch is determined in part by the refraction of a background pattern through a transparent medium: the energy from the background usually dominates the patch appearance.

Object Object Recognition +2

Learning to Hash with Binary Reconstructive Embeddings

no code implementations • NeurIPS 2009 • Brian Kulis, Trevor Darrell

Fast retrieval methods are increasingly critical for many large-scale analysis tasks, and there have been several recent methods that attempt to learn hash functions for fast and accurate nearest neighbor searches.

Retrieval

Filtering Abstract Senses From Image Search Results

no code implementations • NeurIPS 2009 • Kate Saenko, Trevor Darrell

When faced with the task of learning a visual model based only on the name of an object, a common approach is to find images on the web that are associated with the object name, and then train a visual classifier from the search result.

Clustering Image Clustering +2

Factorized Latent Spaces with Structured Sparsity

no code implementations • NeurIPS 2010 • Yangqing Jia, Mathieu Salzmann, Trevor Darrell

Recent approaches to multi-view learning have shown that factorizing the information into parts that are shared across all views and parts that are private to each view could effectively account for the dependencies and independencies between the different input modalities.

Multi-View Learning Pose Estimation

Size Matters: Metric Visual Search Constraints from Monocular Metadata

no code implementations • NeurIPS 2010 • Mario Fritz, Kate Saenko, Trevor Darrell

Metric constraints are known to be highly discriminative for many objects, but if training is limited to data captured from a particular 3-D sensor, the quantity of training data may be severely limited.

Heavy-tailed Distances for Gradient Based Image Descriptors

no code implementations • NeurIPS 2011 • Yangqing Jia, Trevor Darrell

Many applications in computer vision measure the similarity between images or image patches based on some statistics such as oriented gradients.

Learning with Recursive Perceptual Representations

no code implementations • NeurIPS 2012 • Oriol Vinyals, Yangqing Jia, Li Deng, Trevor Darrell

The use of random projections is key to our method, as we show in the experiments section, in which we observe a consistent improvement over previous (often more complicated) methods on several vision and speech benchmarks.

Ranked #216 on Image Classification on CIFAR-10

Image Classification Object Recognition

Timely Object Recognition

no code implementations • NeurIPS 2012 • Sergey Karayev, Tobias Baumgartner, Mario Fritz, Trevor Darrell

On the timeliness measure, our method obtains at least 11% better performance.

Object object-detection +2

Efficient Learning of Domain-invariant Image Representations

no code implementations • 15 Jan 2013 • Judy Hoffman, Erik Rodner, Jeff Donahue, Trevor Darrell, Kate Saenko

We present an algorithm that learns representations which explicitly compensate for domain mismatch and which can be efficiently realized as linear classifiers.

Representation Learning

Why Size Matters: Feature Coding as Nystrom Sampling

no code implementations • 15 Jan 2013 • Oriol Vinyals, Yangqing Jia, Trevor Darrell

Recently, the computer vision and machine learning community has been in favor of feature extraction pipelines that rely on a coding step followed by a linear classifier, due to their overall simplicity, well understood properties of linear classifiers, and their computational efficiency.

Computational Efficiency

Semi-supervised Domain Adaptation with Instance Constraints

no code implementations • CVPR 2013 • Jeff Donahue, Judy Hoffman, Erik Rodner, Kate Saenko, Trevor Darrell

Most successful object classification and detection methods rely on classifiers trained on large labeled datasets.

Domain Adaptation General Classification +5

Towards Adapting ImageNet to Reality: Scalable Domain Adaptation with Implicit Low-rank Transformations

no code implementations • 20 Aug 2013 • Erik Rodner, Judy Hoffman, Jeff Donahue, Trevor Darrell, Kate Saenko

Images seen during test time are often not from the same distribution as images used for learning.

Domain Adaptation Scene Understanding

DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition

8 code implementations • 6 Oct 2013 • Jeff Donahue, Yangqing Jia, Oriol Vinyals, Judy Hoffman, Ning Zhang, Eric Tzeng, Trevor Darrell

We evaluate whether features extracted from the activation of a deep convolutional network trained in a fully supervised fashion on a large, fixed set of object recognition tasks can be re-purposed to novel generic tasks.

Clustering Domain Adaptation +3

2,863

Rich feature hierarchies for accurate object detection and semantic segmentation

29 code implementations • CVPR 2014 • Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik

We find that R-CNN outperforms OverFeat by a large margin on the 200-class ILSVRC2013 detection dataset.

Ranked #27 on Object Detection on PASCAL VOC 2007 (using extra training data)

Object Detection Semantic Segmentation

2,340

Recognizing Image Style

1 code implementation • 15 Nov 2013 • Sergey Karayev, Matthew Trentacoste, Helen Han, Aseem Agarwala, Trevor Darrell, Aaron Hertzmann, Holger Winnemoeller

The style of an image plays a significant role in how it is viewed, but style has received little attention in computer vision research.

Image Retrieval TAG

PANDA: Pose Aligned Networks for Deep Attribute Modeling

1 code implementation • CVPR 2014 • Ning Zhang, Manohar Paluri, Marc'Aurelio Ranzato, Trevor Darrell, Lubomir Bourdev

We propose a method for inferring human attributes (such as gender, hair style, clothes style, expression, action) from images of people under large variation of viewpoint, pose, appearance, articulation and occlusion.

Ranked #7 on Facial Attribute Classification on LFWA

Attribute Facial Attribute Classification +2

Modeling Radiometric Uncertainty for Vision with Tone-mapped Color Images

no code implementations • 27 Nov 2013 • Ayan Chakrabarti, Ying Xiong, Baochen Sun, Trevor Darrell, Daniel Scharstein, Todd Zickler, Kate Saenko

To produce images that are suitable for display, tone-mapping is widely used in digital cameras to map linear color measurements into narrow gamuts with limited dynamic range.

Tone Mapping

Deformable Part Descriptors for Fine-grained Recognition and Attribute Prediction

no code implementations • ICCV 2013 • Ning Zhang, Ryan Farrell, Forrest Iandola, Trevor Darrell

Recognizing objects in fine-grained domains can be extremely challenging due to the subtle differences between subcategories.

Ranked #25 on Fine-Grained Image Classification on CUB-200-2011

Attribute Fine-Grained Image Classification +1

Visual Concept Learning: Combining Machine Vision and Bayesian Generalization on Concept Hierarchies

no code implementations • NeurIPS 2013 • Yangqing Jia, Joshua T. Abbott, Joseph L. Austerweil, Tom Griffiths, Trevor Darrell

Learning a visual concept from a small number of positive examples is a significant challenge for machine learning algorithms.

One-Shot Adaptation of Supervised Deep Convolutional Models

no code implementations • 21 Dec 2013 • Judy Hoffman, Eric Tzeng, Jeff Donahue, Yangqing Jia, Kate Saenko, Trevor Darrell

In other words, are deep CNNs trained on large amounts of labeled data as susceptible to dataset bias as previous methods have been shown to be?

Domain Adaptation Image Classification

On learning to localize objects with minimal supervision

no code implementations • 5 Mar 2014 • Hyun Oh Song, Ross Girshick, Stefanie Jegelka, Julien Mairal, Zaid Harchaoui, Trevor Darrell

Learning to localize objects with minimal supervision is an important problem in computer vision, since large fully annotated datasets are extremely costly to obtain.

Ranked #35 on Weakly Supervised Object Detection on PASCAL VOC 2007

Weakly Supervised Object Detection

DenseNet: Implementing Efficient ConvNet Descriptor Pyramids

2 code implementations • 7 Apr 2014 • Forrest Iandola, Matt Moskewicz, Sergey Karayev, Ross Girshick, Trevor Darrell, Kurt Keutzer

Convolutional Neural Networks (CNNs) can provide accurate object classification.

General Classification Object +2

Detection Bank: An Object Detection Based Video Representation for Multimedia Event Recognition

no code implementations • 28 May 2014 • Tim Althoff, Hyun Oh Song, Trevor Darrell

While low-level image features have proven to be effective representations for visual recognition tasks such as object recognition and scene classification, they are inadequate to capture complex semantic meaning required to solve high-level visual tasks such as multimedia event detection and recognition.

Event Detection Object +5

Anytime Recognition of Objects and Scenes

no code implementations • CVPR 2014 • Sergey Karayev, Mario Fritz, Trevor Darrell

On suitable datasets, we can incorporate a semantic back-off strategy that gives maximally specific predictions for a desired level of accuracy; this provides a new view on the time course of human visual perception.

General Classification Object Recognition

Continuous Manifold Based Adaptation for Evolving Visual Domains

no code implementations • CVPR 2014 • Judy Hoffman, Trevor Darrell, Kate Saenko

The classic domain adaptation paradigm considers the world to be separated into stationary domains with clear boundaries between them.

Domain Adaptation

Learning Scalable Discriminative Dictionary with Sample Relatedness

no code implementations • CVPR 2014 • Jiashi Feng, Stefanie Jegelka, Shuicheng Yan, Trevor Darrell

We use sample relatedness information to improve the generalization of the learned dictionary.

Dictionary Learning Image Retrieval +3

Caffe: Convolutional Architecture for Fast Feature Embedding

2 code implementations • 20 Jun 2014 • Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, Trevor Darrell

The framework is a BSD-licensed C++ library with Python and MATLAB bindings for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.

Clustering Dimensionality Reduction +1

33,859

Weakly-supervised Discovery of Visual Pattern Configurations

no code implementations • NeurIPS 2014 • Hyun Oh Song, Yong Jae Lee, Stefanie Jegelka, Trevor Darrell

The increasing prominence of weakly labeled data nurtures a growing demand for object detection methods that can cope with minimal supervision.

Object object-detection +1

Part-based R-CNNs for Fine-grained Category Detection

no code implementations • 15 Jul 2014 • Ning Zhang, Jeff Donahue, Ross Girshick, Trevor Darrell

Semantic part localization can facilitate fine-grained categorization by explicitly isolating subtle appearance differences associated with specific object parts.

Ranked #63 on Fine-Grained Image Classification on CUB-200-2011

Fine-Grained Image Classification Object +2

LSDA: Large Scale Detection Through Adaptation

1 code implementation • NeurIPS 2014 • Judy Hoffman, Sergio Guadarrama, Eric Tzeng, Ronghang Hu, Jeff Donahue, Ross Girshick, Trevor Darrell, Kate Saenko

A major challenge in scaling object detection is the difficulty of obtaining labeled images for large numbers of categories.

Classification General Classification +2

Deformable Part Models are Convolutional Neural Networks

1 code implementation • CVPR 2015 • Ross Girshick, Forrest Iandola, Trevor Darrell, Jitendra Malik

Deformable part models (DPMs) and convolutional neural networks (CNNs) are two widely used tools for visual recognition.

Ranked #28 on Object Detection on PASCAL VOC 2007

Object Detection Rolling Shutter Correction

128

DeepSentiBank: Visual Sentiment Concept Classification with Deep Convolutional Neural Networks

1 code implementation • 30 Oct 2014 • Tao Chen, Damian Borth, Trevor Darrell, Shih-Fu Chang

Nearly one million Flickr images tagged with these ANPs are downloaded to train the classifiers of the concepts.

Classification General Classification +1

Do Convnets Learn Correspondence?

no code implementations • NeurIPS 2014 • Jonathan Long, Ning Zhang, Trevor Darrell

Convolutional neural nets (convnets) trained from massive labeled datasets have substantially improved the state-of-the-art in image classification and object detection.

Ranked #4 on Keypoint Detection on Pascal3D+

General Classification Image Classification +3

Fully Convolutional Networks for Semantic Segmentation

53 code implementations • CVPR 2015 • Jonathan Long, Evan Shelhamer, Trevor Darrell

Convolutional networks are powerful visual models that yield hierarchies of features.

Ranked #2 on Semantic Segmentation on SkyScapes-Lane

Multi-tissue Nucleus Segmentation Segmentation +2

394

Long-term Recurrent Convolutional Networks for Visual Recognition and Description

7 code implementations • CVPR 2015 • Jeff Donahue, Lisa Anne Hendricks, Marcus Rohrbach, Subhashini Venugopalan, Sergio Guadarrama, Kate Saenko, Trevor Darrell

Models based on deep convolutional networks have dominated recent image interpretation tasks; we investigate whether models which are also recurrent, or "temporally deep", are effective for tasks involving sequences, visual and otherwise.

Ranked #3 on Human Interaction Recognition on BIT

Retrieval Video Recognition

Detector Discovery in the Wild: Joint Multiple Instance and Representation Learning

no code implementations • CVPR 2015 • Judy Hoffman, Deepak Pathak, Trevor Darrell, Kate Saenko

We develop methods for detector learning which exploit joint training over both weak and strong labels and which transfer learned perceptual representations from strongly-labeled auxiliary tasks.

Multiple Instance Learning Representation Learning +1

Deep Domain Confusion: Maximizing for Domain Invariance

7 code implementations • 10 Dec 2014 • Eric Tzeng, Judy Hoffman, Ning Zhang, Kate Saenko, Trevor Darrell

Recent reports suggest that a generic supervised deep CNN model trained on a large-scale dataset reduces, but does not remove, dataset bias on a standard benchmark.

Ranked #6 on Domain Adaptation on Office-Caltech

Domain Adaptation Model Selection +1

Fully Convolutional Multi-Class Multiple Instance Learning

1 code implementation • 22 Dec 2014 • Deepak Pathak, Evan Shelhamer, Jonathan Long, Trevor Darrell

We propose a novel MIL formulation of multi-class semantic segmentation learning by a fully convolutional network.

Multiple Instance Learning Segmentation +1

Learning Compact Convolutional Neural Networks with Nested Dropout

no code implementations • 22 Dec 2014 • Chelsea Finn, Lisa Anne Hendricks, Trevor Darrell

Recently, nested dropout was proposed as a method for ordering representation units in autoencoders by their information content, without diminishing reconstruction cost.

End-to-End Training of Deep Visuomotor Policies

no code implementations • 2 Apr 2015 • Sergey Levine, Chelsea Finn, Trevor Darrell, Pieter Abbeel

Policy search methods can allow robots to learn control policies for a wide range of tasks, but practical applications of policy search often require hand-engineered components for perception, state estimation, and low-level control.

Sequence to Sequence -- Video to Text

4 code implementations • 3 May 2015 • Subhashini Venugopalan, Marcus Rohrbach, Jeff Donahue, Raymond Mooney, Trevor Darrell, Kate Saenko

Our LSTM model is trained on video-sentence pairs and learns to associate a sequence of video frames to a sequence of words in order to generate a description of the event in the video clip.

Caption Generation Language Modelling +1

Constrained Convolutional Neural Networks for Weakly Supervised Segmentation

1 code implementation • ICCV 2015 • Deepak Pathak, Philipp Krähenbühl, Trevor Darrell

We propose Constrained CNN (CCNN), a method which uses a novel loss function to optimize for any set of linear constraints on the output space (i.e., predicted label distribution) of a CNN.

Image Segmentation Semantic Segmentation +2

Deep Spatial Autoencoders for Visuomotor Learning

1 code implementation • 21 Sep 2015 • Chelsea Finn, Xin Yu Tan, Yan Duan, Trevor Darrell, Sergey Levine, Pieter Abbeel

Our method uses a deep spatial autoencoder to acquire a set of feature points that describe the environment for the current task, such as the positions of objects, and then learns a motion skill with these feature points using an efficient reinforcement learning method based on local linear models.

reinforcement-learning Reinforcement Learning (RL)

Simultaneous Deep Transfer Across Domains and Tasks

1 code implementation • ICCV 2015 • Eric Tzeng, Judy Hoffman, Trevor Darrell, Kate Saenko

Recent reports suggest that a generic supervised deep CNN model trained on a large-scale dataset reduces, but does not remove, dataset bias.

Domain Adaptation

Spatial Semantic Regularisation for Large Scale Object Detection

no code implementations • ICCV 2015 • Damian Mrowca, Marcus Rohrbach, Judy Hoffman, Ronghang Hu, Kate Saenko, Trevor Darrell

Our approach proves to be especially useful in large scale settings with thousands of classes, where spatial and semantic interactions are very frequent and only weakly supervised detectors can be built due to a lack of bounding box annotations.

Clustering Object +2

Quantification in-the-wild: data-sets and baselines

no code implementations • 16 Oct 2015 • Oscar Beijbom, Judy Hoffman, Evan Yao, Trevor Darrell, Alberto Rodriguez-Ramirez, Manuel Gonzalez-Rivero, Ove Hoegh-Guldberg

Quantification is the task of estimating the class-distribution of a data-set.

Time Series Time Series Analysis

Neural Module Networks

1 code implementation • CVPR 2016 • Jacob Andreas, Marcus Rohrbach, Trevor Darrell, Dan Klein

Visual question answering is fundamentally compositional in nature: a question like "where is the dog?"

Ranked #6 on Visual Question Answering (VQA) on VQA v1 test-std

Visual Question Answering

403

Grounding of Textual Phrases in Images by Reconstruction

3 code implementations • 12 Nov 2015 • Anna Rohrbach, Marcus Rohrbach, Ronghang Hu, Trevor Darrell, Bernt Schiele

We propose a novel approach which learns grounding by reconstructing a given phrase using an attention mechanism, which can be either latent or optimized directly.

Ranked #12 on Phrase Grounding on Flickr30k Entities Test

Language Modelling Natural Language Visual Grounding +2

218

Natural Language Object Retrieval

1 code implementation • CVPR 2016 • Ronghang Hu, Huazhe Xu, Marcus Rohrbach, Jiashi Feng, Kate Saenko, Trevor Darrell

In this paper, we address the task of natural language object retrieval, to localize a target object within a given image based on a natural language query of the object.

Ranked #12 on Referring Expression Comprehension on Talk2Car

Image Captioning Image Retrieval +4

112

Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data

1 code implementation • CVPR 2016 • Lisa Anne Hendricks, Subhashini Venugopalan, Marcus Rohrbach, Raymond Mooney, Kate Saenko, Trevor Darrell

Current deep caption models can only describe objects contained in paired image-sentence corpora, despite the fact that they are pre-trained with large object recognition datasets, namely ImageNet.

Image Captioning Novel Concepts +3

Deep Learning for Tactile Understanding From Visual and Haptic Data

no code implementations • 19 Nov 2015 • Yang Gao, Lisa Anne Hendricks, Katherine J. Kuchenbecker, Trevor Darrell

Robots which interact with the physical world will benefit from a fine-grained tactile understanding of objects and surfaces.

Compact Bilinear Pooling

6 code implementations • CVPR 2016 • Yang Gao, Oscar Beijbom, Ning Zhang, Trevor Darrell

Bilinear models have been shown to achieve impressive performance on a wide range of visual tasks, such as semantic segmentation, fine-grained recognition and face recognition.

Face Recognition Few-Shot Learning +3

218

Data-dependent Initializations of Convolutional Neural Networks

2 code implementations • 21 Nov 2015 • Philipp Krähenbühl, Carl Doersch, Jeff Donahue, Trevor Darrell

Convolutional Neural Networks spread through computer vision like a wildfire, impacting almost all visual tasks imaginable.

Image Classification object-detection +2

138

Mapping Images to Sentiment Adjective Noun Pairs with Factorized Neural Nets

no code implementations • 21 Nov 2015 • Takuya Narihira, Damian Borth, Stella X. Yu, Karl Ni, Trevor Darrell

We consider the visual sentiment task of mapping an image to an adjective noun pair (ANP) such as "cute baby".

Image Captioning

Fine-grained pose prediction, normalization, and recognition

no code implementations • 22 Nov 2015 • Ning Zhang, Evan Shelhamer, Yang Gao, Trevor Darrell

Pose variation and subtle differences in appearance are key challenges to fine-grained classification.

General Classification Pose Prediction

Auxiliary Image Regularization for Deep CNNs with Noisy Labels

no code implementations • 22 Nov 2015 • Samaneh Azadi, Jiashi Feng, Stefanie Jegelka, Trevor Darrell

Precisely labeled data sets with a sufficient number of samples are very important for training deep convolutional neural networks (CNNs).

Image Classification

Adapting Deep Visuomotor Representations with Weak Pairwise Constraints

no code implementations • 23 Nov 2015 • Eric Tzeng, Coline Devin, Judy Hoffman, Chelsea Finn, Pieter Abbeel, Sergey Levine, Kate Saenko, Trevor Darrell

We propose a novel, more powerful combination of both distribution and pairwise image alignment, and remove the requirement for expensive annotation by using weakly aligned pairs of images in the source and target domains.

Domain Adaptation

Constrained Structured Regression with Convolutional Neural Networks

no code implementations • 23 Nov 2015 • Deepak Pathak, Philipp Krähenbühl, Stella X. Yu, Trevor Darrell

We present a regression framework which models the output distribution of neural networks.

Intrinsic Image Decomposition regression

Sequence to Sequence - Video to Text

no code implementations • ICCV 2015 • Subhashini Venugopalan, Marcus Rohrbach, Jeffrey Donahue, Raymond Mooney, Trevor Darrell, Kate Saenko

Our LSTM model is trained on video-sentence pairs and learns to associate a sequence of video frames to a sequence of words in order to generate a description of the event in the video clip.

Caption Generation Language Modelling +1

Learning The Structure of Deep Convolutional Networks

no code implementations • ICCV 2015 • Jiashi Feng, Trevor Darrell

In this work, we develop a novel method for automatically learning aspects of the structure of a deep model, in order to improve its performance, especially when labeled training data are scarce.

Semi-Supervised Image Classification

Learning to Compose Neural Networks for Question Answering

3 code implementations • NAACL 2016 • Jacob Andreas, Marcus Rohrbach, Trevor Darrell, Dan Klein

We describe a question answering model that applies to both images and structured knowledge bases.

Question Answering reinforcement-learning +1

403

Segmentation from Natural Language Expressions

4 code implementations • 20 Mar 2016 • Ronghang Hu, Marcus Rohrbach, Trevor Darrell

To produce pixelwise segmentation for the language expression, we propose an end-to-end trainable recurrent and convolutional network model that jointly learns to process visual and linguistic information.

Ranked #16 on Referring Expression Segmentation on J-HMDB

Referring Expression Segmentation Segmentation +1

Generating Visual Explanations

no code implementations • 28 Mar 2016 • Lisa Anne Hendricks, Zeynep Akata, Marcus Rohrbach, Jeff Donahue, Bernt Schiele, Trevor Darrell

Clearly explaining a rationale for a classification decision to an end-user can be as important as the decision itself.

General Classification Sentence +1

Context Encoders: Feature Learning by Inpainting

11 code implementations • CVPR 2016 • Deepak Pathak, Philipp Krähenbühl, Jeff Donahue, Trevor Darrell, Alexei A. Efros

In order to succeed at this task, context encoders need to both understand the content of the entire image, as well as produce a plausible hypothesis for the missing part(s).

15,701

Fully Convolutional Networks for Semantic Segmentation

40 code implementations • CVPR 2015 • Evan Shelhamer, Jonathan Long, Trevor Darrell

Convolutional networks are powerful visual models that yield hierarchies of features.

Ranked #2 on Semantic Segmentation on NYU Depth v2 (Mean Accuracy metric)

Real-Time Semantic Segmentation Scene Segmentation +2

15,429

Adversarial Feature Learning

10 code implementations • 31 May 2016 • Jeff Donahue, Philipp Krähenbühl, Trevor Darrell

The ability of the Generative Adversarial Networks (GANs) framework to learn generative models mapping from simple latent distributions to arbitrarily complex data distributions has been demonstrated empirically, with compelling results showing that the latent space of such generators captures semantic variation in the data distribution.

9,136

Learning With Side Information Through Modality Hallucination

no code implementations • CVPR 2016 • Judy Hoffman, Saurabh Gupta, Trevor Darrell

Thus, our method transfers information commonly extracted from depth training data to a network which can extract that information from the RGB counterpart.

Hallucination object-detection +1

Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding

10 code implementations • EMNLP 2016 • Akira Fukui, Dong Huk Park, Daylen Yang, Anna Rohrbach, Trevor Darrell, Marcus Rohrbach

Approaches to multimodal pooling include element-wise product or sum, as well as concatenation of the visual and textual representations.

Ranked #1 on Visual Question Answering (VQA) on COCO Visual Question Answering (VQA) real images 1.0 multiple choice

Visual Grounding Visual Question Answering

699

Captioning Images with Diverse Objects

1 code implementation • CVPR 2017 • Subhashini Venugopalan, Lisa Anne Hendricks, Marcus Rohrbach, Raymond Mooney, Trevor Darrell, Kate Saenko

We propose minimizing a joint objective which can learn from these diverse data sources and leverage distributional semantic embeddings, enabling the model to generalize and describe novel objects outside of image-caption datasets.

Object Object Recognition

Clockwork Convnets for Video Semantic Segmentation

1 code implementation • 11 Aug 2016 • Evan Shelhamer, Kate Rakelly, Judy Hoffman, Trevor Darrell

Recent years have seen tremendous progress in still-image segmentation; however, the naïve application of these state-of-the-art algorithms to every video frame requires considerable computation and ignores the temporal continuity inherent in video.

Image Segmentation Scheduling +4

141

Utilizing Large Scale Vision and Text Datasets for Image Segmentation from Referring Expressions

no code implementations • 30 Aug 2016 • Ronghang Hu, Marcus Rohrbach, Subhashini Venugopalan, Trevor Darrell

Image segmentation from referring expressions is a joint vision and language modeling task, where the input is an image and a textual expression describing a particular region in the image, and the goal is to localize and segment the specific image region based on the given expression.

Image Captioning Image Segmentation +3

Learning Modular Neural Network Policies for Multi-Task and Multi-Robot Transfer

no code implementations • 22 Sep 2016 • Coline Devin, Abhishek Gupta, Trevor Darrell, Pieter Abbeel, Sergey Levine

Using deep reinforcement learning to train general purpose neural network policies alleviates some of the burden of manual representation engineering by using expressive policy classes, but exacerbates the challenge of data collection, since such methods tend to be less efficient than RL with low-dimensional, hand-designed representations.

reinforcement-learning Reinforcement Learning (RL) +2

Modeling Relationships in Referential Expressions with Compositional Modular Networks

2 code implementations • CVPR 2017 • Ronghang Hu, Marcus Rohrbach, Jacob Andreas, Trevor Darrell, Kate Saenko

In this paper we instead present a modular deep architecture capable of analyzing referential expressions into their component parts, identifying entities and relationships mentioned in the input expression and grounding them all in the scene.

Ranked #1 on Visual Question Answering (VQA) on Visual7W

Visual Question Answering (VQA)

745

End-to-end Learning of Driving Models from Large-scale Video Datasets

2 code implementations • CVPR 2017 • Huazhe Xu, Yang Gao, Fisher Yu, Trevor Darrell

Robust perception-action models should be learned from training data with diverse visual appearances and realistic behaviors, yet current approaches to deep visuomotor policy learning have been generally limited to in-situ models learned from a single vehicle or a simulation environment.

Scene Segmentation

218

FCNs in the Wild: Pixel-level Adversarial and Constraint-based Adaptation

3 code implementations • 8 Dec 2016 • Judy Hoffman, Dequan Wang, Fisher Yu, Trevor Darrell

In this paper, we introduce the first domain adaptive semantic segmentation method, proposing an unsupervised adversarial approach to pixel prediction problems.

Ranked #2 on Image-to-Image Translation on SYNTHIA Fall-to-Winter

Semantic Segmentation Synthetic-to-Real Translation

Attentive Explanations: Justifying Decisions and Pointing to the Evidence

no code implementations • 14 Dec 2016 • Dong Huk Park, Lisa Anne Hendricks, Zeynep Akata, Bernt Schiele, Trevor Darrell, Marcus Rohrbach

In contrast, humans can justify their decisions with natural language and point to the evidence in the visual world which led to their decisions.

Decision Making Question Answering +2

Learning Features by Watching Objects Move

1 code implementation • CVPR 2017 • Deepak Pathak, Ross Girshick, Piotr Dollár, Trevor Darrell, Bharath Hariharan

Given the extensive evidence that motion plays a key role in the development of the human visual system, we hope that this straightforward approach to unsupervised learning will be more effective than cleverly designed 'pretext' tasks studied in the literature.

object-detection Object Detection +1

259

Paper
Code

Loss is its own Reward: Self-Supervision for Reinforcement Learning

no code implementations • 21 Dec 2016 • Evan Shelhamer, Parsa Mahmoudieh, Max Argus, Trevor Darrell

Reinforcement learning optimizes policies for expected cumulative reward.

Reinforcement Learning (RL) +1

Paper
Add Code

Visual Discovery at Pinterest

no code implementations • 15 Feb 2017 • Andrew Zhai, Dmitry Kislyuk, Yushi Jing, Michael Feng, Eric Tzeng, Jeff Donahue, Yue Li Du, Trevor Darrell

Over the past three years Pinterest has experimented with several visual search and recommendation services, including Related Pins (2014), Similar Looks (2015), Flashlight (2016) and Lens (2017).

Object Detection

Paper
Add Code

Adversarial Discriminative Domain Adaptation

20 code implementations • CVPR 2017 • Eric Tzeng, Judy Hoffman, Kate Saenko, Trevor Darrell

Adversarial learning methods are a promising approach to training robust deep networks, and can generate complex samples across diverse domains.

Ranked #3 on Unsupervised Image-To-Image Translation on SVHN-to-MNIST

General Classification Unsupervised Domain Adaptation +1

3,144

Paper
Code

Learning Detection with Diverse Proposals

1 code implementation • CVPR 2017 • Samaneh Azadi, Jiashi Feng, Trevor Darrell

To predict a set of diverse and informative proposals with enriched representations, this paper introduces a differentiable Determinantal Point Process (DPP) layer that is able to augment the object detection architectures.

Object Object Detection +1

Paper
Code

Learning to Reason: End-to-End Module Networks for Visual Question Answering

1 code implementation • ICCV 2017 • Ronghang Hu, Jacob Andreas, Marcus Rohrbach, Trevor Darrell, Kate Saenko

Natural language questions are inherently compositional, and many are most easily answered by reasoning about their decomposition into modular sub-problems.

Ranked #43 on Visual Question Answering (VQA) on VQA v2 test-dev

Visual Dialog Visual Question Answering

270

Paper
Code

Generalized orderless pooling performs implicit salient matching

2 code implementations • ICCV 2017 • Marcel Simon, Yang Gao, Trevor Darrell, Joachim Denzler, Erik Rodner

In this paper, we generalize average and bilinear pooling to "alpha-pooling", allowing for learning the pooling strategy during training.

Paper
Code

Curiosity-driven Exploration by Self-supervised Prediction

14 code implementations • ICML 2017 • Deepak Pathak, Pulkit Agrawal, Alexei A. Efros, Trevor Darrell

In many real-world scenarios, rewards extrinsic to the agent are extremely sparse, or absent altogether.

Ranked #2 on Unsupervised Reinforcement Learning on URLB (pixels, 10^5 frames)

Unsupervised Reinforcement Learning

31,072

Paper
Code

Deep Layer Aggregation

7 code implementations • CVPR 2018 • Fisher Yu, Dequan Wang, Evan Shelhamer, Trevor Darrell

We augment standard architectures with deeper aggregation to better fuse information across layers.

Image Classification

29,735

Paper
Code

Localizing Moments in Video with Natural Language

2 code implementations • ICCV 2017 • Lisa Anne Hendricks, Oliver Wang, Eli Shechtman, Josef Sivic, Trevor Darrell, Bryan Russell

A key obstacle to training our MCN model is that current video datasets do not include pairs of localized video segments and referring expressions, or text descriptions which uniquely identify a corresponding moment.

Natural Language Queries

182

Paper
Code

Deep Object-Centric Representations for Generalizable Robot Learning

1 code implementation • 14 Aug 2017 • Coline Devin, Pieter Abbeel, Trevor Darrell, Sergey Levine

We devise an object-level attentional mechanism that can be used to determine relevant objects from a few trajectories or demonstrations, and then immediately incorporate those objects into a learned policy.

Object Reinforcement Learning (RL)

Paper
Code

Fooling Vision and Language Models Despite Localization and Attention Mechanism

no code implementations • CVPR 2018 • Xiaojun Xu, Xinyun Chen, Chang Liu, Anna Rohrbach, Trevor Darrell, Dawn Song

Our work sheds new light on understanding adversarial attacks on vision systems which have a language component and shows that attention, bounding box localization, and compositional internal structures are vulnerable to adversarial attacks.

Dense Captioning Natural Language Understanding +2

Paper
Add Code

Gradient-free Policy Architecture Search and Adaptation

no code implementations • 16 Oct 2017 • Sayna Ebrahimi, Anna Rohrbach, Trevor Darrell

We develop a method for policy architecture search and adaptation via gradient-free optimization which can learn to perform autonomous driving tasks.

Autonomous Driving Neural Architecture Search

Paper
Add Code

CyCADA: Cycle-Consistent Adversarial Domain Adaptation

3 code implementations • ICML 2018 • Judy Hoffman, Eric Tzeng, Taesung Park, Jun-Yan Zhu, Phillip Isola, Kate Saenko, Alexei A. Efros, Trevor Darrell

Domain adaptation is critical for success in new, unseen environments.

Ranked #1 on Unsupervised Image-To-Image Translation on SVHN-to-MNIST

Domain Adaptation Semantic Segmentation +2

3,144

Paper
Code

Attentive Explanations: Justifying Decisions and Pointing to the Evidence (Extended Abstract)

no code implementations • 17 Nov 2017 • Dong Huk Park, Lisa Anne Hendricks, Zeynep Akata, Anna Rohrbach, Bernt Schiele, Trevor Darrell, Marcus Rohrbach

We also introduce a multimodal methodology for generating visual and textual explanations simultaneously.

Question Answering Visual Question Answering (VQA)

Paper
Add Code

Grounding Visual Explanations (Extended Abstract)

no code implementations • 17 Nov 2017 • Lisa Anne Hendricks, Ronghang Hu, Trevor Darrell, Zeynep Akata

Existing models which generate textual explanations enforce task relevance through a discriminative term loss function, but such mechanisms only weakly constrain mentioned object parts to actually be present in the image.

Attribute

Paper
Add Code

SkipNet: Learning Dynamic Routing in Convolutional Networks

2 code implementations • ECCV 2018 • Xin Wang, Fisher Yu, Zi-Yi Dou, Trevor Darrell, Joseph E. Gonzalez

While deeper convolutional networks are needed to achieve maximum accuracy in visual perception tasks, for many inputs shallower networks are sufficient.

Decision Making

233

Paper
Code

Learning to Segment Every Thing

3 code implementations • CVPR 2018 • Ronghang Hu, Piotr Dollár, Kaiming He, Trevor Darrell, Ross Girshick

Most methods for object instance segmentation require all training examples to be labeled with segmentation masks.

Instance Segmentation Segmentation +1

26,137

Paper
Code

Toward Multimodal Image-to-Image Translation

6 code implementations • NeurIPS 2017 • Jun-Yan Zhu, Richard Zhang, Deepak Pathak, Trevor Darrell, Alexei A. Efros, Oliver Wang, Eli Shechtman

Our proposed method encourages bijective consistency between the latent encoding and output modes.

Ranked #2 on Multimodal Unsupervised Image-To-Image Translation on Edge-to-Shoes

Image-to-Image Translation Translation

15,701

Paper
Code

Multi-Content GAN for Few-Shot Font Style Transfer

6 code implementations • CVPR 2018 • Samaneh Azadi, Matthew Fisher, Vladimir Kim, Zhaowen Wang, Eli Shechtman, Trevor Darrell

In this work, we focus on the challenge of taking partial observations of highly-stylized text and generalizing the observations to generate unobserved glyphs in the ornamented typeface.

Font Style Transfer

442

Paper
Code

Recasting Gradient-Based Meta-Learning as Hierarchical Bayes

no code implementations • ICLR 2018 • Erin Grant, Chelsea Finn, Sergey Levine, Trevor Darrell, Thomas Griffiths

Meta-learning allows an intelligent agent to leverage prior learning episodes as a basis for quickly improving performance on a novel task.

Meta-Learning

Paper
Add Code

Reinforcement Learning from Imperfect Demonstrations

no code implementations • ICLR 2018 • Yang Gao, Huazhe Xu, Ji Lin, Fisher Yu, Sergey Levine, Trevor Darrell

We propose a unified reinforcement learning algorithm, Normalized Actor-Critic (NAC), that effectively normalizes the Q-function, reducing the Q-values of actions unseen in the demonstration data.

Reinforcement Learning (RL)

Paper
Add Code

Multimodal Explanations: Justifying Decisions and Pointing to the Evidence

1 code implementation • CVPR 2018 • Dong Huk Park, Lisa Anne Hendricks, Zeynep Akata, Anna Rohrbach, Bernt Schiele, Trevor Darrell, Marcus Rohrbach

We propose a multimodal approach to explanation, and argue that the two modalities provide complementary explanatory strengths.

Activity Recognition Explainable Models +2

Paper
Code

Women also Snowboard: Overcoming Bias in Captioning Models

2 code implementations • ECCV 2018 • Kaylee Burns, Lisa Anne Hendricks, Kate Saenko, Trevor Darrell, Anna Rohrbach

We introduce a new Equalizer model that ensures equal gender probability when gender evidence is occluded in a scene and confident predictions when gender evidence is present.

Image Captioning

169

Paper
Code

Zero-Shot Visual Imitation

1 code implementation • ICLR 2018 • Deepak Pathak, Parsa Mahmoudieh, Guanghao Luo, Pulkit Agrawal, Dian Chen, Yide Shentu, Evan Shelhamer, Jitendra Malik, Alexei A. Efros, Trevor Darrell

In our framework, the role of the expert is only to communicate the goals (i.e., what to imitate) during inference.

Imitation Learning

203

Paper
Code

BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning

4 code implementations • CVPR 2020 • Fisher Yu, Haofeng Chen, Xin Wang, Wenqi Xian, Yingying Chen, Fangchen Liu, Vashisht Madhavan, Trevor Darrell

Datasets drive vision progress, yet existing driving datasets are impoverished in terms of visual content and supported tasks to study multitask learning for autonomous driving.

Ranked #5 on Multiple Object Tracking on BDD100K test

Autonomous Driving Domain Adaptation +8

395

Paper
Code

Few-Shot Segmentation Propagation with Guided Networks

1 code implementation • 25 May 2018 • Kate Rakelly, Evan Shelhamer, Trevor Darrell, Alexei A. Efros, Sergey Levine

Learning-based methods for visual segmentation have made progress on particular types of segmentation tasks, but are limited by the necessary supervision, the narrow definitions of fixed tasks, and the lack of control during inference for correcting errors.

Interactive Segmentation Segmentation +3

154

Paper
Code

Deep Mixture of Experts via Shallow Embedding

no code implementations • 5 Jun 2018 • Xin Wang, Fisher Yu, Lisa Dunlap, Yi-An Ma, Ruth Wang, Azalia Mirhoseini, Trevor Darrell, Joseph E. Gonzalez

Larger networks generally have greater representational power at the cost of increased computational complexity.

Few-Shot Learning Zero-Shot Learning

Paper
Add Code

Speaker-Follower Models for Vision-and-Language Navigation

1 code implementation • NeurIPS 2018 • Daniel Fried, Ronghang Hu, Volkan Cirik, Anna Rohrbach, Jacob Andreas, Louis-Philippe Morency, Taylor Berg-Kirkpatrick, Kate Saenko, Dan Klein, Trevor Darrell

We use this speaker model to (1) synthesize new instructions for data augmentation and to (2) implement pragmatic reasoning, which evaluates how well candidate action sequences explain an instruction.

Data Augmentation Vision and Language Navigation

124

Paper
Code

Learning Instance Segmentation by Interaction

1 code implementation • 21 Jun 2018 • Deepak Pathak, Yide Shentu, Dian Chen, Pulkit Agrawal, Trevor Darrell, Sergey Levine, Jitendra Malik

The agent uses its current segmentation model to infer pixels that constitute objects and refines the segmentation model by interacting with these pixels.

Instance Segmentation Segmentation +1

Paper
Code

Generating Counterfactual Explanations with Natural Language

no code implementations • 26 Jun 2018 • Lisa Anne Hendricks, Ronghang Hu, Trevor Darrell, Zeynep Akata

We call such textual explanations counterfactual explanations, and propose an intuitive method to generate counterfactual explanations by inspecting which evidence in an input is missing, but might contribute to a different classification decision if present in the image.

Classification counterfactual +2

Paper
Add Code

Women also Snowboard: Overcoming Bias in Captioning Models (Extended Abstract)

no code implementations • 2 Jul 2018 • Lisa Anne Hendricks, Kaylee Burns, Kate Saenko, Trevor Darrell, Anna Rohrbach

Most machine learning methods are known to capture and exploit biases of the training data.

Image Captioning

Paper
Add Code

Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical Guarantees

2 code implementations • ICLR 2019 • Yuping Luo, Huazhe Xu, Yuanzhi Li, Yuandong Tian, Trevor Darrell, Tengyu Ma

Model-based reinforcement learning (RL) is considered to be a promising approach to reduce the sample complexity that hinders model-free RL.

Continuous Control Model-based Reinforcement Learning +3

Paper
Code

Compositional GAN: Learning Image-Conditional Binary Composition

1 code implementation • 19 Jul 2018 • Samaneh Azadi, Deepak Pathak, Sayna Ebrahimi, Trevor Darrell

Generative Adversarial Networks (GANs) can produce images of remarkable complexity and realism but are generally structured to sample from a single latent source ignoring the explicit spatial interaction between multiple entities that could be present in a scene.

Paper
Code

Explainable Neural Computation via Stack Neural Module Networks

1 code implementation • ECCV 2018 • Ronghang Hu, Jacob Andreas, Trevor Darrell, Kate Saenko

In complex inferential tasks like question answering, machine learning models must confront two challenges: the need to implement a compositional reasoning process, and, in many applications, the need for this reasoning process to be interpretable to assist users in both development and prediction.

Ranked #14 on Referring Expression Comprehension on Talk2Car

Decision Making Question Answering +1

Paper
Code

Grounding Visual Explanations

no code implementations • ECCV 2018 • Lisa Anne Hendricks, Ronghang Hu, Trevor Darrell, Zeynep Akata

Our model improves the textual explanation quality of fine-grained classification decisions on the CUB dataset by mentioning phrases that are grounded in the image.

General Classification Sentence

Paper
Add Code

Textual Explanations for Self-Driving Vehicles

2 code implementations • ECCV 2018 • Jinkyu Kim, Anna Rohrbach, Trevor Darrell, John Canny, Zeynep Akata

Finally, we explore a version of our model that generates rationalizations, and compare with introspective explanations on the same video segments.

Paper
Code

Large-Scale Study of Curiosity-Driven Learning

4 code implementations • ICLR 2019 • Yuri Burda, Harri Edwards, Deepak Pathak, Amos Storkey, Trevor Darrell, Alexei A. Efros

However, annotating each environment with hand-designed, dense rewards is not scalable, motivating the need for developing reward functions that are intrinsic to the agent.

Ranked #14 on Atari Games on Atari 2600 Montezuma's Revenge

Atari Games SNES Games

799

Paper
Code

Localizing Moments in Video with Temporal Language

1 code implementation • EMNLP 2018 • Lisa Anne Hendricks, Oliver Wang, Eli Shechtman, Josef Sivic, Trevor Darrell, Bryan Russell

To benchmark whether our model, and other recent video localization models, can effectively reason about temporal language, we collect the novel TEMPOral reasoning in video and language (TEMPO) dataset.

Natural Language Queries Retrieval +1

Paper
Code

Object Hallucination in Image Captioning

1 code implementation • EMNLP 2018 • Anna Rohrbach, Lisa Anne Hendricks, Kaylee Burns, Trevor Darrell, Kate Saenko

Despite continuously improving performance, contemporary image captioning models are prone to "hallucinating" objects that are not actually in a scene.

Hallucination Image Captioning +2

Paper
Code

Uncertainty-guided Lifelong Learning in Bayesian Networks

no code implementations • 27 Sep 2018 • Sayna Ebrahimi, Mohamed Elhoseiny, Trevor Darrell, Marcus Rohrbach

Sequentially learning tasks that arrive in a continuous stream is a complex problem and becomes more challenging when the model has a fixed capacity.

Continual Learning

Paper
Add Code

Rethinking the Value of Network Pruning

2 code implementations • ICLR 2019 • Zhuang Liu, Ming-Jie Sun, Tinghui Zhou, Gao Huang, Trevor Darrell

Our observations are consistent for multiple network architectures, datasets, and tasks, which imply that: 1) training a large, over-parameterized model is often not necessary to obtain an efficient final model, 2) learned "important" weights of the large model are typically not useful for the small pruned model, 3) the pruned architecture itself, rather than a set of inherited "important" weights, is more crucial to the efficiency in the final model, which suggests that in some cases pruning can be useful as an architecture search paradigm.

Network Pruning Neural Architecture Search

1,496

Paper
Code

Discriminator Rejection Sampling

1 code implementation • ICLR 2019 • Samaneh Azadi, Catherine Olsson, Trevor Darrell, Ian Goodfellow, Augustus Odena

We propose a rejection sampling scheme using the discriminator of a GAN to approximately correct errors in the GAN generator distribution.

Image Generation

Paper
Code

Modular Architecture for StarCraft II with Deep Reinforcement Learning

no code implementations • 8 Nov 2018 • Dennis Lee, Haoran Tang, Jeffrey O. Zhang, Huazhe Xu, Trevor Darrell, Pieter Abbeel

We present a novel modular architecture for StarCraft II AI.

Reinforcement Learning (RL) +2

Paper
Add Code

Deep Object-Centric Policies for Autonomous Driving

no code implementations • 13 Nov 2018 • Dequan Wang, Coline Devin, Qi-Zhi Cai, Fisher Yu, Trevor Darrell

While learning visuomotor skills in an end-to-end manner is appealing, deep neural networks are often uninterpretable and fail in surprising ways.

Autonomous Driving Object

Paper
Add Code

Joint Monocular 3D Vehicle Detection and Tracking

1 code implementation • ICCV 2019 • Hou-Ning Hu, Qi-Zhi Cai, Dequan Wang, Ji Lin, Min Sun, Philipp Krähenbühl, Trevor Darrell, Fisher Yu

The framework can not only associate detections of vehicles in motion over time, but also estimate their complete 3D bounding box information from a sequence of 2D images captured on a moving platform.

Ranked #12 on Multiple Object Tracking on KITTI Tracking test

3D Object Detection 3D Pose Estimation +4

652

Paper
Code

Disentangling Propagation and Generation for Video Prediction

1 code implementation • ICCV 2019 • Hang Gao, Huazhe Xu, Qi-Zhi Cai, Ruth Wang, Fisher Yu, Trevor Darrell

A dynamic scene has two types of elements: those that move fluidly and can be predicted from previous frames, and those which are disoccluded (exposed) and cannot be extrapolated.

Predict Future Video Frames

Paper
Code

SPLAT: Semantic Pixel-Level Adaptation Transforms for Detection

no code implementations • 3 Dec 2018 • Eric Tzeng, Kaylee Burns, Kate Saenko, Trevor Darrell

Without dense labels, as is the case when only detection labels are available in the source, transformations are learned using CycleGAN alignment.

Domain Adaptation Pseudo Label +1

Paper
Add Code

Spatio-Temporal Action Graph Networks

1 code implementation • 4 Dec 2018 • Roei Herzig, Elad Levi, Huijuan Xu, Hang Gao, Eli Brosh, Xiaolong Wang, Amir Globerson, Trevor Darrell

Events defined by the interaction of objects in a scene are often of critical importance; yet important events may have insufficient labeled examples to train a conventional deep model to generalize to future object appearance.

Activity Recognition Autonomous Driving +3

Paper
Code

Few-shot Object Detection via Feature Reweighting

4 code implementations • ICCV 2019 • Bingyi Kang, Zhuang Liu, Xin Wang, Fisher Yu, Jiashi Feng, Trevor Darrell

The feature learner extracts meta features that are generalizable to detect novel object classes, using training data from base classes with sufficient samples.

Ranked #21 on Few-Shot Object Detection on MS-COCO (30-shot)

Few-Shot Learning Few-Shot Object Detection +3

521

Paper
Code

Generalized Zero- and Few-Shot Learning via Aligned Variational Autoencoders

2 code implementations • 5 Dec 2018 • Edgar Schönfeld, Sayna Ebrahimi, Samarth Sinha, Trevor Darrell, Zeynep Akata

Many approaches in generalized zero-shot learning rely on cross-modal mapping between the image feature space and the class embedding space.

Ranked #2 on Generalized Few-Shot Learning on AwA2

Few-Shot Learning Generalized Few-Shot Learning +1

281

Paper
Code

Adversarial Inference for Multi-Sentence Video Description

1 code implementation • CVPR 2019 • Jae Sung Park, Marcus Rohrbach, Trevor Darrell, Anna Rohrbach

Among the main issues are the fluency and coherence of the generated descriptions, and their relevance to the video.

Image Captioning Sentence +1

Paper
Code

Hierarchical Discrete Distribution Decomposition for Match Density Estimation

2 code implementations • CVPR 2019 • Zhichao Yin, Trevor Darrell, Fisher Yu

Explicit representations of the global match distributions of pixel-wise correspondences between pairs of images are desirable for uncertainty estimation and downstream applications.

Ranked #13 on Optical Flow Estimation on KITTI 2015 (train)

Density Estimation Optical Flow Estimation +2

202

Paper
Code

Similarity R-C3D for Few-shot Temporal Activity Detection

no code implementations • 25 Dec 2018 • Huijuan Xu, Bingyi Kang, Ximeng Sun, Jiashi Feng, Kate Saenko, Trevor Darrell

In this paper, we present a conceptually simple and general yet novel framework for few-shot temporal activity detection which detects the start and end time of the few-shot input activities in an untrimmed video.

Action Detection Activity Detection

Paper
Add Code

Robust Change Captioning

1 code implementation • ICCV 2019 • Dong Huk Park, Trevor Darrell, Anna Rohrbach

We present a novel Dual Dynamic Attention Model (DUDA) to perform robust Change Captioning.

Natural Language Visual Grounding

Paper
Code

Learning to Control Self-Assembling Morphologies: A Study of Generalization via Modularity

1 code implementation • NeurIPS 2019 • Deepak Pathak, Chris Lu, Trevor Darrell, Phillip Isola, Alexei A. Efros

We evaluate the performance of these dynamic and modular agents in simulated environments.

111

Paper
Code

Efficient Receptive Field Learning by Dynamic Gaussian Structure

no code implementations • ICLR Workshop LLD 2019 • Evan Shelhamer, Dequan Wang, Trevor Darrell

The visual world is vast and varied, but its variations divide into structured and unstructured factors.

Representation Learning

Paper
Add Code

Cross-Linked Variational Autoencoders for Generalized Zero-Shot Learning

no code implementations • ICLR Workshop LLD 2019 • Edgar Schönfeld, Sayna Ebrahimi, Samarth Sinha, Trevor Darrell, Zeynep Akata

While following the same direction, we also take artificial feature generation one step further and propose a model where a shared latent space of image features and class embeddings is learned by aligned variational autoencoders, for the purpose of generating latent features to train a softmax classifier.

Few-Shot Learning Generalized Zero-Shot Learning

Paper
Add Code

Compositional GAN (Extended Abstract): Learning Image-Conditional Binary Composition

no code implementations • ICLR Workshop DeepGenStruct 2019 • Samaneh Azadi, Deepak Pathak, Sayna Ebrahimi, Trevor Darrell

Generative Adversarial Networks (GANs) can produce images of surprising complexity and realism but are generally structured to sample from a single latent source ignoring the explicit spatial interaction between multiple entities that could be present in a scene.

Paper
Add Code

Variational Adversarial Active Learning

6 code implementations • ICCV 2019 • Samarth Sinha, Sayna Ebrahimi, Trevor Darrell

Unlike conventional active learning algorithms, our approach is task agnostic, i.e., it does not depend on the performance of the task for which we are trying to acquire labeled data.

Active Learning Image Classification +1

217

Paper
Code

TAFE-Net: Task-Aware Feature Embeddings for Low Shot Learning

1 code implementation • CVPR 2019 • Xin Wang, Fisher Yu, Ruth Wang, Trevor Darrell, Joseph E. Gonzalez

We show that TAFE-Net is highly effective in generalizing to new tasks or concepts and evaluate the TAFE-Net on a range of benchmarks in zero-shot and few-shot learning.

Ranked #1 on Few-Shot Image Classification on aPY - 0-Shot

Attribute Few-Shot Learning +1

Paper
Code

Semi-supervised Domain Adaptation via Minimax Entropy

3 code implementations • ICCV 2019 • Kuniaki Saito, Donghyun Kim, Stan Sclaroff, Trevor Darrell, Kate Saenko

Contemporary domain adaptation methods are very effective at aligning feature distributions of source and target domains without any target supervision.

Domain Adaptation Semi-supervised Domain Adaptation

288

Paper
Code

Blurring the Line Between Structure and Learning to Optimize and Adapt Receptive Fields

no code implementations • 25 Apr 2019 • Evan Shelhamer, Dequan Wang, Trevor Darrell

Adapting receptive fields by dynamic Gaussian structure further improves results, equaling the accuracy of free-form deformation while improving efficiency.

Semantic Segmentation

Paper
Add Code

Meta-Learning to Guide Segmentation

no code implementations • ICLR 2019 • Kate Rakelly, Evan Shelhamer, Trevor Darrell, Alexei A. Efros, Sergey Levine

To explore generalization, we analyze guidance as a bridge between different levels of supervision to segment classes as the union of instances.

Meta-Learning Segmentation

Paper
Add Code

Accurate Visual Localization for Automotive Applications

1 code implementation • 1 May 2019 • Eli Brosh, Matan Friedmann, Ilan Kadar, Lev Yitzhak Lavy, Elad Levi, Shmuel Rippa, Yair Lempert, Bruno Fernandez-Ruiz, Roei Herzig, Trevor Darrell

We propose a hybrid coarse-to-fine approach that leverages visual and GPS location cues.

Retrieval Visual Localization

Paper
Code

Language-Conditioned Graph Networks for Relational Reasoning

1 code implementation • ICCV 2019 • Ronghang Hu, Anna Rohrbach, Trevor Darrell, Kate Saenko

E.g., conditioning on the "on" relationship to the plate, the object "mug" gathers messages from the object "plate" to update its representation to "mug on the plate", which can be easily consumed by a simple classifier for answer prediction.

Ranked #3 on Referring Expression Comprehension on CLEVR-Ref+

Object Referring Expression Comprehension +2

Paper
Code

Monocular Plan View Networks for Autonomous Driving

no code implementations • 16 May 2019 • Dequan Wang, Coline Devin, Qi-Zhi Cai, Philipp Krähenbühl, Trevor Darrell

Convolutions on monocular dash cam videos capture spatial invariances in the image plane but do not explicitly reason about distances and depth.

3D Object Detection Autonomous Driving +1

Paper
Add Code

Are You Looking? Grounding to Multiple Modalities in Vision-and-Language Navigation

no code implementations • ACL 2019 • Ronghang Hu, Daniel Fried, Anna Rohrbach, Dan Klein, Trevor Darrell, Kate Saenko

The actual grounding can connect language to the environment through multiple modalities, e.g., "stop at the door" might ground into visual objects, while "turn right" might rely only on the geometric structure of a route.

Vision and Language Navigation

Paper
Add Code

Uncertainty-guided Continual Learning with Bayesian Neural Networks

2 code implementations • ICLR 2020 • Sayna Ebrahimi, Mohamed Elhoseiny, Trevor Darrell, Marcus Rohrbach

Continual learning aims to learn new tasks without forgetting previously learned ones.

Continual Learning

Paper
Code

Task-Aware Feature Generation for Zero-Shot Compositional Learning

1 code implementation • 11 Jun 2019 • Xin Wang, Fisher Yu, Trevor Darrell, Joseph E. Gonzalez

In this work, we propose a task-aware feature generation (TFG) framework for compositional learning, which generates features of novel visual concepts by transferring knowledge from previously seen concepts.

Novel Concepts Zero-Shot Learning

Paper
Code

Dynamic Scale Inference by Entropy Minimization

no code implementations • 8 Aug 2019 • Dequan Wang, Evan Shelhamer, Bruno Olshausen, Trevor Darrell

Given the variety of the visual world there is not one true scale for recognition: objects may appear at drastically different sizes across the visual field.

Semantic Segmentation

Paper
Add Code

Weakly-Supervised Trajectory Segmentation for Learning Reusable Skills

no code implementations • 25 Sep 2019 • Parsa Mahmoudieh, Trevor Darrell, Deepak Pathak

Instead of direct manual supervision which is tedious and prone to bias, in this work, our goal is to extract reusable skills from a collection of human demonstrations collected directly for several end-tasks.

Multiple Instance Learning Segmentation

Paper
Add Code

Blurring Structure and Learning to Optimize and Adapt Receptive Fields

no code implementations • 25 Sep 2019 • Evan Shelhamer, Dequan Wang, Trevor Darrell

Adapting receptive fields by dynamic Gaussian structure further improves results, equaling the accuracy of free-form deformation while improving efficiency.

Semantic Segmentation

Paper
Add Code

Composable Semi-parametric Modelling for Long-range Motion Generation

no code implementations • 25 Sep 2019 • Jingwei Xu, Huazhe Xu, Bingbing Ni, Xiaokang Yang, Trevor Darrell

Learning diverse and natural behaviors is one of the longstanding goals for creating intelligent characters in the animated world.

Paper
Add Code

Scoring-Aggregating-Planning: Learning task-agnostic priors from interactions and sparse rewards for zero-shot generalization

no code implementations • 25 Sep 2019 • Huazhe Xu, Boyuan Chen, Yang Gao, Trevor Darrell

In this paper, we propose Scoring-Aggregating-Planning (SAP), a framework that can learn task-agnostic semantics and dynamics priors from interactions of arbitrary quality, together with the corresponding sparse rewards, and then plan on unseen tasks in a zero-shot condition.

Zero-shot Generalization

Paper
Add Code

Unsupervised Domain Adaptation through Self-Supervision

3 code implementations • 26 Sep 2019 • Yu Sun, Eric Tzeng, Trevor Darrell, Alexei A. Efros

This paper addresses unsupervised domain adaptation, the setting where labeled training data is available on a source domain, but the goal is to have good performance on a target domain with only unlabeled data.

Ranked #63 on Synthetic-to-Real Translation on GTAV-to-Cityscapes Labels

Unsupervised Domain Adaptation

Paper
Code

Zero-shot Policy Learning with Spatial Temporal Reward Decomposition on Contingency-aware Observation

1 code implementation • 17 Oct 2019 • Huazhe Xu, Boyuan Chen, Yang Gao, Trevor Darrell

The agent is first presented with previous experiences in the training environment, along with a task description in the form of trajectory-level sparse rewards.

Continuous Control Model Predictive Control +2

Paper
Code

Regularization Matters in Policy Optimization

2 code implementations • 21 Oct 2019 • Zhuang Liu, Xuanlin Li, Bingyi Kang, Trevor Darrell

In this work, we present the first comprehensive study of regularization techniques with multiple policy optimization algorithms on continuous control tasks.

Continuous Control Reinforcement Learning (RL)

Paper
Code

Exploring Simple and Transferable Recognition-Aware Image Processing

1 code implementation • 21 Oct 2019 • Zhuang Liu, Hung-Ju Wang, Tinghui Zhou, Zhiqiang Shen, Bingyi Kang, Evan Shelhamer, Trevor Darrell

Interestingly, the processing model's ability to enhance recognition quality can transfer when evaluated on models of different architectures, recognized categories, tasks and training datasets.

Image Retrieval Recommendation Systems

Paper
Code

Plan Arithmetic: Compositional Plan Vectors for Multi-Task Control

no code implementations • 30 Oct 2019 • Coline Devin, Daniel Geng, Pieter Abbeel, Trevor Darrell, Sergey Levine

We show that CPVs can be learned within a one-shot imitation learning framework without any additional supervision or information about task hierarchy, and enable a demonstration-conditioned policy to generalize to tasks that sequence twice as many skills as the tasks seen during training.

Imitation Learning

Paper
Add Code

Iterative Answer Prediction with Pointer-Augmented Multimodal Transformers for TextVQA

1 code implementation • CVPR 2020 • Ronghang Hu, Amanpreet Singh, Trevor Darrell, Marcus Rohrbach

Recent work has explored the TextVQA task that requires reading and understanding text in images to answer a question.

General Classification

Paper
Code

Semantic Bottleneck Scene Generation

2 code implementations • 26 Nov 2019 • Samaneh Azadi, Michael Tschannen, Eric Tzeng, Sylvain Gelly, Trevor Darrell, Mario Lucic

For the former, we use an unconditional progressive segmentation generation network that captures the distribution of realistic semantic scene layouts.

Ranked #1 on Image Generation on Cityscapes-5K 256x512

Conditional Image Generation Image-to-Image Translation +2

Paper
Code

Compositional Plan Vectors

1 code implementation • NeurIPS 2019 • Coline Devin, Daniel Geng, Pieter Abbeel, Trevor Darrell, Sergey Levine

Imitation Learning

Paper
Code

Learning Canonical Representations for Scene Graph to Image Generation

2 code implementations • ECCV 2020 • Roei Herzig, Amir Bar, Huijuan Xu, Gal Chechik, Trevor Darrell, Amir Globerson

Generating realistic images of complex visual scenes becomes challenging when one wishes to control the structure of the generated images.

Ranked #3 on Layout-to-Image Generation on Visual Genome 256x256

Layout-to-Image Generation Scene Generation

Paper
Code

Something-Else: Compositional Action Recognition with Spatial-Temporal Interaction Networks

1 code implementation • CVPR 2020 • Joanna Materzynska, Tete Xiao, Roei Herzig, Huijuan Xu, Xiaolong Wang, Trevor Darrell

Human action is naturally compositional: humans can easily recognize and perform actions with objects that are different from those used in training demonstrations.

Action Recognition Object

137

Paper
Code

Towards Practical Multi-Object Manipulation using Relational Reinforcement Learning

1 code implementation • 23 Dec 2019 • Richard Li, Allan Jabri, Trevor Darrell, Pulkit Agrawal

Learning robotic manipulation tasks using reinforcement learning with sparse rewards is currently impractical due to the outrageous data requirements.

Object reinforcement-learning +2

Paper
Code

Meta-Baseline: Exploring Simple Meta-Learning for Few-Shot Learning

10 code implementations • ICCV 2021 • Yinbo Chen, Zhuang Liu, Huijuan Xu, Trevor Darrell, Xiaolong Wang

The edge between these two lines of work has so far been underexplored, and the effectiveness of meta-learning in few-shot learning remains unclear.

Few-Shot Learning General Classification

587

Paper
Code

Un-Mix: Rethinking Image Mixtures for Unsupervised Visual Representation Learning

3 code implementations • 11 Mar 2020 • Zhiqiang Shen, Zechun Liu, Zhuang Liu, Marios Savvides, Trevor Darrell, Eric Xing

This drawback hinders the model from learning subtle variance and fine-grained information.

Representation Learning

148

Paper
Code

Frustratingly Simple Few-Shot Object Detection

4 code implementations • ICML 2020 • Xin Wang, Thomas E. Huang, Trevor Darrell, Joseph E. Gonzalez, Fisher Yu

Such a simple approach outperforms the meta-learning methods by roughly 2~20 points on current benchmarks and sometimes even doubles the accuracy of the prior methods.

Ranked #17 on Few-Shot Object Detection on MS-COCO (30-shot)

Few-Shot Object Detection Meta-Learning +2

1,059

Paper
Code

Adversarial Continual Learning

1 code implementation • ECCV 2020 • Sayna Ebrahimi, Franziska Meier, Roberto Calandra, Trevor Darrell, Marcus Rohrbach

We show that shared features are significantly less prone to forgetting and propose a novel hybrid continual learning framework that learns a disjoint representation for task-invariant and task-specific features required to solve a sequence of tasks.

Continual Learning Image Classification

251

Paper
Code

Revisiting Few-shot Activity Detection with Class Similarity Control

no code implementations • 31 Mar 2020 • Huijuan Xu, Ximeng Sun, Eric Tzeng, Abir Das, Kate Saenko, Trevor Darrell

In this paper, we present a conceptually simple and general yet novel framework for few-shot temporal activity detection based on proposal regression which detects the start and end time of the activities in untrimmed videos.

Action Detection Activity Detection +1

Paper
Add Code

Weakly-Supervised Action Localization with Expectation-Maximization Multi-Instance Learning

1 code implementation • ECCV 2020 • Zhekun Luo, Devin Guillory, Baifeng Shi, Wei Ke, Fang Wan, Trevor Darrell, Huijuan Xu

Weakly-supervised action localization requires training a model to localize the action segments in a video given only video-level action labels.

Ranked #9 on Weakly Supervised Action Localization on THUMOS’14

Action Localization Multiple Instance Learning +2

Paper
Code

Spatio-Temporal Action Detection with Multi-Object Interaction

no code implementations • 1 Apr 2020 • Huijuan Xu, Lizhi Yang, Stan Sclaroff, Kate Saenko, Trevor Darrell

Spatio-temporal action detection in videos requires localizing the action both spatially and temporally in the form of an "action tube".

Action Detection Human Detection +2

Paper
Add Code

Contrastive Examples for Addressing the Tyranny of the Majority

no code implementations • 14 Apr 2020 • Viktoriia Sharmanska, Lisa Anne Hendricks, Trevor Darrell, Novi Quadrianto

Computer vision algorithms, e.g., for face recognition, favour groups of individuals that are better represented in the training data.

Face Recognition

Paper
Add Code

ParkPredict: Motion and Intent Prediction of Vehicles in Parking Lots

no code implementations • 21 Apr 2020 • Xu Shen, Ivo Batkovic, Vijay Govindarajan, Paolo Falcone, Trevor Darrell, Francesco Borrelli

We investigate the problem of predicting driver behavior in parking lots, an environment which is less structured than typical road networks and features complex, interactive maneuvers in a compact space.

Paper
Add Code

Rethinking preventing class-collapsing in metric learning with margin-based losses

no code implementations • ICCV 2021 • Elad Levi, Tete Xiao, Xiaolong Wang, Trevor Darrell

We theoretically prove and empirically show that under reasonable noise assumptions, margin-based losses tend to project all samples of a class with various modes onto a single point in the embedding space, resulting in a class collapse that usually renders the space ill-sorted for classification or retrieval.

Image Retrieval Metric Learning +1

Paper
Add Code

Quasi-Dense Similarity Learning for Multiple Object Tracking

3 code implementations • CVPR 2021 • Jiangmiao Pang, Linlu Qiu, Xia Li, Haofeng Chen, Qi Li, Trevor Darrell, Fisher Yu

Compared to methods with similar detectors, it improves MOTA by almost 10 points and significantly decreases the number of ID switches on the BDD100K and Waymo datasets.

Ranked #1 on One-Shot Object Detection on PASCAL VOC 2012 val

Contrastive Learning Metric Learning +4

378

Paper
Code

Tent: Fully Test-time Adaptation by Entropy Minimization

2 code implementations • ICLR 2021 • Dequan Wang, Evan Shelhamer, Shaoteng Liu, Bruno Olshausen, Trevor Darrell

A model must adapt itself to generalize to new and different data during testing.

General Classification Image Classification +3

320

Paper
Code

Compositional Video Synthesis with Action Graphs

1 code implementation • 27 Jun 2020 • Amir Bar, Roei Herzig, Xiaolong Wang, Anna Rohrbach, Gal Chechik, Trevor Darrell, Amir Globerson

Our generative model for this task (AG2Vid) disentangles motion and appearance features, and by incorporating a scheduling mechanism for actions facilitates a timely and coordinated video generation.

Scheduling Video Generation +2

Paper
Code

Video Prediction via Example Guidance

1 code implementation • ICML 2020 • Jingwei Xu, Huazhe Xu, Bingbing Ni, Xiaokang Yang, Trevor Darrell

In video prediction tasks, one major challenge is to capture the multi-modal nature of future contents and dynamics.

Video Prediction

Paper
Code

Seeing the Un-Scene: Learning Amodal Semantic Maps for Room Navigation

no code implementations • ECCV 2020 • Medhini Narasimhan, Erik Wijmans, Xinlei Chen, Trevor Darrell, Dhruv Batra, Devi Parikh, Amanpreet Singh

We also demonstrate that reducing the task of room navigation to point navigation improves the performance further.

Navigate

Paper
Add Code

Body2Hands: Learning to Infer 3D Hands from Conversational Gesture Body Dynamics

1 code implementation • CVPR 2021 • Evonne Ng, Shiry Ginosar, Trevor Darrell, Hanbyul Joo

We demonstrate the efficacy of our method on hand gesture synthesis from body motion input, and as a strong body prior for single-view image-based 3D hand pose estimation.

3D Hand Pose Estimation

104

Paper
Code

What Should Not Be Contrastive in Contrastive Learning

no code implementations • ICLR 2021 • Tete Xiao, Xiaolong Wang, Alexei A. Efros, Trevor Darrell

Recent self-supervised contrastive methods have been able to produce impressive transferable visual representations by learning to be invariant to different data augmentations.

Contrastive Learning

Paper
Add Code

Identity-Aware Multi-Sentence Video Description

1 code implementation • ECCV 2020 • Jae Sung Park, Trevor Darrell, Anna Rohrbach

This auxiliary task allows us to propose a two-stage approach to Identity-Aware Video Description.

Gender Prediction Sentence +1

Paper
Code

Hierarchical Style-based Networks for Motion Synthesis

no code implementations • ECCV 2020 • Jingwei Xu, Huazhe Xu, Bingbing Ni, Xiaokang Yang, Xiaolong Wang, Trevor Darrell

Generating diverse and natural human motion is one of the long-standing goals for creating intelligent characters in the animated world.

Motion Synthesis

Paper
Add Code

ePointDA: An End-to-End Simulation-to-Real Domain Adaptation Framework for LiDAR Point Cloud Segmentation

no code implementations • 7 Sep 2020 • Sicheng Zhao, Yezhen Wang, Bo Li, Bichen Wu, Yang Gao, Pengfei Xu, Trevor Darrell, Kurt Keutzer

They require prior knowledge of real-world statistics and ignore the pixel-level dropout noise gap and the spatial feature gap between different domains.

Autonomous Driving Domain Adaptation +3

Paper
Add Code

SelfAugment: Automatic Augmentation Policies for Self-Supervised Learning

1 code implementation • CVPR 2021 • Colorado J Reed, Sean Metzger, Aravind Srinivas, Trevor Darrell, Kurt Keutzer

A common practice in unsupervised representation learning is to use labeled data to evaluate the quality of the learned representations.

Data Augmentation Representation Learning +1

Paper
Code

Reducing Class Collapse in Metric Learning with Easy Positive Sampling

no code implementations • 28 Sep 2020 • Elad Levi, Tete Xiao, Xiaolong Wang, Trevor Darrell

We theoretically prove and empirically show that under reasonable noise assumptions, prevalent embedding losses in metric learning, e.g., triplet loss, tend to project all samples of a class with various modes onto a single point in the embedding space, resulting in a class collapse that usually renders the space ill-sorted for classification or retrieval.

Image Retrieval Metric Learning +1

Paper
Add Code

Remembering for the Right Reasons: Explanations Reduce Catastrophic Forgetting

1 code implementation • ICLR 2021 • Sayna Ebrahimi, Suzanne Petryk, Akash Gokul, William Gan, Joseph E. Gonzalez, Marcus Rohrbach, Trevor Darrell

The goal of continual learning (CL) is to learn a sequence of tasks without suffering from the phenomenon of catastrophic forgetting.

Continual Learning

Paper
Code

Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation

no code implementations • CVPR 2021 • Bo Li, Yezhen Wang, Shanghang Zhang, Dongsheng Li, Trevor Darrell, Kurt Keutzer, Han Zhao

First, we provide a finite sample bound for both classification and regression problems under Semi-DA.

regression Semi-supervised Domain Adaptation +2

Paper
Add Code

Auxiliary Task Reweighting for Minimum-data Learning

no code implementations • NeurIPS 2020 • Baifeng Shi, Judy Hoffman, Kate Saenko, Trevor Darrell, Huijuan Xu

By adjusting the auxiliary task weights to minimize the divergence between the surrogate prior and the true prior of the main task, we obtain a more accurate prior estimation, achieving the goal of minimizing the required amount of training data for the main task and avoiding a costly grid search.

Domain Adaptation Multi-Label Classification

Paper
Add Code

Modular Networks for Compositional Instruction Following

no code implementations • NAACL 2021 • Rodolfo Corona, Daniel Fried, Coline Devin, Dan Klein, Trevor Darrell

In our approach, subgoal modules each carry out natural language instructions for a specific subgoal type.

Instruction Following

Paper
Add Code

Fighting Copycat Agents in Behavioral Cloning from Observation Histories

no code implementations • NeurIPS 2020 • Chuan Wen, Jierui Lin, Trevor Darrell, Dinesh Jayaraman, Yang Gao

Imitation learning trains policies to map from input observations to the actions that an expert would choose.

Imitation Learning

Paper
Add Code

Temporal Action Detection with Multi-level Supervision

no code implementations • ICCV 2021 • Baifeng Shi, Qi Dai, Judy Hoffman, Kate Saenko, Trevor Darrell, Huijuan Xu

We extensively benchmark against the baselines for SSAD and OSAD on our created data splits in THUMOS14 and ActivityNet1.2, and demonstrate the effectiveness of the proposed UFA and IB methods.

Action Detection Semi-Supervised Action Detection

Paper
Add Code

Minimax Active Learning

no code implementations • 18 Dec 2020 • Sayna Ebrahimi, William Gan, Dian Chen, Giscard Biamby, Kamyar Salahi, Michael Laielli, Shizhan Zhu, Trevor Darrell

Active learning aims to develop label-efficient algorithms by querying the most representative samples to be labeled by a human annotator.

Active Learning Clustering +2

Paper
Add Code

Contrastive Video Textures

no code implementations • 1 Jan 2021 • Medhini Narasimhan, Shiry Ginosar, Andrew Owens, Alexei A Efros, Trevor Darrell

By randomly traversing edges with high transition probabilities, we generate diverse temporally smooth videos with novel sequences and transitions.

Contrastive Learning Video Generation

Paper
Add Code

Regularization Matters in Policy Optimization - An Empirical Study on Continuous Control

1 code implementation • ICLR 2021 • Zhuang Liu, Xuanlin Li, Bingyi Kang, Trevor Darrell

In this work, we present the first comprehensive study of regularization techniques with multiple policy optimization algorithms on continuous control tasks.

Continuous Control

Paper
Code

Novelty Detection with Rotated Contrastive Predictive Coding

no code implementations • 1 Jan 2021 • Dong Huk Park, Trevor Darrell

To this end, reconstruction-based learning is often used in which the normality of an observation is expressed in how well it can be reconstructed.

Contrastive Learning Novelty Detection

Paper
Add Code

Discovering Autoregressive Orderings with Variational Inference

1 code implementation • ICLR 2021 • Xuanlin Li, Brandon Trabucco, Dong Huk Park, Michael Luo, Sheng Shen, Trevor Darrell, Yang Gao

One strategy to recover this information is to decode both the content and location of tokens.

Code Generation Image Captioning +2

Paper
Code

Unconditional Synthesis of Complex Scenes Using a Semantic Bottleneck

no code implementations • 1 Jan 2021 • Samaneh Azadi, Michael Tschannen, Eric Tzeng, Sylvain Gelly, Trevor Darrell, Mario Lucic

Coupling the high-fidelity generation capabilities of label-conditional image synthesis methods with the flexibility of unconditional generative models, we propose a semantic bottleneck GAN model for unconditional synthesis of complex scenes.

Image Generation Segmentation

Paper
Add Code

Instance-Aware Predictive Navigation in Multi-Agent Environments

1 code implementation • 14 Jan 2021 • Jinkun Cao, Xin Wang, Trevor Darrell, Fisher Yu

To decide the action at each step, we seek the action sequence that can lead to safe future states based on the prediction module outputs by repeatedly sampling likely action sequences.

Paper
Code

Monocular Quasi-Dense 3D Object Tracking

1 code implementation • 12 Mar 2021 • Hou-Ning Hu, Yung-Hsu Yang, Tobias Fischer, Trevor Darrell, Fisher Yu, Min Sun

Experiments on our proposed simulation data and real-world benchmarks, including KITTI, nuScenes, and Waymo datasets, show that our tracking framework offers robust object association and tracking on urban-driving scenarios.

Ranked #7 on Multiple Object Tracking on KITTI Tracking test

3D Object Tracking Autonomous Driving +3

505

Paper
Code

Self-Supervised Pretraining Improves Self-Supervised Pretraining

1 code implementation • 23 Mar 2021 • Colorado J. Reed, Xiangyu Yue, Ani Nrusimha, Sayna Ebrahimi, Vivek Vijaykumar, Richard Mao, Bo Li, Shanghang Zhang, Devin Guillory, Sean Metzger, Kurt Keutzer, Trevor Darrell

Through experimentation on 16 diverse vision datasets, we show HPT converges up to 80x faster, improves accuracy across tasks, and improves the robustness of the self-supervised pretraining process to changes in the image augmentation policy or amount of pretraining data.

Image Augmentation

Paper
Code
