Most modern approaches for domain adaptive semantic segmentation rely on continued access to source data during adaptation, which may be infeasible due to computational or privacy constraints.
As an attempt towards assessing the robustness of embodied navigation agents, we propose RobustNav, a framework to quantify the performance of embodied navigation agents when exposed to a wide variety of visual (affecting RGB inputs) and dynamics (affecting transition dynamics) corruptions.
Many existing approaches for unsupervised domain adaptation (UDA) focus on adapting under only data distribution shift and offer limited success under additional cross-domain label distribution shift.
We extensively benchmark against the baselines for SSAD and OSAD on our created data splits in THUMOS14 and ActivityNet 1.2, and demonstrate the effectiveness of the proposed UFA and IB methods.
We study the problem of active learning (AL) under a domain shift, called Active Domain Adaptation (Active DA).
By adjusting the auxiliary task weights to minimize the divergence between the surrogate prior and the true prior of the main task, we obtain a more accurate prior estimation, achieving the goal of minimizing the required amount of training data for the main task and avoiding a costly grid search.
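As a rough, illustrative sketch of this idea (not the paper's actual algorithm; all function names and the finite-difference optimizer are assumptions for the example), auxiliary-task weights can be tuned by gradient descent on the KL divergence between a weighted mixture of auxiliary-task priors (the surrogate) and the main task's true prior, avoiding a grid search over the weights:

```python
import numpy as np

def kl(p, q):
    """KL divergence KL(p || q) between discrete distributions."""
    return float(np.sum(p * np.log(p / q)))

def surrogate_prior(theta, aux_priors):
    """Surrogate prior: softmax-weighted mixture of auxiliary-task priors."""
    w = np.exp(theta)
    w /= w.sum()
    return w @ aux_priors

def tune_weights(true_prior, aux_priors, steps=500, lr=0.5, eps=1e-4):
    """Adjust auxiliary-task weights to minimize KL(true || surrogate)
    by finite-difference gradient descent -- no costly grid search."""
    theta = np.zeros(len(aux_priors))
    for _ in range(steps):
        base = kl(true_prior, surrogate_prior(theta, aux_priors))
        grad = np.zeros_like(theta)
        for i in range(len(theta)):
            bumped = theta.copy()
            bumped[i] += eps
            grad[i] = (kl(true_prior, surrogate_prior(bumped, aux_priors)) - base) / eps
        theta -= lr * grad
    return surrogate_prior(theta, aux_priors)

# Two auxiliary tasks whose class priors bracket the main task's true prior.
aux = np.array([[0.8, 0.2], [0.2, 0.8]])
true = np.array([0.6, 0.4])
fitted = tune_weights(true, aux)  # converges toward [0.6, 0.4]
```

In this toy setup the mixture weight that reproduces the true prior is 2/3 on the first auxiliary task, and the descent recovers it from a uniform initialization.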
This enables a seamless adaptation to changing dynamics (a different robot or floor type) by simply re-calibrating the visual odometry model -- circumventing the expense of re-training the navigation policy.
For domain generalization, the goal is to learn from a set of source domains to produce a single model that will best generalize to an unseen target domain.
Convolutional Neural Networks have been shown to be vulnerable to adversarial examples, which are known to lie in subspaces close to those where normal data lies, but are not naturally occurring and are of low probability.
We introduce TIDE, a framework and associated toolbox for analyzing the sources of error in object detection and instance segmentation algorithms.
Recent advances in deep reinforcement learning require a large amount of training data and generally result in representations that are often over-specialized to the target task.
We seek to learn a representation on a large annotated data source that generalizes to a target domain using limited new supervision.
Surprisingly, we find that slight differences in task have no measurable effect on the visual representation for both SqueezeNet and ResNet architectures.
Adversarial training is by far the most successful strategy for improving robustness of neural networks to adversarial attacks.
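A minimal sketch of adversarial training, shown here on logistic regression with FGSM (Fast Gradient Sign Method) attacks; this is an illustration of the general strategy, not any specific paper's recipe, and all names are chosen for the example:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, b, eps):
    """FGSM attack: perturb x in the direction that increases the
    logistic loss, bounded by an L-infinity ball of radius eps."""
    grad_x = (sigmoid(x @ w + b) - y)[:, None] * w  # d(loss)/dx per example
    return x + eps * np.sign(grad_x)

def adversarial_train(x, y, eps=0.1, lr=0.1, steps=300):
    """Adversarial training loop: at each step, craft adversarial
    examples against the current model and take a gradient step on them."""
    rng = np.random.default_rng(0)
    w = rng.normal(size=x.shape[1]) * 0.01
    b = 0.0
    for _ in range(steps):
        x_adv = fgsm(x, y, w, b, eps)
        p = sigmoid(x_adv @ w + b)
        w -= lr * x_adv.T @ (p - y) / len(y)
        b -= lr * float(np.mean(p - y))
    return w, b

# Toy linearly separable data with a margin larger than eps.
rng = np.random.default_rng(1)
x = np.vstack([rng.normal(loc=-2.0, size=(100, 2)),
               rng.normal(loc=+2.0, size=(100, 2))])
y = np.array([0] * 100 + [1] * 100)
w, b = adversarial_train(x, y)

# A robustly trained model should still classify FGSM-perturbed inputs well.
x_adv = fgsm(x, y, w, b, eps=0.1)
acc = np.mean((sigmoid(x_adv @ w + b) > 0.5) == (y == 1))
```

The inner attack and outer training step form the min-max structure that characterizes adversarial training; in deep networks the same loop runs with autograd in place of the hand-derived gradients.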
In this work, we investigate whether state-of-the-art object detection systems have equitable predictive performance on pedestrians with different skin tones.
In this paper, we present a new large-scale benchmark called Syn2Real, which consists of a synthetic domain rendered from 3D object models and two real-image domains containing the same object categories.
We propose a framework that learns a representation transferable across different domains and tasks in a label efficient manner.
We present a detailed theoretical analysis of the problem of multiple-source adaptation in the general stochastic scenario, extending known results that assume a single target labeling function.
Domain adaptation is critical for success in new, unseen environments.
We present the 2017 Visual Domain Adaptation (VisDA) dataset and challenge, a large-scale testbed for unsupervised domain adaptation across visual domains.
While fine-grained object recognition is an important problem in computer vision, current models are unlikely to accurately classify objects in the wild.
Existing methods for visual reasoning attempt to directly map inputs to outputs using black-box architectures without explicitly modeling the underlying reasoning processes.
Adversarial learning methods are a promising approach to training robust deep networks, and can generate complex samples across diverse domains.
In this paper, we introduce the first domain adaptive semantic segmentation method, proposing an unsupervised adversarial approach to pixel prediction problems.
Recent years have seen tremendous progress in still-image segmentation; however, the naïve application of these state-of-the-art algorithms to every video frame requires considerable computation and ignores the temporal continuity inherent in video.
Thus, our method transfers information commonly extracted from depth training data to a network which can extract that information from the RGB counterpart.
We address the difficult problem of distinguishing fine-grained object categories in low resolution images.
We propose a novel, more powerful combination of both distribution and pairwise image alignment, and remove the requirement for expensive annotation by using weakly aligned pairs of images in the source and target domains.
Quantification is the task of estimating the class distribution of a dataset.
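A standard quantification baseline is Adjusted Classify and Count: correct the raw positive-prediction rate using the classifier's true- and false-positive rates estimated on held-out data. A minimal sketch (function name is illustrative):

```python
import numpy as np

def adjusted_classify_and_count(preds, tpr, fpr):
    """Estimate positive-class prevalence p from binary predictions.
    The raw rate satisfies E[pred] = tpr * p + fpr * (1 - p); solve for p."""
    raw = float(np.mean(preds))
    p = (raw - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, p))  # clip to a valid probability

# A classifier with tpr=0.9, fpr=0.2 applied to data that is 30% positive
# yields an expected raw positive rate of 0.9 * 0.3 + 0.2 * 0.7 = 0.41.
preds = np.array([1] * 41 + [0] * 59)
est = adjusted_classify_and_count(preds, tpr=0.9, fpr=0.2)  # est ≈ 0.3
```

Note that the naive "classify and count" estimate (0.41) is biased whenever the classifier is imperfect; the adjustment removes that bias given accurate tpr/fpr estimates.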
Our approach proves to be especially useful in large scale settings with thousands of classes, where spatial and semantic interactions are very frequent and only weakly supervised detectors can be built due to a lack of bounding box annotations.
Recent reports suggest that a generic supervised deep CNN model trained on a large-scale dataset reduces, but does not remove, dataset bias on a standard benchmark.
We develop methods for detector learning which exploit joint training over both weak and strong labels and which transfer learned perceptual representations from strongly-labeled auxiliary tasks.
A major challenge in scaling object detection is the difficulty of obtaining labeled images for large numbers of categories.
The classic domain adaptation paradigm considers the world to be separated into stationary domains with clear boundaries between them.
In other words, are deep CNNs trained on large amounts of labeled data as susceptible to dataset bias as previous methods have been shown to be?
We evaluate whether features extracted from the activation of a deep convolutional network trained in a fully supervised fashion on a large, fixed set of object recognition tasks can be re-purposed to novel generic tasks.
Images seen during test time are often not from the same distribution as images used for learning.
Most successful object classification and detection methods rely on classifiers trained on large labeled datasets.
We present an algorithm that learns representations which explicitly compensate for domain mismatch and which can be efficiently realized as linear classifiers.
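One simple way to realize this kind of feature-space compensation is correlation alignment (CORAL-style), shown below as an illustrative sketch rather than the paper's original algorithm: whiten the source features, then re-color them with the target covariance so second-order statistics match before fitting a linear classifier.

```python
import numpy as np

def coral_align(source, target, reg=1e-3):
    """Whiten source features, then re-color them with the target
    covariance so second-order statistics match across domains."""
    cs = np.cov(source, rowvar=False) + reg * np.eye(source.shape[1])
    ct = np.cov(target, rowvar=False) + reg * np.eye(target.shape[1])

    def sqrt_m(m, inv=False):
        # Matrix (inverse) square root via eigendecomposition (m is SPD).
        vals, vecs = np.linalg.eigh(m)
        vals = np.clip(vals, 1e-12, None)
        pw = -0.5 if inv else 0.5
        return (vecs * vals ** pw) @ vecs.T

    return source @ sqrt_m(cs, inv=True) @ sqrt_m(ct)

# Target features have a very different covariance than the source.
rng = np.random.default_rng(0)
mix = np.array([[2.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.3, 0.0, 1.0]])
target = rng.normal(size=(500, 3)) @ mix
source = rng.normal(size=(500, 3))
aligned = coral_align(source, target)
```

After alignment, the source covariance matches the target covariance (up to regularization), so any linear classifier trained on the aligned source features sees target-like statistics; note this transform matches only second-order moments, not the means.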