no code implementations • 13 Oct 2021 • Matthias Tangemann, Steffen Schneider, Julius von Kügelgen, Francesco Locatello, Peter Gehler, Thomas Brox, Matthias Kümmerer, Matthias Bethge, Bernhard Schölkopf
Learning generative object models from unlabelled videos is a long standing problem and required for causal scene modeling.
Modern neural network architectures can leverage large amounts of data to generalize well within the training distribution.
Applying this procedure to state-of-the-art trajectory prediction methods on standard benchmark datasets shows that they are, in fact, unable to reason about interactions.
Contrastive learning allows us to flexibly define powerful losses by contrasting positive pairs from sets of negative samples.
An important component for generalization in machine learning is to uncover underlying latent factors of variation as well as the mechanism through which each factor acts in the world.
By training 240 representations and over 10, 000 reinforcement learning (RL) policies on a simulated robotic setup, we evaluate to what extent different properties of pretrained VAE-based representations affect the OOD generalization of downstream agents.
; and (ii) if the new predictions differ from the current ones, should we update?
Being able to spot defective parts is a critical component in large-scale industrial manufacturing.
Ranked #1 on Anomaly Detection on MVTec AD (using extra training data)
We therefore re-purpose the dataset from the Visual Domain Adaptation Challenge 2019 and use a subset of it as a new robustness benchmark (ImageNet-D) which proves to be a more challenging dataset for all current state-of-the-art models (58. 2% error) to guide future research efforts at the intersection of robustness and domain adaptation on ImageNet scale.
Ranked #1 on Unsupervised Domain Adaptation on ImageNet-C (using extra training data)
Learning how to model complex scenes in a modular way with recombinable components is a pre-requisite for higher-order reasoning and acting in the physical world.
In a previous work, it was shown that there is a curious problem with the benchmark ColorChecker dataset for illuminant estimation.
However, in challenging imaging conditions such as on low-resolution images or when the image is corrupted by imaging artifacts, current systems degrade considerably in accuracy.
We then fit (top-down) a recently published statistical body shape model, called SMPL, to the 2D joints.
Ranked #26 on 3D Human Pose Estimation on HumanEva-I
This paper considers the task of articulated human pose estimation of multiple people in real world images.
Ranked #2 on Multi-Person Pose Estimation on WAF
Object class detection has been a synonym for 2D bounding box localization for the longest time, fueled by the success of powerful statistical learning techniques, combined with robust image representations.
While the majority of today's object class models provide only 2D bounding boxes, far richer output hypotheses are desirable including viewpoint, fine-grained category, and 3D geometry estimate.
Despite the success of recent object class recognition systems, the long-standing problem of partial occlusion remains a major challenge, and a principled solution is yet to be found.