Search Results for author: Juergen Gall

Found 90 papers, 37 papers with code

Ranking Info Noise Contrastive Estimation: Boosting Contrastive Learning via Ranked Positives

1 code implementation 27 Jan 2022 David T. Hoffmann, Nadine Behrmann, Juergen Gall, Thomas Brox, Mehdi Noroozi

This paper introduces Ranking Info Noise Contrastive Estimation (RINCE), a new member in the family of InfoNCE losses that preserves a ranked ordering of positive samples.

Contrastive Learning Out-of-Distribution Detection +1

Adaptive Inverse Transform Sampling For Efficient Vision Transformers

no code implementations 30 Nov 2021 Mohsen Fayyaz, Soroush Abbasi Koohpayegani, Farnoush Rezaei Jafari, Sunando Sengupta, Hamid Reza Vaezi Joze, Eric Sommerlade, Hamed Pirsiavash, Juergen Gall

Our proposed module improves the SOTA by reducing the computational cost (GFLOPs) by 2x while preserving the accuracy of SOTA models on ImageNet, Kinetics-400, and Kinetics-600 datasets.

Image Classification Video Classification

Keypoint Message Passing for Video-based Person Re-Identification

no code implementations 16 Nov 2021 Di Chen, Andreas Doering, Shanshan Zhang, Jian Yang, Juergen Gall, Bernt Schiele

Video-based person re-identification (re-ID) is an important technique in visual surveillance systems which aims to match video snippets of people captured by different cameras.

Representation Learning Video-Based Person Re-Identification

Taylor Swift: Taylor Driven Temporal Modeling for Swift Future Frame Prediction

no code implementations 27 Oct 2021 Mohammad Saber Pourheydari, Mohsen Fayyaz, Emad Bahrami, Mehdi Noroozi, Juergen Gall

While recurrent neural networks (RNNs) demonstrate outstanding capabilities in future video frame prediction, they model dynamics in a discrete time space and sequentially go through all frames until the desired future temporal step is reached.

Frame

Long Short View Feature Decomposition via Contrastive Video Representation Learning

no code implementations ICCV 2021 Nadine Behrmann, Mohsen Fayyaz, Juergen Gall, Mehdi Noroozi

We argue that a single representation to capture both types of features is sub-optimal, and propose to decompose the representation space into stationary and non-stationary features via contrastive learning from long and short views, i.e., long video sequences and their shorter sub-sequences.

Action Recognition Action Segmentation +2

FIFA: Fast Inference Approximation for Action Segmentation

no code implementations 9 Aug 2021 Yaser Souri, Yazan Abu Farha, Fabien Despinoy, Gianpiero Francesca, Juergen Gall

We apply FIFA on top of state-of-the-art approaches for weakly supervised action segmentation and alignment as well as fully supervised action segmentation.

Weakly Supervised Action Segmentation (Transcript)

Using Visual Anomaly Detection for Task Execution Monitoring

1 code implementation 29 Jul 2021 Santosh Thoduka, Juergen Gall, Paul G. Plöger

Our method learns to predict the motions that occur during the nominal execution of a task, including camera and robot body motion.

Anomaly Detection Optical Flow Estimation

Towards Better Adversarial Synthesis of Human Images from Text

no code implementations 5 Jul 2021 Rania Briq, Pratika Kochar, Juergen Gall

This paper proposes an approach that generates multiple 3D human meshes from text.

Image Generation

Learning to Generate Novel Scene Compositions from Single Images and Videos

1 code implementation 12 May 2021 Vadim Sushko, Juergen Gall, Anna Khoreva

Training GANs in low-data regimes remains a challenge, as overfitting often leads to memorization or training divergence.

Generating Novel Scene Compositions from Single Images and Videos

1 code implementation 24 Mar 2021 Vadim Sushko, Dan Zhang, Juergen Gall, Anna Khoreva

We show that SIV-GAN successfully deals with a new challenging task of learning from a single video, for which prior GAN models fail to achieve synthesis of both high quality and diversity.

Image Generation

Temporal Action Segmentation from Timestamp Supervision

1 code implementation CVPR 2021 Zhe Li, Yazan Abu Farha, Juergen Gall

To demonstrate the effectiveness of timestamp supervision, we propose an approach to train a segmentation model using only timestamp annotations.

Action Segmentation Frame +1

Iterative Greedy Matching for 3D Human Pose Tracking from Multiple Views

1 code implementation 24 Jan 2021 Julian Tanke, Juergen Gall

In this work we propose an approach for estimating 3D human poses of multiple people from a set of calibrated cameras.

3D Human Pose Tracking Multi-Person Pose Estimation

Hierarchical Graph-RNNs for Action Detection of Multiple Activities

no code implementations 21 Jan 2021 Sovan Biswas, Yaser Souri, Juergen Gall

In this paper, we propose an approach that spatially localizes the activities in a video frame where each person can perform multiple activities at the same time.

Action Detection Frame

Discovering Multi-Label Actor-Action Association in a Weakly Supervised Setting

1 code implementation 21 Jan 2021 Sovan Biswas, Juergen Gall

Since computing the probabilities for the full power set becomes intractable as the number of action classes increases, we assign an action set to each detected person under the constraint that the assignment is consistent with the annotation of the video clip.

Action Detection Multi-Label Learning

Spatial-Temporal Consistency Network for Low-Latency Trajectory Forecasting

no code implementations ICCV 2021 Shijie Li, Yanying Zhou, Jinhui Yi, Juergen Gall

Trajectory forecasting is a crucial step for autonomous vehicles and mobile robots in order to navigate and interact safely.

Autonomous Vehicles Frame +1

You Only Need Adversarial Supervision for Semantic Image Synthesis

1 code implementation ICLR 2021 Vadim Sushko, Edgar Schönfeld, Dan Zhang, Juergen Gall, Bernt Schiele, Anna Khoreva

By providing stronger supervision to the discriminator as well as to the generator through spatially- and semantically-aware discriminator feedback, we are able to synthesize images of higher fidelity with better alignment to their input label maps, making the use of the perceptual loss superfluous.

Image-to-Image Translation Semantic Segmentation

3D CNNs with Adaptive Temporal Feature Resolutions

1 code implementation CVPR 2021 Mohsen Fayyaz, Emad Bahrami, Ali Diba, Mehdi Noroozi, Ehsan Adeli, Luc van Gool, Juergen Gall

While the GFLOPs of a 3D CNN can be decreased by reducing the temporal feature resolution within the network, there is no setting that is optimal for all input clips.

Action Recognition

PoseTrackReID: Dataset Description

no code implementations 12 Nov 2020 Andreas Doering, Di Chen, Shanshan Zhang, Bernt Schiele, Juergen Gall

For that reason, we present PoseTrackReID, a large-scale dataset for multi-person pose tracking and video-based person re-ID.

Frame Pose Tracking +1

Long-Term Anticipation of Activities with Cycle Consistency

no code implementations 2 Sep 2020 Yazan Abu Farha, Qiuhong Ke, Bernt Schiele, Juergen Gall

With the success of deep learning methods in analyzing activities in videos, more attention has recently been focused on anticipating future activities.

Multi-scale Interaction for Real-time LiDAR Data Segmentation on an Embedded Platform

2 code implementations 20 Aug 2020 Shijie Li, Xieyuanli Chen, Yun Liu, Dengxin Dai, Cyrill Stachniss, Juergen Gall

Real-time semantic segmentation of LiDAR data is crucial for autonomously driving vehicles, which are usually equipped with an embedded platform and have limited computational resources.

Autonomous Vehicles Real-Time 3D Semantic Segmentation +1

Audio- and Gaze-driven Facial Animation of Codec Avatars

no code implementations 11 Aug 2020 Alexander Richard, Colin Lea, Shugao Ma, Juergen Gall, Fernando de la Torre, Yaser Sheikh

Codec Avatars are a recent class of learned, photorealistic face models that accurately represent the geometry and texture of a person in 3D (i.e., for virtual reality), and are almost indistinguishable from video.

Frame

Rethinking 3D LiDAR Point Cloud Segmentation

1 code implementation 10 Aug 2020 Shijie Li, Yun Liu, Juergen Gall

Many point-based semantic segmentation methods have been designed for indoor scenarios, but they struggle if they are applied to point clouds that are captured by a LiDAR sensor in an outdoor environment.

Autonomous Driving Point Cloud Segmentation +1

MS-TCN++: Multi-Stage Temporal Convolutional Network for Action Segmentation

1 code implementation 16 Jun 2020 Shijie Li, Yazan Abu Farha, Yun Liu, Ming-Ming Cheng, Juergen Gall

Despite the capabilities of these approaches in capturing temporal dependencies, their predictions suffer from over-segmentation errors.

Action Segmentation

Adversarial Synthesis of Human Pose from Text

no code implementations 1 May 2020 Yifei Zhang, Rania Briq, Julian Tanke, Juergen Gall

This work focuses on synthesizing human poses from human-level text descriptions.

Self-supervised Keypoint Correspondences for Multi-Person Pose Estimation and Tracking in Videos

no code implementations ECCV 2020 Umer Rafi, Andreas Doering, Bastian Leibe, Juergen Gall

Instead of training the network for estimating keypoint correspondences on video data, it is trained on large-scale image datasets for human pose estimation using self-supervision.

Frame Multi-Person Pose Estimation +2

Discovering Latent Classes for Semi-Supervised Semantic Segmentation

no code implementations 30 Dec 2019 Olga Zatsarynna, Johann Sawatzky, Juergen Gall

On unlabeled images, we predict a probability map for latent classes and use it as a supervision signal to learn semantic segmentation.

Semi-Supervised Semantic Segmentation

Bonn Activity Maps: Dataset Description

no code implementations 13 Dec 2019 Julian Tanke, Oh-Hun Kwon, Patrick Stotko, Radu Alexandru Rosu, Michael Weinmann, Hassan Errami, Sven Behnke, Maren Bennewitz, Reinhard Klein, Andreas Weber, Angela Yao, Juergen Gall

The key prerequisite for accessing the huge potential of current machine learning techniques is the availability of large databases that capture the complex relations of interest.

Activity Recognition

Joint Viewpoint and Keypoint Estimation with Real and Synthetic Data

1 code implementation 13 Dec 2019 Pau Panareda Busto, Juergen Gall

The estimation of viewpoints and keypoints effectively enhances object detection methods by extracting valuable traits of the object instances.

Object Detection

Human Motion Anticipation with Symbolic Label

no code implementations 12 Dec 2019 Julian Tanke, Andreas Weber, Juergen Gall

We exploit this connection by first anticipating symbolic labels and then generating human motion, conditioned on the human motion input sequence as well as on the forecast labels.

Motion Forecasting

Cross-modal knowledge distillation for action recognition

no code implementations 10 Oct 2019 Fida Mohammad Thoker, Juergen Gall

To this end, we extract the knowledge of the trained teacher network for the source modality and transfer it to a small ensemble of student networks for the target modality.

Action Recognition Knowledge Distillation

Uncertainty-Aware Anticipation of Activities

no code implementations 26 Aug 2019 Yazan Abu Farha, Juergen Gall

Anticipating future activities in video is a task with many practical applications.

Open Set Domain Adaptation for Image and Action Recognition

1 code implementation 30 Jul 2019 Pau Panareda Busto, Ahsan Iqbal, Juergen Gall

Since this assumption is violated under real-world conditions, we propose an approach for open set domain adaptation where the target domain contains instances of categories that are not present in the source domain.

Action Recognition Domain Adaptation +1

Mining YouTube - A dataset for learning fine-grained action concepts from webly supervised video data

1 code implementation 3 Jun 2019 Hilde Kuehne, Ahsan Iqbal, Alexander Richard, Juergen Gall

Action recognition has so far mainly focused on the classification of hand-selected, pre-clipped actions and has reached impressive results in this field.

Action Recognition General Classification

Harvesting Information from Captions for Weakly Supervised Semantic Segmentation

no code implementations 16 May 2019 Johann Sawatzky, Debayan Banerjee, Juergen Gall

They do not require additional curation, as is the case for the clean class tags used by current weakly supervised approaches, and they provide textual context for the classes present in an image.

Image Captioning Weakly-Supervised Semantic Segmentation

3D Semantic Scene Completion from a Single Depth Image using Adversarial Training

no code implementations 15 May 2019 Yueh-Tung Chen, Martin Garbade, Juergen Gall

We address the task of 3D semantic scene completion, i.e., given a single depth image, we predict the semantic labels and occupancy of voxels in a 3D grid representing the scene.

3D Semantic Scene Completion

Unifying Part Detection and Association for Recurrent Multi-Person Pose Estimation

no code implementations 26 Apr 2019 Rania Briq, Andreas Doering, Juergen Gall

We propose a joint model of human joint detection and association for 2D multi-person pose estimation (MPPE).

Multi-Person Pose Estimation

Unsupervised learning of action classes with continuous temporal embedding

1 code implementation CVPR 2019 Anna Kukleva, Hilde Kuehne, Fadime Sener, Juergen Gall

The task of temporally detecting and segmenting actions in untrimmed videos has seen increased attention recently.

MS-TCN: Multi-Stage Temporal Convolutional Network for Action Segmentation

1 code implementation CVPR 2019 Yazan Abu Farha, Juergen Gall

Temporally locating and classifying action segments in long untrimmed videos is of particular interest to many applications like surveillance and robotics.

Action Segmentation Frame

Human Motion Prediction via Spatio-Temporal Inpainting

1 code implementation 13 Dec 2018 Alejandro Hernandez Ruiz, Juergen Gall, Francesc Moreno-Noguer

First, we represent the data using a spatio-temporal tensor of 3D skeleton coordinates which allows formulating the prediction problem as an inpainting one, for which GANs work particularly well.

Human motion prediction Motion Forecasting +1

Sequence Prediction using Spectral RNNs

2 code implementations 13 Dec 2018 Moritz Wolter, Juergen Gall, Angela Yao

Fourier methods have a long and proven track record as an excellent tool in data processing.

Time Series

Convolutional Simplex Projection Network (CSPN) for Weakly Supervised Semantic Segmentation

1 code implementation 24 Jul 2018 Rania Briq, Michael Moeller, Juergen Gall

Weakly supervised semantic segmentation has been a subject of increased interest due to the scarcity of fully annotated images.

Weakly-Supervised Semantic Segmentation

Are good local minima wide in sparse recovery?

no code implementations 21 Jun 2018 Michael Moeller, Otmar Loffeld, Juergen Gall, Felix Krahmer

The idea of compressed sensing is to exploit representations in suitable (overcomplete) dictionaries that allow signals to be recovered far beyond the Nyquist rate, provided that they admit a sparse representation in the respective dictionary.

Spatio-Temporal Channel Correlation Networks for Action Classification

no code implementations ECCV 2018 Ali Diba, Mohsen Fayyaz, Vivek Sharma, M. Mahdi Arzani, Rahman Yousefzadeh, Juergen Gall, Luc van Gool

Our experiments show that adding STC blocks to current state-of-the-art architectures outperforms the state-of-the-art methods on the HMDB51, UCF101 and Kinetics datasets.

Action Classification Classification +1

Joint Flow: Temporal Flow Fields for Multi Person Tracking

no code implementations 11 May 2018 Andreas Doering, Umar Iqbal, Juergen Gall

The general formulation of our temporal network allows any multi-person pose estimation approach to be used as the spatial network.

Frame Multi-Person Pose Estimation +1

Two Stream 3D Semantic Scene Completion

no code implementations 10 Apr 2018 Martin Garbade, Yueh-Tung Chen, Johann Sawatzky, Juergen Gall

In this work, we propose a two stream approach that leverages depth information and semantic information, which is inferred from the RGB image, for this task.

3D Semantic Scene Completion

Structural Recurrent Neural Network (SRNN) for Group Activity Analysis

no code implementations 6 Feb 2018 Sovan Biswas, Juergen Gall

In this paper, we propose a structural recurrent neural network (SRNN) that uses a series of interconnected RNNs to jointly capture the actions of individuals, their interactions, as well as the group activity.

Material Classification in the Wild: Do Synthesized Training Data Generalise Better than Real-World Training Data?

no code implementations 9 Nov 2017 Grigorios Kalliatakis, Anca Sticlaru, George Stamatiadis, Shoaib Ehsan, Ales Leonardis, Juergen Gall, Klaus D. McDonald-Maier

We question the dominant role of real-world training images in the field of material classification by investigating whether synthesized data can generalise more effectively than real-world data.

General Classification Material Classification

PoseTrack: A Benchmark for Human Pose Estimation and Tracking

2 code implementations CVPR 2018 Mykhaylo Andriluka, Umar Iqbal, Eldar Insafutdinov, Leonid Pishchulin, Anton Milan, Juergen Gall, Bernt Schiele

In this work, we aim to further advance the state of the art by establishing "PoseTrack", a new large-scale benchmark for video-based human pose estimation and articulated tracking, and bringing together the community of researchers working on visual human analysis.

Activity Recognition Frame +2

Open Set Domain Adaptation

no code implementations ICCV 2017 Pau Panareda Busto, Juergen Gall

The approach learns a mapping from the source to the target domain by jointly solving an assignment problem that labels those target instances that potentially belong to the categories of interest present in the source dataset.

Domain Adaptation

SurfaceNet: An End-to-end 3D Neural Network for Multiview Stereopsis

3 code implementations ICCV 2017 Mengqi Ji, Juergen Gall, Haitian Zheng, Yebin Liu, Lu Fang

It takes a set of images and their corresponding camera parameters as input and directly infers the 3D model.

Adaptive Binarization for Weakly Supervised Affordance Segmentation

no code implementations 10 Jul 2017 Johann Sawatzky, Juergen Gall

The concept of affordance is important to understand the relevance of object parts for a certain functional interaction.

Binarization

Weakly Supervised Affordance Detection

1 code implementation CVPR 2017 Johann Sawatzky, Abhilash Srikantha, Juergen Gall

Localizing functional regions of objects or affordances is an important aspect of scene understanding and relevant for many robotics applications.

Affordance Detection Scene Understanding

Recurrent Residual Learning for Action Recognition

no code implementations 27 Jun 2017 Ahsan Iqbal, Alexander Richard, Hilde Kuehne, Juergen Gall

In this work, we propose a novel recurrent ConvNet architecture called recurrent residual networks to address the task of action recognition.

Action Recognition Frame +1

A Dual-Source Approach for 3D Human Pose Estimation from a Single Image

no code implementations 8 May 2017 Umar Iqbal, Andreas Doering, Hashim Yasin, Björn Krüger, Andreas Weber, Juergen Gall

To this end, we first convert the motion capture data into a normalized 2D pose space, and separately learn a 2D pose estimation model from the image data.

Monocular 3D Human Pose Estimation Pose Retrieval

3D Object Reconstruction from Hand-Object Interactions

3 code implementations ICCV 2015 Dimitrios Tzionas, Juergen Gall

Recent advances have enabled 3D object reconstruction approaches using a single off-the-shelf RGB-D camera.

3D Object Reconstruction 3D Reconstruction

A Comparison of Directional Distances for Hand Pose Estimation

no code implementations 3 Apr 2017 Dimitrios Tzionas, Juergen Gall

Benchmarking methods for 3D hand tracking is still an open problem due to the difficulty of acquiring ground truth data.

Frame Hand Pose Estimation

Capturing Hand Motion with an RGB-D Sensor, Fusing a Generative Model with Salient Points

2 code implementations 3 Apr 2017 Dimitrios Tzionas, Abhilash Srikantha, Pablo Aponte, Juergen Gall

In this work, we propose a framework for hand tracking that can capture the motion of two interacting hands using only a single, inexpensive RGB-D camera.

Pose Tracking

A Bag-of-Words Equivalent Recurrent Neural Network for Action Recognition

1 code implementation 23 Mar 2017 Alexander Richard, Juergen Gall

In this work, we propose a recurrent neural network that is equivalent to the traditional bag-of-words approach but enables the application of discriminative training.

Action Recognition General Classification +1

Detection of Human Rights Violations in Images: Can Convolutional Neural Networks help?

no code implementations 12 Mar 2017 Grigorios Kalliatakis, Shoaib Ehsan, Maria Fasli, Ales Leonardis, Juergen Gall, Klaus D. McDonald-Maier

We conduct a rigorous evaluation on a common ground by combining this dataset with different state-of-the-art deep convolutional architectures in order to achieve recognition of human rights violations.

PoseTrack: Joint Multi-Person Pose Estimation and Tracking

1 code implementation CVPR 2017 Umar Iqbal, Anton Milan, Juergen Gall

In this work, we introduce the challenging problem of joint multi-person pose estimation and tracking of an unknown number of persons in unconstrained videos.

Multi-Person Pose Estimation Multi-Person Pose Estimation and Tracking +1

Weakly supervised learning of actions from transcripts

no code implementations 7 Oct 2016 Hilde Kuehne, Alexander Richard, Juergen Gall

Our system is based on the idea that, given a sequence of input data and a transcript, i.e., a list of the order in which the actions occur in the video, it is possible to infer the actions within the video stream and thus learn the related action models without the need for any frame-based annotation.

Frame

Reconstructing Articulated Rigged Models from RGB-D Videos

no code implementations 6 Sep 2016 Dimitrios Tzionas, Juergen Gall

Although commercial and open-source software exist to reconstruct a static object from a sequence recorded with an RGB-D sensor, there is a lack of tools that build rigged models of articulated objects that deform realistically and can be used for tracking or animation.

Motion Segmentation

Temporal Action Detection Using a Statistical Language Model

1 code implementation CVPR 2016 Alexander Richard, Juergen Gall

While current approaches to action recognition on pre-segmented video clips already achieve high accuracies, temporal action detection is still far from comparably good results.

Action Detection +2

Weakly Supervised Learning of Affordances

no code implementations 10 May 2016 Abhilash Srikantha, Juergen Gall

Localizing functional regions of objects or affordances is an important aspect of scene understanding.

Human-Object Interaction Detection Scene Understanding +1

Pose for Action - Action for Pose

no code implementations 13 Mar 2016 Umar Iqbal, Martin Garbade, Juergen Gall

In this work we propose to utilize information about human actions to improve pose estimation in monocular videos.

Action Recognition Pose Estimation +1

Cooking in the kitchen: Recognizing and Segmenting Human Activities in Videos

no code implementations 25 Aug 2015 Hilde Kuehne, Juergen Gall, Thomas Serre

Through extensive system evaluations, we demonstrate that combining compact video representations based on Fisher Vectors with HMM-based modeling yields very significant gains in accuracy and that, when properly trained with sufficient training samples, structured temporal models outperform unstructured bag-of-words models by a large margin on the tested performance metric.

Action Recognition

Capturing Hands in Action using Discriminative Salient Points and Physics Simulation

2 code implementations 6 Jun 2015 Dimitrios Tzionas, Luca Ballan, Abhilash Srikantha, Pablo Aponte, Marc Pollefeys, Juergen Gall

Hand motion capture is a popular research field, recently gaining more attention due to the ubiquity of RGB-D sensors.

From Categories to Subcategories: Large-Scale Image Classification With Partial Class Label Refinement

no code implementations CVPR 2015 Marko Ristin, Juergen Gall, Matthieu Guillaumin, Luc van Gool

Compared to approaches that disregard the extra coarse labeled data, we achieve a relative improvement in subcategory classification accuracy of up to 22% in our large-scale image classification experiments.

Classification General Classification +1

Human Pose Estimation Using Body Parts Dependent Joint Regressors

no code implementations CVPR 2013 Matthias Dantone, Juergen Gall, Christian Leistner, Luc van Gool

The second layer takes the estimated class distributions of the first one into account and is thereby able to predict joint locations by modeling the interdependence and co-occurrence of the parts.

Pose Estimation

Learning Probabilistic Non-Linear Latent Variable Models for Tracking Complex Activities

no code implementations NeurIPS 2011 Angela Yao, Juergen Gall, Luc V. Gool, Raquel Urtasun

A common approach for handling the complexity and inherent ambiguities of 3D human pose estimation is to use pose priors learned from training data.

3D Human Pose Estimation
