Search Results for author: Juergen Gall

Found 112 papers, 49 papers with code

A Multimodal Handover Failure Detection Dataset and Baselines

no code implementations • 28 Feb 2024 • Santosh Thoduka, Nico Hochgeschwender, Juergen Gall, Paul G. Plöger

To address this deficit, we present the multimodal Handover Failure Detection dataset, which consists of failures induced by the human participant, such as ignoring the robot or not releasing the object.

Action Segmentation Object +1

Wavelet Packet Power Spectrum Kullback-Leibler Divergence: A New Metric for Image Synthesis

no code implementations • 23 Dec 2023 • Lokesh Veeramacheneni, Moritz Wolter, Juergen Gall

In response, we propose a new frequency band-based quality metric, which opens a door into the frequency domain yet, at the same time, preserves spatial aspects of the data.

Image Generation

VaLID: Variable-Length Input Diffusion for Novel View Synthesis

no code implementations • 14 Dec 2023 • Shijie Li, Farhad G. Zanjani, Haitam Ben Yahia, Yuki M. Asano, Juergen Gall, Amirhossein Habibian

This is because the source-view images and corresponding poses are processed separately and injected into the model at different stages.

Image Generation Novel View Synthesis +1

DiffAnt: Diffusion Models for Action Anticipation

no code implementations • 27 Nov 2023 • Zeyun Zhong, Chengzhi Wu, Manuel Martin, Michael Voit, Juergen Gall, Jürgen Beyerer

However, the majority of existing action anticipation models adhere to a deterministic approach, neglecting to account for future uncertainties.

Action Anticipation

A Survey on Deep Learning Techniques for Action Anticipation

no code implementations • 29 Sep 2023 • Zeyun Zhong, Manuel Martin, Michael Voit, Juergen Gall, Jürgen Beyerer

The ability to anticipate possible future human actions is essential for a wide range of applications, including autonomous driving and human-robot interaction.

Action Anticipation Autonomous Driving

TFNet: Exploiting Temporal Cues for Fast and Accurate LiDAR Semantic Segmentation

no code implementations • 14 Sep 2023 • Rong Li, Shijie Li, Xieyuanli Chen, Teli Ma, Juergen Gall, Junwei Liang

In this paper, we present TFNet, a range-image-based LiDAR semantic segmentation method that utilizes temporal information to address this issue.

Ranked #1 on Semantic Segmentation on SemanticPOSS

Autonomous Driving LIDAR Semantic Segmentation +1

Semantic RGB-D Image Synthesis

no code implementations • 22 Aug 2023 • Shijie Li, Rong Li, Juergen Gall

In this paper, we therefore propose a generator for multi-modal data that separates modal-independent information of the semantic layout from the modal-dependent information that is needed to generate an RGB and a depth image, respectively.

Image Generation Image Segmentation +2

How Much Temporal Long-Term Context is Needed for Action Segmentation?

1 code implementation • ICCV 2023 • Emad Bahrami, Gianpiero Francesca, Juergen Gall

In this work, we try to answer how much long-term temporal context is required for temporal action segmentation by introducing a transformer-based model that leverages sparse attention to capture the full context of a video.

Ranked #1 on Action Segmentation on Assembly101

Action Segmentation Segmentation

Smoothness Similarity Regularization for Few-Shot GAN Adaptation

no code implementations • ICCV 2023 • Vadim Sushko, Ruyu Wang, Juergen Gall

The task of few-shot GAN adaptation aims to adapt a pre-trained GAN model to a small dataset with very few training images.

Memorization

3DMOTFormer: Graph Transformer for Online 3D Multi-Object Tracking

1 code implementation • ICCV 2023 • Shuxiao Ding, Eike Rehder, Lukas Schneider, Marius Cordts, Juergen Gall

Tracking 3D objects accurately and consistently is crucial for autonomous vehicles, enabling more reliable downstream tasks such as trajectory prediction and motion planning.

3D Multi-Object Tracking Autonomous Vehicles +6

Action Anticipation with Goal Consistency

1 code implementation • 26 Jun 2023 • Olga Zatsarynna, Juergen Gall

In this paper, we address the problem of short-term action anticipation, i.e., we want to predict an upcoming action one second before it happens.

Ranked #1 on Action Anticipation on Assembly101

Action Anticipation

PowerBEV: A Powerful Yet Lightweight Framework for Instance Prediction in Bird's-Eye View

1 code implementation • 19 Jun 2023 • Peizheng Li, Shuxiao Ding, Xieyuanli Chen, Niklas Hanselmann, Marius Cordts, Juergen Gall

Accurately perceiving instances and predicting their future motion are key tasks for autonomous vehicles, enabling them to navigate safely in complex urban traffic.

Autonomous Driving motion prediction +1

A Gated Attention Transformer for Multi-Person Pose Tracking

no code implementations • 9 Jun 2023 • Andreas Doering, Juergen Gall

Multi-person pose tracking is an important element for many applications and requires estimating the human poses of all persons in a video and tracking them over time.

Pose Tracking

Social Diffusion: Long-term Multiple Human Motion Anticipation

1 code implementation • ICCV 2023 • Julian Tanke, Linguang Zhang, Amy Zhao, Chengcheng Tang, Yujun Cai, Lezi Wang, Po-Chen Wu, Juergen Gall, Cem Keskin

We propose Social Diffusion, a novel method for short-term and long-term forecasting of the motion of multiple persons as well as their social interactions.

Location-aware Adaptive Normalization: A Deep Learning Approach For Wildfire Danger Forecasting

1 code implementation • 16 Dec 2022 • Mohamad Hakam Shams Eddin, Ribana Roscher, Juergen Gall

Climate change is expected to intensify and increase extreme events in the weather cycle.

Robust Action Segmentation from Timestamp Supervision

no code implementations • 12 Oct 2022 • Yaser Souri, Yazan Abu Farha, Emad Bahrami, Gianpiero Francesca, Juergen Gall

As obtaining annotations to train an approach for action segmentation in a fully supervised way is expensive, various approaches have been proposed to train action segmentation models using different forms of weak supervision, e.g., action transcripts, action sets, or more recently timestamps.

Action Segmentation Segmentation

Dual Pyramid Generative Adversarial Networks for Semantic Image Synthesis

1 code implementation • 8 Oct 2022 • Shijie Li, Ming-Ming Cheng, Juergen Gall

The goal of semantic image synthesis is to generate photo-realistic images from semantic label maps.

Ranked #1 on Image-to-Image Translation on ADE20K-Outdoor Labels-to-Photos

Generative Adversarial Network Image-to-Image Translation

Self-supervised Learning for Unintentional Action Prediction

no code implementations • 24 Sep 2022 • Olga Zatsarynna, Yazan Abu Farha, Juergen Gall

Distinguishing if an action is performed as intended or if an intended action fails is an important skill that not only humans have, but that is also important for intelligent systems that operate in human environments.

Action Classification Representation Learning +1

One-Shot Synthesis of Images and Segmentation Masks

1 code implementation • 15 Sep 2022 • Vadim Sushko, Dan Zhang, Juergen Gall, Anna Khoreva

To this end, inspired by the recent architectural developments of single-image GANs, we introduce our OSMIS model which enables the synthesis of segmentation masks that are precisely aligned to the generated images in the one-shot regime.

Data Augmentation Image Generation +2

Unified Fully and Timestamp Supervised Temporal Action Segmentation via Sequence to Sequence Translation

2 code implementations • 1 Sep 2022 • Nadine Behrmann, S. Alireza Golestaneh, Zico Kolter, Juergen Gall, Mehdi Noroozi

This paper introduces a unified framework for video action segmentation via sequence to sequence (seq2seq) translation in a fully and timestamp supervised setup.

Ranked #4 on Action Segmentation on Assembly101

Action Segmentation Translation

Recurrent Transformer Variational Autoencoders for Multi-Action Motion Synthesis

no code implementations • 14 Jun 2022 • Rania Briq, Chuhang Zou, Leonid Pishchulin, Chris Broaddus, Juergen Gall

We consider the problem of synthesizing multi-action human motion sequences of arbitrary lengths.

Motion Synthesis

Ranking Info Noise Contrastive Estimation: Boosting Contrastive Learning via Ranked Positives

1 code implementation • 27 Jan 2022 • David T. Hoffmann, Nadine Behrmann, Juergen Gall, Thomas Brox, Mehdi Noroozi

This paper introduces Ranking Info Noise Contrastive Estimation (RINCE), a new member in the family of InfoNCE losses that preserves a ranked ordering of positive samples.

Contrastive Learning Out-of-Distribution Detection +2

Intention-based Long-Term Human Motion Anticipation

1 code implementation • 3DV 2021 • Julian Tanke, Chintan Zaveri, Juergen Gall

Recently, a few works have been proposed to model the uncertainty of the future human motion.

Ranked #10 on Human Pose Forecasting on Human3.6M

Human Pose Forecasting

Adaptive Token Sampling For Efficient Vision Transformers

1 code implementation • 30 Nov 2021 • Mohsen Fayyaz, Soroush Abbasi Koohpayegani, Farnoush Rezaei Jafari, Sunando Sengupta, Hamid Reza Vaezi Joze, Eric Sommerlade, Hamed Pirsiavash, Juergen Gall

Since ATS is a parameter-free module, it can be added to the off-the-shelf pre-trained vision transformers as a plug and play module, thus reducing their GFLOPs without any additional training.

Ranked #13 on Efficient ViTs on ImageNet-1K (with DeiT-S)

Efficient ViTs Video Classification

Keypoint Message Passing for Video-based Person Re-Identification

1 code implementation • 16 Nov 2021 • Di Chen, Andreas Doering, Shanshan Zhang, Jian Yang, Juergen Gall, Bernt Schiele

Video-based person re-identification (re-ID) is an important technique in visual surveillance systems which aims to match video snippets of people captured by different cameras.

Representation Learning Video-Based Person Re-Identification

TaylorSwiftNet: Taylor Driven Temporal Modeling for Swift Future Frame Prediction

no code implementations • 27 Oct 2021 • Saber Pourheydari, Emad Bahrami, Mohsen Fayyaz, Gianpiero Francesca, Mehdi Noroozi, Juergen Gall

While recurrent neural networks (RNNs) demonstrate outstanding capabilities for future video frame prediction, they model dynamics in a discrete time space, i.e., they predict the frames sequentially with a fixed temporal step.

Long Short View Feature Decomposition via Contrastive Video Representation Learning

no code implementations • ICCV 2021 • Nadine Behrmann, Mohsen Fayyaz, Juergen Gall, Mehdi Noroozi

We argue that a single representation to capture both types of features is sub-optimal, and propose to decompose the representation space into stationary and non-stationary features via contrastive learning from long and short views, i.e., long video sequences and their shorter sub-sequences.

Action Recognition Action Segmentation +2

FIFA: Fast Inference Approximation for Action Segmentation

no code implementations • 9 Aug 2021 • Yaser Souri, Yazan Abu Farha, Fabien Despinoy, Gianpiero Francesca, Juergen Gall

We apply FIFA on top of state-of-the-art approaches for weakly supervised action segmentation and alignment as well as fully supervised action segmentation.

Ranked #1 on Weakly Supervised Action Segmentation (Transcript) on Breakfast

Segmentation Weakly Supervised Action Segmentation (Transcript)

Using Visual Anomaly Detection for Task Execution Monitoring

1 code implementation • 29 Jul 2021 • Santosh Thoduka, Juergen Gall, Paul G. Plöger

Our method learns to predict the motions that occur during the nominal execution of a task, including camera and robot body motion.

Anomaly Detection Optical Flow Estimation

Multi-Modal Temporal Convolutional Network for Anticipating Actions in Egocentric Videos

no code implementations • 18 Jul 2021 • Olga Zatsarynna, Yazan Abu Farha, Juergen Gall

This poses a problem for domains such as autonomous driving, where the reaction time is crucial.

Ranked #8 on Action Anticipation on EPIC-KITCHENS-100 (test)

Action Anticipation Autonomous Driving +1

Towards Better Adversarial Synthesis of Human Images from Text

no code implementations • 5 Jul 2021 • Rania Briq, Pratika Kochar, Juergen Gall

This paper proposes an approach that generates multiple 3D human meshes from text.

Image Generation

Learning to Generate Novel Scene Compositions from Single Images and Videos

1 code implementation • 12 May 2021 • Vadim Sushko, Juergen Gall, Anna Khoreva

Training GANs in low-data regimes remains a challenge, as overfitting often leads to memorization or training divergence.

Memorization

Generating Novel Scene Compositions from Single Images and Videos

1 code implementation • 24 Mar 2021 • Vadim Sushko, Dan Zhang, Juergen Gall, Anna Khoreva

In this work, we introduce SIV-GAN, an unconditional generative model that can generate new scene compositions from a single training image or a single video clip.

Image Generation Memorization

Temporal Action Segmentation from Timestamp Supervision

1 code implementation • CVPR 2021 • Zhe Li, Yazan Abu Farha, Juergen Gall

To demonstrate the effectiveness of timestamp supervision, we propose an approach to train a segmentation model using only timestamp annotations.

Ranked #4 on Weakly Supervised Action Localization on GTEA

Action Segmentation Segmentation +1

Iterative Greedy Matching for 3D Human Pose Tracking from Multiple Views

1 code implementation • 24 Jan 2021 • Julian Tanke, Juergen Gall

In this work we propose an approach for estimating 3D human poses of multiple people from a set of calibrated cameras.

3D Human Pose Tracking Multi-Person Pose Estimation

Discovering Multi-Label Actor-Action Association in a Weakly Supervised Setting

1 code implementation • 21 Jan 2021 • Sovan Biswas, Juergen Gall

Since computing the probabilities for the full power set becomes intractable as the number of action classes increases, we assign an action set to each detected person under the constraint that the assignment is consistent with the annotation of the video clip.

Action Detection Multi-Label Learning

Hierarchical Graph-RNNs for Action Detection of Multiple Activities

no code implementations • 21 Jan 2021 • Sovan Biswas, Yaser Souri, Juergen Gall

In this paper, we propose an approach that spatially localizes the activities in a video frame where each person can perform multiple activities at the same time.

Action Detection

Spatial-Temporal Consistency Network for Low-Latency Trajectory Forecasting

no code implementations • ICCV 2021 • Shijie Li, Yanying Zhou, Jinhui Yi, Juergen Gall

Trajectory forecasting is a crucial step for autonomous vehicles and mobile robots in order to navigate and interact safely.

Autonomous Vehicles Navigate +1

You Only Need Adversarial Supervision for Semantic Image Synthesis

1 code implementation • ICLR 2021 • Vadim Sushko, Edgar Schönfeld, Dan Zhang, Juergen Gall, Bernt Schiele, Anna Khoreva

By providing stronger supervision to the discriminator as well as to the generator through spatially- and semantically-aware discriminator feedback, we are able to synthesize images of higher fidelity with better alignment to their input label maps, making the use of the perceptual loss superfluous.

Ranked #1 on Image-to-Image Translation on ADE20K-Outdoor Labels-to-Photos

Image-to-Image Translation Semantic Segmentation

3D CNNs with Adaptive Temporal Feature Resolutions

1 code implementation • CVPR 2021 • Mohsen Fayyaz, Emad Bahrami, Ali Diba, Mehdi Noroozi, Ehsan Adeli, Luc van Gool, Juergen Gall

While the GFLOPs of a 3D CNN can be decreased by reducing the temporal feature resolution within the network, there is no setting that is optimal for all input clips.

Action Recognition

PoseTrackReID: Dataset Description

no code implementations • 12 Nov 2020 • Andreas Doering, Di Chen, Shanshan Zhang, Bernt Schiele, Juergen Gall

For that reason, we present PoseTrackReID, a large-scale dataset for multi-person pose tracking and video-based person re-ID.

Pose Tracking Video-Based Person Re-Identification

Unsupervised Video Representation Learning by Bidirectional Feature Prediction

no code implementations • 11 Nov 2020 • Nadine Behrmann, Juergen Gall, Mehdi Noroozi

This paper introduces a novel method for self-supervised video representation learning via feature prediction.

Action Recognition Contrastive Learning +1

Pose Refinement Graph Convolutional Network for Skeleton-based Action Recognition

no code implementations • 14 Oct 2020 • Shijie Li, Jinhui Yi, Yazan Abu Farha, Juergen Gall

To this end, the network first refines the poses before they are further processed to recognize the action.

Ranked #24 on Skeleton Based Action Recognition on Kinetics-Skeleton dataset

Action Recognition Skeleton Based Action Recognition

Long-Term Anticipation of Activities with Cycle Consistency

no code implementations • 2 Sep 2020 • Yazan Abu Farha, Qiuhong Ke, Bernt Schiele, Juergen Gall

With the success of deep learning methods in analyzing activities in videos, more attention has recently been focused towards anticipating future activities.

Long Term Anticipation

Multi-scale Interaction for Real-time LiDAR Data Segmentation on an Embedded Platform

2 code implementations • 20 Aug 2020 • Shijie Li, Xieyuanli Chen, Yun Liu, Dengxin Dai, Cyrill Stachniss, Juergen Gall

Real-time semantic segmentation of LiDAR data is crucial for autonomously driving vehicles, which are usually equipped with an embedded platform and have limited computational resources.

Ranked #2 on Real-Time 3D Semantic Segmentation on SemanticKITTI

Autonomous Vehicles Real-Time 3D Semantic Segmentation +1

Audio- and Gaze-driven Facial Animation of Codec Avatars

no code implementations • 11 Aug 2020 • Alexander Richard, Colin Lea, Shugao Ma, Juergen Gall, Fernando de la Torre, Yaser Sheikh

Codec Avatars are a recent class of learned, photorealistic face models that accurately represent the geometry and texture of a person in 3D (i.e., for virtual reality), and are almost indistinguishable from video.

Rethinking 3D LiDAR Point Cloud Segmentation

1 code implementation • 10 Aug 2020 • Shijie Li, Yun Liu, Juergen Gall

Many point-based semantic segmentation methods have been designed for indoor scenarios, but they struggle if they are applied to point clouds that are captured by a LiDAR sensor in an outdoor environment.

Autonomous Driving Point Cloud Segmentation +2

MS-TCN++: Multi-Stage Temporal Convolutional Network for Action Segmentation

1 code implementation • 16 Jun 2020 • Shijie Li, Yazan Abu Farha, Yun Liu, Ming-Ming Cheng, Juergen Gall

Despite the capabilities of these approaches in capturing temporal dependencies, their predictions suffer from over-segmentation errors.

Ranked #5 on Action Segmentation on Assembly101

Action Segmentation Segmentation

On Evaluating Weakly Supervised Action Segmentation Methods

no code implementations • 19 May 2020 • Yaser Souri, Alexander Richard, Luca Minciullo, Juergen Gall

Action segmentation is the task of temporally segmenting every frame of an untrimmed video.

Action Segmentation Segmentation

Adversarial Synthesis of Human Pose from Text

no code implementations • 1 May 2020 • Yifei Zhang, Rania Briq, Julian Tanke, Juergen Gall

This work focuses on synthesizing human poses from human-level text descriptions.

Generative Adversarial Network

Self-supervised Keypoint Correspondences for Multi-Person Pose Estimation and Tracking in Videos

no code implementations • ECCV 2020 • Umer Rafi, Andreas Doering, Bastian Leibe, Juergen Gall

Instead of training the network for estimating keypoint correspondences on video data, it is trained on large-scale image datasets for human pose estimation using self-supervision.

Multi-Person Pose Estimation Multi-Person Pose Estimation and Tracking +1

SCT: Set Constrained Temporal Transformer for Set Supervised Action Segmentation

1 code implementation • CVPR 2020 • Mohsen Fayyaz, Juergen Gall

In addition, the network estimates the action labels for each frame.

Action Segmentation

Discovering Latent Classes for Semi-Supervised Semantic Segmentation

no code implementations • 30 Dec 2019 • Olga Zatsarynna, Johann Sawatzky, Juergen Gall

On unlabeled images, we predict a probability map for latent classes and use it as a supervision signal to learn semantic segmentation.

Segmentation Semi-Supervised Semantic Segmentation

Bonn Activity Maps: Dataset Description

no code implementations • 13 Dec 2019 • Julian Tanke, Oh-Hun Kwon, Patrick Stotko, Radu Alexandru Rosu, Michael Weinmann, Hassan Errami, Sven Behnke, Maren Bennewitz, Reinhard Klein, Andreas Weber, Angela Yao, Juergen Gall

The key prerequisite for accessing the huge potential of current machine learning techniques is the availability of large databases that capture the complex relations of interest.

Activity Recognition

Joint Viewpoint and Keypoint Estimation with Real and Synthetic Data

1 code implementation • 13 Dec 2019 • Pau Panareda Busto, Juergen Gall

The estimation of viewpoints and keypoints effectively enhances object detection methods by extracting valuable traits of the object instances.

Keypoint Estimation Object +2

Human Motion Anticipation with Symbolic Label

no code implementations • 12 Dec 2019 • Julian Tanke, Andreas Weber, Juergen Gall

We exploit this connection by first anticipating symbolic labels and then generate human motion, conditioned on the human motion input sequence as well as on the forecast labels.

Motion Forecasting

Cross-modal knowledge distillation for action recognition

no code implementations • 10 Oct 2019 • Fida Mohammad Thoker, Juergen Gall

To this end, we extract the knowledge of the trained teacher network for the source modality and transfer it to a small ensemble of student networks for the target modality.

Action Recognition Knowledge Distillation

Uncertainty-Aware Anticipation of Activities

no code implementations • 26 Aug 2019 • Yazan Abu Farha, Juergen Gall

Anticipating future activities in video is a task with many practical applications.

Open Set Domain Adaptation for Image and Action Recognition

1 code implementation • 30 Jul 2019 • Pau Panareda Busto, Ahsan Iqbal, Juergen Gall

Since this assumption is violated under real-world conditions, we propose an approach for open set domain adaptation where the target domain contains instances of categories that are not present in the source domain.

Action Recognition Domain Adaptation +2

A Hybrid RNN-HMM Approach for Weakly Supervised Temporal Action Segmentation

no code implementations • 3 Jun 2019 • Hilde Kuehne, Alexander Richard, Juergen Gall

Action recognition has become a rapidly developing research field within the last decade.

Action Recognition Action Segmentation +1

Mining YouTube - A dataset for learning fine-grained action concepts from webly supervised video data

1 code implementation • 3 Jun 2019 • Hilde Kuehne, Ahsan Iqbal, Alexander Richard, Juergen Gall

Action recognition is so far mainly focusing on the problem of classification of hand selected preclipped actions and reaching impressive results in this field.

Action Recognition General Classification +1

Harvesting Information from Captions for Weakly Supervised Semantic Segmentation

no code implementations • 16 May 2019 • Johann Sawatzky, Debayan Banerjee, Juergen Gall

They do not require additional curation as it is the case for the clean class tags used by current weakly supervised approaches and they provide textual context for the classes present in an image.

Image Captioning Image Segmentation +3

3D Semantic Scene Completion from a Single Depth Image using Adversarial Training

no code implementations • 15 May 2019 • Yueh-Tung Chen, Martin Garbade, Juergen Gall

We address the task of 3D semantic scene completion, i.e., given a single depth image, we predict the semantic labels and occupancy of voxels in a 3D grid representing the scene.

3D Semantic Scene Completion

Unifying Part Detection and Association for Recurrent Multi-Person Pose Estimation

no code implementations • 26 Apr 2019 • Rania Briq, Andreas Doering, Juergen Gall

We propose a joint model of human joint detection and association for 2D multi-person pose estimation (MPPE).

Multi-Person Pose Estimation

Unsupervised learning of action classes with continuous temporal embedding

2 code implementations • CVPR 2019 • Anna Kukleva, Hilde Kuehne, Fadime Sener, Juergen Gall

The task of temporally detecting and segmenting actions in untrimmed videos has seen an increased attention recently.

What Object Should I Use? - Task Driven Object Detection

1 code implementation • CVPR 2019 • Johann Sawatzky, Yaser Souri, Christian Grund, Juergen Gall

When humans have to solve everyday tasks, they simply pick the objects that are most suitable.

Object object-detection +1

Fast Weakly Supervised Action Segmentation Using Mutual Consistency

1 code implementation • 5 Apr 2019 • Yaser Souri, Mohsen Fayyaz, Luca Minciullo, Gianpiero Francesca, Juergen Gall

Action segmentation is the task of predicting the actions for each frame of a video.

Ranked #4 on Weakly Supervised Action Segmentation (Transcript) on Breakfast

Segmentation Weakly Supervised Action Segmentation (Transcript)

SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences

5 code implementations • ICCV 2019 • Jens Behley, Martin Garbade, Andres Milioto, Jan Quenzel, Sven Behnke, Cyrill Stachniss, Juergen Gall

Despite the relevance of semantic scene understanding for this application, there is a lack of a large dataset for this task which is based on an automotive LiDAR.

Ranked #32 on 3D Semantic Segmentation on SemanticKITTI

3D Semantic Segmentation Scene Understanding +2

MS-TCN: Multi-Stage Temporal Convolutional Network for Action Segmentation

2 code implementations • CVPR 2019 • Yazan Abu Farha, Juergen Gall

Temporally locating and classifying action segments in long untrimmed videos is of particular interest to many applications like surveillance and robotics.

Ranked #20 on Action Segmentation on GTEA

Action Segmentation Segmentation

Sequence Prediction using Spectral RNNs

2 code implementations • 13 Dec 2018 • Moritz Wolter, Juergen Gall, Angela Yao

Fourier methods have a long and proven track record as an excellent tool in data processing.

Time Series Time Series Analysis

Human Motion Prediction via Spatio-Temporal Inpainting

1 code implementation • 13 Dec 2018 • Alejandro Hernandez Ruiz, Juergen Gall, Francesc Moreno-Noguer

First, we represent the data using a spatio-temporal tensor of 3D skeleton coordinates which allows formulating the prediction problem as an inpainting one, for which GANs work particularly well.

Generative Adversarial Network Human motion prediction +3

Convolutional Simplex Projection Network (CSPN) for Weakly Supervised Semantic Segmentation

1 code implementation • 24 Jul 2018 • Rania Briq, Michael Moeller, Juergen Gall

Weakly supervised semantic segmentation has been a subject of increased interest due to the scarcity of fully annotated images.

Segmentation Weakly supervised Semantic Segmentation +1

Are good local minima wide in sparse recovery?

no code implementations • 21 Jun 2018 • Michael Moeller, Otmar Loffeld, Juergen Gall, Felix Krahmer

The idea of compressed sensing is to exploit representations in suitable (overcomplete) dictionaries that allow to recover signals far beyond the Nyquist rate provided that they admit a sparse representation in the respective dictionary.

Spatio-Temporal Channel Correlation Networks for Action Classification

no code implementations • ECCV 2018 • Ali Diba, Mohsen Fayyaz, Vivek Sharma, M. Mahdi Arzani, Rahman Yousefzadeh, Juergen Gall, Luc van Gool

Our experiments show that adding STC blocks to current state-of-the-art architectures outperforms the state-of-the-art methods on the HMDB51, UCF101 and Kinetics datasets.

Action Classification Classification +1

NeuralNetwork-Viterbi: A Framework for Weakly Supervised Video Learning

no code implementations • CVPR 2018 • Alexander Richard, Hilde Kuehne, Ahsan Iqbal, Juergen Gall

Video learning is an important task in computer vision and has experienced increasing interest over the recent years.

Ranked #6 on Weakly Supervised Action Segmentation (Transcript) on Breakfast

Incremental Learning Segmentation +3

Joint Flow: Temporal Flow Fields for Multi Person Tracking

no code implementations • 11 May 2018 • Andreas Doering, Umar Iqbal, Juergen Gall

The general formulation of our temporal network allows to rely on any multi person pose estimation approach as spatial network.

Multi-Person Pose Estimation Pose Tracking

Hand Pose Estimation via Latent 2.5D Heatmap Regression

no code implementations • ECCV 2018 • Umar Iqbal, Pavlo Molchanov, Thomas Breuel, Juergen Gall, Jan Kautz

Estimating the 3D pose of a hand is an essential part of human-computer interaction.

3D Hand Pose Estimation regression

Two Stream 3D Semantic Scene Completion

no code implementations • 10 Apr 2018 • Martin Garbade, Yueh-Tung Chen, Johann Sawatzky, Juergen Gall

In this work, we propose a two stream approach that leverages depth information and semantic information, which is inferred from the RGB image, for this task.

Ranked #7 on 3D Semantic Scene Completion on SemanticKITTI

3D Semantic Scene Completion Vocal Bursts Valence Prediction

When will you do what? - Anticipating Temporal Occurrences of Activities

1 code implementation • CVPR 2018 • Yazan Abu Farha, Alexander Richard, Juergen Gall

Analyzing human actions in videos has gained increased attention recently.

Structural Recurrent Neural Network (SRNN) for Group Activity Analysis

no code implementations • 6 Feb 2018 • Sovan Biswas, Juergen Gall

In this paper, we propose a structural recurrent neural network (SRNN) that uses a series of interconnected RNNs to jointly capture the actions of individuals, their interactions, as well as the group activity.

Material Classification in the Wild: Do Synthesized Training Data Generalise Better than Real-World Training Data?

no code implementations • 9 Nov 2017 • Grigorios Kalliatakis, Anca Sticlaru, George Stamatiadis, Shoaib Ehsan, Ales Leonardis, Juergen Gall, Klaus D. McDonald-Maier

We question the dominant role of real-world training images in the field of material classification by investigating whether synthesized data can generalise more effectively than real-world data.

General Classification Material Classification

PoseTrack: A Benchmark for Human Pose Estimation and Tracking

2 code implementations • CVPR 2018 • Mykhaylo Andriluka, Umar Iqbal, Eldar Insafutdinov, Leonid Pishchulin, Anton Milan, Juergen Gall, Bernt Schiele

In this work, we aim to further advance the state of the art by establishing "PoseTrack", a new large-scale benchmark for video-based human pose estimation and articulated tracking, and bringing together the community of researchers working on visual human analysis.

Ranked #3 on Multi-Person Pose Estimation on PoseTrack2017

Activity Recognition Multi-Person Pose Estimation +2

5,009

Open Set Domain Adaptation

no code implementations • ICCV 2017 • Pau Panareda Busto, Juergen Gall

The approach learns a mapping from the source to the target domain by jointly solving an assignment problem that labels those target instances that potentially belong to the categories of interest present in the source dataset.

Domain Adaptation

SurfaceNet: An End-to-end 3D Neural Network for Multiview Stereopsis

3 code implementations • ICCV 2017 • Mengqi Ji, Juergen Gall, Haitian Zheng, Yebin Liu, Lu Fang

It takes a set of images and their corresponding camera parameters as input and directly infers the 3D model.

122

Adaptive Binarization for Weakly Supervised Affordance Segmentation

no code implementations • 10 Jul 2017 • Johann Sawatzky, Juergen Gall

The concept of affordance is important to understand the relevance of object parts for a certain functional interaction.

Binarization Object +1

Weakly Supervised Affordance Detection

1 code implementation • CVPR 2017 • Johann Sawatzky, Abhilash Srikantha, Juergen Gall

Localizing functional regions of objects or affordances is an important aspect of scene understanding and relevant for many robotics applications.

Affordance Detection Object +1

Recurrent Residual Learning for Action Recognition

no code implementations • 27 Jun 2017 • Ahsan Iqbal, Alexander Richard, Hilde Kuehne, Juergen Gall

In this work, we propose a novel recurrent ConvNet architecture called recurrent residual networks to address the task of action recognition.

Action Recognition Image Classification +1

Action Sets: Weakly Supervised Action Segmentation without Ordering Constraints

1 code implementation • CVPR 2018 • Alexander Richard, Hilde Kuehne, Juergen Gall

Action detection and temporal segmentation of actions in videos are topics of increasing interest.

Action Detection Action Segmentation

A Dual-Source Approach for 3D Human Pose Estimation from a Single Image

no code implementations • 8 May 2017 • Umar Iqbal, Andreas Doering, Hashim Yasin, Björn Krüger, Andreas Weber, Juergen Gall

To this end, we first convert the motion capture data into a normalized 2D pose space, and separately learn a 2D pose estimation model from the image data.

Ranked #37 on Monocular 3D Human Pose Estimation on Human3.6M

2D Pose Estimation Monocular 3D Human Pose Estimation +2

A Comparison of Directional Distances for Hand Pose Estimation

no code implementations • 3 Apr 2017 • Dimitrios Tzionas, Juergen Gall

Benchmarking methods for 3D hand tracking is still an open problem due to the difficulty of acquiring ground truth data.

Benchmarking Hand Pose Estimation

3D Object Reconstruction from Hand-Object Interactions

3 code implementations • ICCV 2015 • Dimitrios Tzionas, Juergen Gall

Recent advances have enabled 3D object reconstruction approaches using a single off-the-shelf RGB-D camera.

3D Object Reconstruction 3D Reconstruction +1

Capturing Hand Motion with an RGB-D Sensor, Fusing a Generative Model with Salient Points

2 code implementations • 3 Apr 2017 • Dimitrios Tzionas, Abhilash Srikantha, Pablo Aponte, Juergen Gall

In this work, we propose a framework for hand tracking that can capture the motion of two interacting hands using only a single, inexpensive RGB-D camera.

Pose Tracking

Weakly Supervised Action Learning with RNN based Fine-to-coarse Modeling

1 code implementation • CVPR 2017 • Alexander Richard, Hilde Kuehne, Juergen Gall

We present an approach for weakly supervised learning of human actions.

Action Segmentation Weakly-supervised Learning

A Bag-of-Words Equivalent Recurrent Neural Network for Action Recognition

1 code implementation • 23 Mar 2017 • Alexander Richard, Juergen Gall

In this work, we propose a recurrent neural network that is equivalent to the traditional bag-of-words approach but enables the application of discriminative training.

Action Recognition General Classification +2

Evaluating Deep Convolutional Neural Networks for Material Classification

no code implementations • 12 Mar 2017 • Grigorios Kalliatakis, Georgios Stamatiadis, Shoaib Ehsan, Ales Leonardis, Juergen Gall, Anca Sticlaru, Klaus D. McDonald-Maier

Determining the material category of a surface from an image is a demanding task in perception that is drawing increasing attention.

Classification General Classification +4

Detection of Human Rights Violations in Images: Can Convolutional Neural Networks help?

no code implementations • 12 Mar 2017 • Grigorios Kalliatakis, Shoaib Ehsan, Maria Fasli, Ales Leonardis, Juergen Gall, Klaus D. McDonald-Maier

We conduct a rigorous evaluation on a common ground by combining this dataset with different state-of-the-art deep convolutional architectures in order to achieve recognition of human rights violations.

PoseTrack: Joint Multi-Person Pose Estimation and Tracking

2 code implementations • CVPR 2017 • Umar Iqbal, Anton Milan, Juergen Gall

In this work, we introduce the challenging problem of joint multi-person pose estimation and tracking of an unknown number of persons in unconstrained videos.

Ranked #1 on Pose Tracking on Multi-Person PoseTrack

Multi-Person Pose Estimation Multi-Person Pose Estimation and Tracking +1

163

Weakly supervised learning of actions from transcripts

no code implementations • 7 Oct 2016 • Hilde Kuehne, Alexander Richard, Juergen Gall

Our system is based on the idea that, given a sequence of input data and a transcript, i.e., a list of the order in which the actions occur in the video, it is possible to infer the actions within the video stream, and thus learn the related action models without the need for any frame-based annotation.

Weakly-supervised Learning

Reconstructing Articulated Rigged Models from RGB-D Videos

no code implementations • 6 Sep 2016 • Dimitrios Tzionas, Juergen Gall

Although commercial and open-source software exist to reconstruct a static object from a sequence recorded with an RGB-D sensor, there is a lack of tools that build rigged models of articulated objects that deform realistically and can be used for tracking or animation.

Clustering Motion Segmentation +1

Multi-Person Pose Estimation with Local Joint-to-Person Associations

1 code implementation • 30 Aug 2016 • Umar Iqbal, Juergen Gall

To this end, we consider multi-person pose estimation as a joint-to-person association problem.

Ranked #8 on Multi-Person Pose Estimation on MPII Multi-Person

Keypoint Detection Multi-Person Pose Estimation +1

259

Temporal Action Detection Using a Statistical Language Model

1 code implementation • CVPR 2016 • Alexander Richard, Juergen Gall

While current approaches to action recognition on pre-segmented video clips already achieve high accuracies, temporal action detection is still far from comparably good results.

Action Detection Action Recognition +2

Weakly Supervised Learning of Affordances

no code implementations • 10 May 2016 • Abhilash Srikantha, Juergen Gall

Localizing functional regions of objects or affordances is an important aspect of scene understanding.

Human-Object Interaction Detection Image Segmentation +5

Pose for Action - Action for Pose

no code implementations • 13 Mar 2016 • Umar Iqbal, Martin Garbade, Juergen Gall

In this work we propose to utilize information about human actions to improve pose estimation in monocular videos.

Ranked #5 on Pose Estimation on UPenn Action

Action Recognition Pose Estimation +2

A Dual-Source Approach for 3D Pose Estimation from a Single Image

no code implementations • CVPR 2016 • Hashim Yasin, Umar Iqbal, Björn Krüger, Andreas Weber, Juergen Gall

To integrate both sources, we propose a dual-source approach that combines 2D pose estimation with efficient and robust 3D pose retrieval.

Ranked #22 on 3D Human Pose Estimation on HumanEva-I

2D Pose Estimation 3D Human Pose Estimation +3

An end-to-end generative framework for video segmentation and recognition

no code implementations • 7 Sep 2015 • Hilde Kuehne, Juergen Gall, Thomas Serre

We describe an end-to-end generative approach for the segmentation and recognition of human activities.

Video Segmentation Video Semantic Segmentation

Cooking in the kitchen: Recognizing and Segmenting Human Activities in Videos

no code implementations • 25 Aug 2015 • Hilde Kuehne, Juergen Gall, Thomas Serre

Through extensive system evaluations, we demonstrate that combining compact video representations based on Fisher Vectors with HMM-based modeling yields very significant gains in accuracy and that, when properly trained with sufficient training samples, structured temporal models outperform unstructured bag-of-words models by a large margin on the tested performance metric.

Action Recognition Temporal Action Localization

Capturing Hands in Action using Discriminative Salient Points and Physics Simulation

2 code implementations • 6 Jun 2015 • Dimitrios Tzionas, Luca Ballan, Abhilash Srikantha, Pablo Aponte, Marc Pollefeys, Juergen Gall

Hand motion capture is a popular research field, recently gaining more attention due to the ubiquity of RGB-D sensors.

From Categories to Subcategories: Large-Scale Image Classification With Partial Class Label Refinement

no code implementations • CVPR 2015 • Marko Ristin, Juergen Gall, Matthieu Guillaumin, Luc van Gool

Compared to approaches that disregard the extra coarse labeled data, we achieve a relative improvement in subcategory classification accuracy of up to 22% in our large-scale image classification experiments.

Classification General Classification +1

Depth sweep regression forests for estimating 3d human pose from images

no code implementations • BMVC 2014 • Ilya Kostrikov, Juergen Gall

We address the problem of estimating the 3D pose from monocular images.

Ranked #24 on 3D Human Pose Estimation on HumanEva-I

3D Human Pose Estimation Position +1

Incremental Learning of NCM Forests for Large-Scale Image Classification

no code implementations • CVPR 2014 • Marko Ristin, Matthieu Guillaumin, Juergen Gall, Luc van Gool

NCMFs not only outperform conventional random forests, but are also well suited for integrating new classes.

Classification General Classification +2

Human Pose Estimation Using Body Parts Dependent Joint Regressors

no code implementations • CVPR 2013 • Matthias Dantone, Juergen Gall, Christian Leistner, Luc van Gool

The second layer takes the estimated class distributions of the first one into account and is thereby able to predict joint locations by modeling the interdependence and co-occurrence of the parts.

Pose Estimation

Learning Probabilistic Non-Linear Latent Variable Models for Tracking Complex Activities

no code implementations • NeurIPS 2011 • Angela Yao, Juergen Gall, Luc van Gool, Raquel Urtasun

A common approach for handling the complexity and inherent ambiguities of 3D human pose estimation is to use pose priors learned from training data.

3D Human Pose Estimation
