Search Results for author: Rainer Stiefelhagen

Found 124 papers, 73 papers with code

Efficient Parameter-free Clustering Using First Neighbor Relations

1 code implementation 28 Feb 2019 M. Saquib Sarfraz, Vivek Sharma, Rainer Stiefelhagen

We present a new clustering method in the form of a single clustering equation that is able to directly discover groupings in the data.

Clustering

Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation

1 code implementation CVPR 2021 M. Saquib Sarfraz, Naila Murray, Vivek Sharma, Ali Diba, Luc van Gool, Rainer Stiefelhagen

Action segmentation refers to inferring boundaries of semantically consistent visual concepts in videos and is an important requirement for many video understanding tasks.

Action Segmentation Clustering +2

CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers

1 code implementation 9 Mar 2022 Jiaming Zhang, Huayao Liu, Kailun Yang, Xinxin Hu, Ruiping Liu, Rainer Stiefelhagen

Pixel-wise semantic segmentation of RGB images can be advanced by exploiting complementary features from the supplementary modality (X-modality).

Autonomous Vehicles Image Segmentation +5

MatchFormer: Interleaving Attention in Transformers for Feature Matching

1 code implementation 17 Mar 2022 Qing Wang, Jiaming Zhang, Kailun Yang, Kunyu Peng, Rainer Stiefelhagen

While detector-based methods coupled with feature descriptors struggle in low-texture scenes, CNN-based methods with a sequential extract-to-match pipeline fail to make use of the matching capacity of the encoder and tend to overburden the decoder for matching.

Homography Estimation Pose Estimation +1

A Pose-Sensitive Embedding for Person Re-Identification with Expanded Cross Neighborhood Re-Ranking

2 code implementations CVPR 2018 M. Saquib Sarfraz, Arne Schumann, Andreas Eberle, Rainer Stiefelhagen

In contrast to the recent direction of explicitly modeling body parts or correcting for misalignment based on these, we show that a rather straightforward inclusion of acquired camera view and/or the detected joint locations into a convolutional neural network helps to learn a very effective representation.

Person Re-Identification Re-Ranking +1

Content and Colour Distillation for Learning Image Translations with the Spatial Profile Loss

1 code implementation 1 Aug 2019 M. Saquib Sarfraz, Constantin Seibold, Haroon Khalid, Rainer Stiefelhagen

In this paper, we propose a novel method of computing the loss directly between the source and target images that enables proper distillation of shape/content and colour/style.

Image Super-Resolution Translation

Large Scale Holistic Video Understanding

1 code implementation ECCV 2020 Ali Diba, Mohsen Fayyaz, Vivek Sharma, Manohar Paluri, Jurgen Gall, Rainer Stiefelhagen, Luc van Gool

HVU is organized hierarchically in a semantic taxonomy that focuses on multi-label and multi-task video understanding as a comprehensive problem that encompasses the recognition of multiple semantic aspects in the dynamic scene.

Action Classification Action Recognition +7

Bending Reality: Distortion-aware Transformers for Adapting to Panoramic Semantic Segmentation

1 code implementation CVPR 2022 Jiaming Zhang, Kailun Yang, Chaoxiang Ma, Simon Reiß, Kunyu Peng, Rainer Stiefelhagen

To get around this domain difference and bring together semantic annotations from pinhole- and 360-degree surround-visuals, we propose to learn object deformations and panoramic image distortions in the Deformable Patch Embedding (DPE) and Deformable MLP (DMLP) components which blend into our Transformer for PAnoramic Semantic Segmentation (Trans4PASS) model.

Scene Understanding Semantic Segmentation +1

Behind Every Domain There is a Shift: Adapting Distortion-aware Vision Transformers for Panoramic Semantic Segmentation

1 code implementation 25 Jul 2022 Jiaming Zhang, Kailun Yang, Hao Shi, Simon Reiß, Kunyu Peng, Chaoxiang Ma, Haodong Fu, Philip H. S. Torr, Kaiwei Wang, Rainer Stiefelhagen

In this paper, we address panoramic semantic segmentation which is under-explored due to two critical challenges: (1) image distortions and object deformations on panoramas; (2) lack of semantic annotations in the 360-degree imagery.

Pseudo Label Segmentation +2

MedShapeNet -- A Large-Scale Dataset of 3D Medical Shapes for Computer Vision

1 code implementation 30 Aug 2023 Jianning Li, Zongwei Zhou, Jiancheng Yang, Antonio Pepe, Christina Gsaxner, Gijs Luijten, Chongyu Qu, Tiezheng Zhang, Xiaoxi Chen, Wenxuan Li, Marek Wodzinski, Paul Friedrich, Kangxian Xie, Yuan Jin, Narmada Ambigapathy, Enrico Nasca, Naida Solak, Gian Marco Melito, Viet Duc Vu, Afaque R. Memon, Christopher Schlachta, Sandrine de Ribaupierre, Rajnikant Patel, Roy Eagleson, Xiaojun Chen, Heinrich Mächler, Jan Stefan Kirschke, Ezequiel de la Rosa, Patrick Ferdinand Christ, Hongwei Bran Li, David G. Ellis, Michele R. Aizenberg, Sergios Gatidis, Thomas Küstner, Nadya Shusharina, Nicholas Heller, Vincent Andrearczyk, Adrien Depeursinge, Mathieu Hatt, Anjany Sekuboyina, Maximilian Löffler, Hans Liebl, Reuben Dorent, Tom Vercauteren, Jonathan Shapey, Aaron Kujawa, Stefan Cornelissen, Patrick Langenhuizen, Achraf Ben-Hamadou, Ahmed Rekik, Sergi Pujades, Edmond Boyer, Federico Bolelli, Costantino Grana, Luca Lumetti, Hamidreza Salehi, Jun Ma, Yao Zhang, Ramtin Gharleghi, Susann Beier, Arcot Sowmya, Eduardo A. Garza-Villarreal, Thania Balducci, Diego Angeles-Valdez, Roberto Souza, Leticia Rittner, Richard Frayne, Yuanfeng Ji, Vincenzo Ferrari, Soumick Chatterjee, Florian Dubost, Stefanie Schreiber, Hendrik Mattern, Oliver Speck, Daniel Haehn, Christoph John, Andreas Nürnberger, João Pedrosa, Carlos Ferreira, Guilherme Aresta, António Cunha, Aurélio Campilho, Yannick Suter, Jose Garcia, Alain Lalande, Vicky Vandenbossche, Aline Van Oevelen, Kate Duquesne, Hamza Mekhzoum, Jef Vandemeulebroucke, Emmanuel Audenaert, Claudia Krebs, Timo Van Leeuwen, Evie Vereecke, Hauke Heidemeyer, Rainer Röhrig, Frank Hölzle, Vahid Badeli, Kathrin Krieger, Matthias Gunzer, Jianxu Chen, Timo van Meegdenburg, Amin Dada, Miriam Balzer, Jana Fragemann, Frederic Jonske, Moritz Rempe, Stanislav Malorodov, Fin H. Bahnsen, Constantin Seibold, Alexander Jaus, Zdravko Marinov, Paul F. Jaeger, Rainer Stiefelhagen, Ana Sofia Santos, Mariana Lindo, André Ferreira, Victor Alves, Michael Kamp, Amr Abourayya, Felix Nensa, Fabian Hörst, Alexander Brehmer, Lukas Heine, Yannik Hanusrichter, Martin Weßling, Marcel Dudda, Lars E. Podleska, Matthias A. Fink, Julius Keyl, Konstantinos Tserpes, Moon-Sung Kim, Shireen Elhabian, Hans Lamecker, Dženan Zukić, Beatriz Paniagua, Christian Wachinger, Martin Urschler, Luc Duong, Jakob Wasserthal, Peter F. Hoyer, Oliver Basu, Thomas Maal, Max J. H. Witjes, Gregor Schiele, Ti-chiun Chang, Seyed-Ahmad Ahmadi, Ping Luo, Bjoern Menze, Mauricio Reyes, Thomas M. Deserno, Christos Davatzikos, Behrus Puladi, Pascal Fua, Alan L. Yuille, Jens Kleesiek, Jan Egger

For the medical domain, we present a large collection of anatomical shapes (e.g., bones, organs, vessels) and 3D models of surgical instruments, called MedShapeNet, created to facilitate the translation of data-driven vision algorithms to medical applications and to adapt SOTA vision algorithms to medical problems.

Anatomy Mixed Reality

Multi-modal Depression Estimation based on Sub-attentional Fusion

1 code implementation 13 Jul 2022 Ping-Cheng Wei, Kunyu Peng, Alina Roitberg, Kailun Yang, Jiaming Zhang, Rainer Stiefelhagen

Failure to timely diagnose and effectively treat depression leads to over 280 million people suffering from this psychological disorder worldwide.

Capturing Omni-Range Context for Omnidirectional Segmentation

1 code implementation CVPR 2021 Kailun Yang, Jiaming Zhang, Simon Reiß, Xinxin Hu, Rainer Stiefelhagen

Convolutional Networks (ConvNets) excel at semantic segmentation and have become a vital component for perception in autonomous driving.

Ranked #10 on Semantic Segmentation on DensePASS (using extra training data)

Autonomous Driving Image Segmentation +2

FishDreamer: Towards Fisheye Semantic Completion via Unified Image Outpainting and Segmentation

1 code implementation 24 Mar 2023 Hao Shi, Yu Li, Kailun Yang, Jiaming Zhang, Kunyu Peng, Alina Roitberg, Yaozu Ye, Huajian Ni, Kaiwei Wang, Rainer Stiefelhagen

This paper raises the new task of Fisheye Semantic Completion (FSC), where dense texture, structure, and semantics of a fisheye image are inferred even beyond the sensor field-of-view (FoV).

Image Outpainting Semantic Segmentation

MuscleMap: Towards Video-based Activated Muscle Group Estimation in the Wild

1 code implementation 2 Mar 2023 Kunyu Peng, David Schneider, Alina Roitberg, Kailun Yang, Jiaming Zhang, Chen Deng, Kaiyu Zhang, M. Saquib Sarfraz, Rainer Stiefelhagen

In this paper, we tackle the new task of video-based Activated Muscle Group Estimation (AMGE) aiming at identifying active muscle regions during physical activity in the wild.

Human Activity Recognition Knowledge Distillation +1

Anticipative Feature Fusion Transformer for Multi-Modal Action Anticipation

1 code implementation 23 Oct 2022 Zeyun Zhong, David Schneider, Michael Voit, Rainer Stiefelhagen, Jürgen Beyerer

Although human action anticipation is an inherently multi-modal task, state-of-the-art methods on well-known action anticipation datasets leverage this data by applying ensemble methods and averaging scores of unimodal anticipation networks.

Action Anticipation

Towards Unifying Anatomy Segmentation: Automated Generation of a Full-body CT Dataset via Knowledge Aggregation and Anatomical Guidelines

1 code implementation 25 Jul 2023 Alexander Jaus, Constantin Seibold, Kelsey Hermann, Alexandra Walter, Kristina Giske, Johannes Haubold, Jens Kleesiek, Rainer Stiefelhagen

We examine its plausibility and usefulness using three complementary checks: Human expert evaluation which approved the dataset, a Deep Learning usefulness benchmark on the BTCV dataset in which we achieve 85% dice score without using its training dataset, and medical validity checks.

Anatomy Pseudo Label +1

MASS: Multi-Attentional Semantic Segmentation of LiDAR Data for Dense Top-View Understanding

1 code implementation 1 Jul 2021 Kunyu Peng, Juncong Fei, Kailun Yang, Alina Roitberg, Jiaming Zhang, Frank Bieder, Philipp Heidenreich, Christoph Stiller, Rainer Stiefelhagen

At the heart of all automated driving systems is the ability to sense the surroundings, e.g., through semantic segmentation of LiDAR sequences, which has experienced remarkable progress due to the release of large datasets such as SemanticKITTI and nuScenes-LidarSeg.

3D Object Detection Graph Attention +4

StoryGraphs: Visualizing Character Interactions as a Timeline

1 code implementation CVPR 2014 Makarand Tapaswi, Martin Bauml, Rainer Stiefelhagen

We present a novel way to automatically summarize and represent the storyline of a TV episode by visualizing character interactions as a chart.

Person Identification

CoBEV: Elevating Roadside 3D Object Detection with Depth and Height Complementarity

1 code implementation 4 Oct 2023 Hao Shi, Chengshan Pang, Jiaming Zhang, Kailun Yang, Yuhao Wu, Huajian Ni, Yining Lin, Rainer Stiefelhagen, Kaiwei Wang

Roadside camera-driven 3D object detection is a crucial task in intelligent transportation systems, which extends the perception range beyond the limitations of vision-centric vehicles and enhances road safety.

feature selection Monocular 3D Object Detection +1

RelaMiX: Exploring Few-Shot Adaptation in Video-based Action Recognition

2 code implementations 15 May 2023 Kunyu Peng, Di Wen, David Schneider, Jiaming Zhang, Kailun Yang, M. Saquib Sarfraz, Rainer Stiefelhagen, Alina Roitberg

Domain adaptation is essential for activity recognition to ensure accurate and robust performance across diverse environments, sensor types, and data sources.

Action Recognition Unsupervised Domain Adaptation

Trans4Trans: Efficient Transformer for Transparent Object Segmentation to Help Visually Impaired People Navigate in the Real World

1 code implementation 7 Jul 2021 Jiaming Zhang, Kailun Yang, Angela Constantinescu, Kunyu Peng, Karin Müller, Rainer Stiefelhagen

Common fully glazed facades and transparent objects present architectural barriers and impede the mobility of people with low vision or blindness; for instance, a path detected behind a glass door is inaccessible unless it is correctly perceived and reacted to.

Navigate Semantic Segmentation +1

Trans4Trans: Efficient Transformer for Transparent Object and Semantic Scene Segmentation in Real-World Navigation Assistance

1 code implementation 20 Aug 2021 Jiaming Zhang, Kailun Yang, Angela Constantinescu, Kunyu Peng, Karin Müller, Rainer Stiefelhagen

In this paper, we build a wearable system with a novel dual-head Transformer for Transparency (Trans4Trans) perception model, which can segment general and transparent objects.

Ranked #2 on Semantic Segmentation on DADA-seg (using extra training data)

Navigate Scene Segmentation +1

Open Scene Understanding: Grounded Situation Recognition Meets Segment Anything for Helping People with Visual Impairments

1 code implementation 15 Jul 2023 Ruiping Liu, Jiaming Zhang, Kunyu Peng, Junwei Zheng, Ke Cao, Yufan Chen, Kailun Yang, Rainer Stiefelhagen

Grounded Situation Recognition (GSR) is capable of recognizing and interpreting visual scenes in a contextually intuitive way, yielding salient activities (verbs) and the involved entities (roles) depicted in images.

Grounded Situation Recognition Navigate +1

ISSAFE: Improving Semantic Segmentation in Accidents by Fusing Event-based Data

1 code implementation 20 Aug 2020 Jiaming Zhang, Kailun Yang, Rainer Stiefelhagen

Ensuring the safety of all traffic participants is a prerequisite for bringing intelligent vehicles closer to practical applications.

Autonomous Vehicles Benchmarking +2

Exploring Event-driven Dynamic Context for Accident Scene Segmentation

1 code implementation 9 Dec 2021 Jiaming Zhang, Kailun Yang, Rainer Stiefelhagen

Moreover, in order to evaluate the segmentation performance in traffic accidents, we provide a pixel-wise annotated accident dataset, namely DADA-seg, which contains a variety of critical scenarios from traffic accidents.

Ranked #3 on Semantic Segmentation on DADA-seg (using extra training data)

Scene Segmentation Segmentation

TransKD: Transformer Knowledge Distillation for Efficient Semantic Segmentation

2 code implementations 27 Feb 2022 Ruiping Liu, Kailun Yang, Alina Roitberg, Jiaming Zhang, Kunyu Peng, Huayao Liu, Yaonan Wang, Rainer Stiefelhagen

Semantic segmentation benchmarks in the realm of autonomous driving are dominated by large pre-trained transformers, yet their widespread adoption is impeded by substantial computational costs and prolonged training durations.

Autonomous Driving Knowledge Distillation +3

Navigating Open Set Scenarios for Skeleton-based Action Recognition

1 code implementation 11 Dec 2023 Kunyu Peng, Cheng Yin, Junwei Zheng, Ruiping Liu, David Schneider, Jiaming Zhang, Kailun Yang, M. Saquib Sarfraz, Rainer Stiefelhagen, Alina Roitberg

In real-world scenarios, human actions often fall outside the distribution of training data, making it crucial for models to recognize known actions and reject unknown ones.

Novelty Detection Open Set Action Recognition +3

Anchor-free Small-scale Multispectral Pedestrian Detection

1 code implementation 19 Aug 2020 Alexander Wolpert, Michael Teutsch, M. Saquib Sarfraz, Rainer Stiefelhagen

In this way, we can both simplify the network architecture and achieve higher detection performance, especially for pedestrians under occlusion or at low object resolution.

Autonomous Driving Data Augmentation +3

Panoramic Panoptic Segmentation: Towards Complete Surrounding Understanding via Unsupervised Contrastive Learning

1 code implementation 1 Mar 2021 Alexander Jaus, Kailun Yang, Rainer Stiefelhagen

In order to overcome the lack of annotated panoramic images, we propose a framework which allows model training on standard pinhole images and transfers the learned features to a different domain.

Contrastive Learning Panoptic Segmentation +2

Delving Deep into One-Shot Skeleton-based Action Recognition with Diverse Occlusions

2 code implementations 23 Feb 2022 Kunyu Peng, Alina Roitberg, Kailun Yang, Jiaming Zhang, Rainer Stiefelhagen

Yet, the research of data-scarce recognition from skeleton sequences, such as one-shot action recognition, does not explicitly consider occlusions despite their everyday pervasiveness.

Action Classification Action Recognition +2

Panoramic Panoptic Segmentation: Insights Into Surrounding Parsing for Mobile Agents via Unsupervised Contrastive Learning

1 code implementation 21 Jun 2022 Alexander Jaus, Kailun Yang, Rainer Stiefelhagen

In order to overcome the lack of annotated panoramic images, we propose a framework which allows model training on standard pinhole images and transfers the learned features to the panoramic domain in a cost-minimizing way.

Contrastive Learning Domain Generalization +3

Deep Multimodal Feature Encoding for Video Ordering

1 code implementation 5 Apr 2020 Vivek Sharma, Makarand Tapaswi, Rainer Stiefelhagen

True understanding of videos comes from a joint analysis of all its modalities: the video frames, the audio track, and any accompanying text such as closed captions.

Action Recognition

DensePASS: Dense Panoramic Semantic Segmentation via Unsupervised Domain Adaptation with Attention-Augmented Context Exchange

1 code implementation 13 Aug 2021 Chaoxiang Ma, Jiaming Zhang, Kailun Yang, Alina Roitberg, Rainer Stiefelhagen

First, we formalize the task of unsupervised domain adaptation for panoramic semantic segmentation, where a network trained on labelled examples from the source domain of pinhole camera data is deployed in a different target domain of panoramic images, for which no labels are available.

Segmentation Semantic Segmentation +1

Transfer beyond the Field of View: Dense Panoramic Semantic Segmentation via Unsupervised Domain Adaptation

1 code implementation 21 Oct 2021 Jiaming Zhang, Chaoxiang Ma, Kailun Yang, Alina Roitberg, Kunyu Peng, Rainer Stiefelhagen

We look at this problem from the perspective of domain adaptation and bring panoramic semantic segmentation to a setting, where labelled training data originates from a different distribution of conventional pinhole camera images.

Ranked #7 on Semantic Segmentation on DensePASS (using extra training data)

Autonomous Vehicles Segmentation +2

Let's Play for Action: Recognizing Activities of Daily Living by Learning from Life Simulation Video Games

1 code implementation 12 Jul 2021 Alina Roitberg, David Schneider, Aulia Djamal, Constantin Seibold, Simon Reiß, Rainer Stiefelhagen

Recognizing Activities of Daily Living (ADL) is a vital process for intelligent assistive robots, but collecting large annotated datasets requires time-consuming temporal labeling and raises privacy concerns, e.g., if the data is collected in a real household.

Action Classification Activity Recognition +2

TransDARC: Transformer-based Driver Activity Recognition with Latent Space Feature Calibration

1 code implementation 2 Mar 2022 Kunyu Peng, Alina Roitberg, Kailun Yang, Jiaming Zhang, Rainer Stiefelhagen

This module operates in the latent feature space, enriching and diversifying the training set at the feature level in order to improve generalization to novel data appearances (e.g., sensor changes) and general feature quality.

Human Activity Recognition

Trans4Map: Revisiting Holistic Bird's-Eye-View Mapping from Egocentric Images to Allocentric Semantics with Vision Transformers

1 code implementation 13 Jul 2022 Chang Chen, Jiaming Zhang, Kailun Yang, Kunyu Peng, Rainer Stiefelhagen

Humans have an innate ability to sense their surroundings, as they can extract the spatial representation from the egocentric perception and form an allocentric semantic map via spatial transformation and memory updating.

Semantic Segmentation

OAFuser: Towards Omni-Aperture Fusion for Light Field Semantic Segmentation

2 code implementations 28 Jul 2023 Fei Teng, Jiaming Zhang, Kunyu Peng, Yaonan Wang, Rainer Stiefelhagen, Kailun Yang

To avoid feature loss during network propagation and simultaneously streamline the redundant information from the light field camera, we present a simple yet very effective Sub-Aperture Fusion Module (SAFM) to embed sub-aperture images into angular features without any additional memory cost.

Autonomous Driving Scene Understanding +1

Accurate Fine-Grained Segmentation of Human Anatomy in Radiographs via Volumetric Pseudo-Labeling

1 code implementation 6 Jun 2023 Constantin Seibold, Alexander Jaus, Matthias A. Fink, Moon Kim, Simon Reiß, Ken Herrmann, Jens Kleesiek, Rainer Stiefelhagen

Results: Our resulting segmentation models demonstrated remarkable performance on CXR, with a high average model-annotator agreement between two radiologists with mIoU scores of 0.93 and 0.85 for frontal and lateral anatomy, while inter-annotator agreement remained at 0.95 and 0.83 mIoU.

Anatomy Computed Tomography (CT) +2

Quantized Distillation: Optimizing Driver Activity Recognition Models for Resource-Constrained Environments

1 code implementation 10 Nov 2023 Calvin Tanama, Kunyu Peng, Zdravko Marinov, Rainer Stiefelhagen, Alina Roitberg

The framework enhances 3D MobileNet, a neural architecture optimized for speed in video classification, by incorporating knowledge distillation and model quantization to balance model accuracy and computational efficiency.

Activity Recognition Autonomous Driving +4

Fourier Prompt Tuning for Modality-Incomplete Scene Segmentation

1 code implementation 30 Jan 2024 Ruiping Liu, Jiaming Zhang, Kunyu Peng, Yufan Chen, Ke Cao, Junwei Zheng, M. Saquib Sarfraz, Kailun Yang, Rainer Stiefelhagen

Integrating information from multiple modalities enhances the robustness of scene perception systems in autonomous vehicles, providing a more comprehensive and reliable sensory framework.

Autonomous Vehicles Scene Segmentation

Pose2Drone: A Skeleton-Pose-based Framework for Human-Drone Interaction

1 code implementation 27 May 2021 Zdravko Marinov, Stanka Vasileva, Qing Wang, Constantin Seibold, Jiaming Zhang, Rainer Stiefelhagen

Our framework provides the functionality to control the movement of the drone with simple arm gestures and to follow the user while keeping a safe distance.

Pose Estimation

Should I take a walk? Estimating Energy Expenditure from Video Data

1 code implementation 1 Feb 2022 Kunyu Peng, Alina Roitberg, Kailun Yang, Jiaming Zhang, Rainer Stiefelhagen

To study this under-researched task, we introduce Vid2Burn, an omni-source benchmark for estimating caloric expenditure from video data featuring both high- and low-intensity activities, for which we derive energy expenditure annotations based on models established in medical literature.

Video Recognition

Revisiting Click-based Interactive Video Object Segmentation

1 code implementation 3 Mar 2022 Stephane Vujasinovic, Sebastian Bullinger, Stefan Becker, Norbert Scherer-Negenborn, Michael Arens, Rainer Stiefelhagen

While current methods for interactive Video Object Segmentation (iVOS) rely on scribble-based interactions to generate precise object masks, we propose a Click-based interactive Video Object Segmentation (CiVOS) framework to simplify the required user workload as much as possible.

Interactive Video Object Segmentation Object +3

Towards Privacy-Supporting Fall Detection via Deep Unsupervised RGB2Depth Adaptation

1 code implementation 23 Aug 2023 Hejun Xiao, Kunyu Peng, Xiangsheng Huang, Alina Roitberg, Hao Li, Zhaohui Wang, Rainer Stiefelhagen

In this paper, we introduce a privacy-supporting solution that makes the RGB-trained model applicable in depth domain and utilizes depth data at test time for fall detection.

Domain Adaptation

Elevating Skeleton-Based Action Recognition with Efficient Multi-Modality Self-Supervision

1 code implementation 21 Sep 2023 Yiping Wei, Kunyu Peng, Alina Roitberg, Jiaming Zhang, Junwei Zheng, Ruiping Liu, Yufan Chen, Kailun Yang, Rainer Stiefelhagen

These works overlooked the differences in performance among modalities, which led to the propagation of erroneous knowledge between modalities, while only three fundamental modalities, i.e., joints, bones, and motions, are used; hence no additional modalities are explored.

Action Recognition Knowledge Distillation +3

Affect-DML: Context-Aware One-Shot Recognition of Human Affect using Deep Metric Learning

1 code implementation 30 Nov 2021 Kunyu Peng, Alina Roitberg, David Schneider, Marios Koulakis, Kailun Yang, Rainer Stiefelhagen

Human affect recognition is a well-established research area with numerous applications, e.g., in psychological care, but existing methods assume that all emotions-of-interest are given a priori as annotated training examples.

Emotion Recognition Metric Learning +1

EchoTrack: Auditory Referring Multi-Object Tracking for Autonomous Driving

1 code implementation 28 Feb 2024 Jiacheng Lin, Jiajun Chen, Kunyu Peng, Xuan He, Zhiyong Li, Rainer Stiefelhagen, Kailun Yang

This paper introduces the task of Auditory Referring Multi-Object Tracking (AR-MOT), which dynamically tracks specific objects in a video sequence based on audio expressions and appears as a challenging problem in autonomous driving.

Autonomous Driving Multi-Object Tracking +1

Skeleton-Based Human Action Recognition with Noisy Labels

1 code implementation 15 Mar 2024 Yi Xu, Kunyu Peng, Di Wen, Ruiping Liu, Junwei Zheng, Yufan Chen, Jiaming Zhang, Alina Roitberg, Kailun Yang, Rainer Stiefelhagen

In this study, we bridge this gap by implementing a framework that augments well-established skeleton-based human action recognition methods with label-denoising strategies from various research areas to serve as the initial benchmark.

Action Recognition Denoising +3

Multimodal Generation of Novel Action Appearances for Synthetic-to-Real Recognition of Activities of Daily Living

1 code implementation 3 Aug 2022 Zdravko Marinov, David Schneider, Alina Roitberg, Rainer Stiefelhagen

We tackle this challenge and introduce an activity domain generation framework which creates novel ADL appearances (novel domains) from different existing activity modalities (source domains) inferred from video training data.

Activity Recognition multimodal generation +1

Weakly Supervised Object Discovery by Generative Adversarial & Ranking Networks

no code implementations 22 Nov 2017 Ali Diba, Vivek Sharma, Rainer Stiefelhagen, Luc van Gool

We approach GANs with a novel training method and learning objective, to discover multiple object instances for three cases: 1) synthesizing a picture of a specific object within a cluttered scene; 2) localizing different categories in images for weakly supervised object detection; and 3) improving object discovery in object detection pipelines.

Object object-detection +2

Classification Driven Dynamic Image Enhancement

no code implementations 20 Oct 2017 Vivek Sharma, Ali Diba, Davy Neven, Michael S. Brown, Luc van Gool, Rainer Stiefelhagen

In this paper, we are interested in learning CNNs that can emulate image enhancement and restoration, but with the overall goal to improve image classification and not necessarily human perception.

Classification General Classification +3

3D Trajectory Reconstruction of Dynamic Objects Using Planarity Constraints

no code implementations 16 Nov 2017 Sebastian Bullinger, Christoph Bodensteiner, Michael Arens, Rainer Stiefelhagen

We apply Structure from Motion techniques to object and background images to determine for each frame camera poses relative to object instances and background structures.

Object Optical Flow Estimation +1

Deep View-Sensitive Pedestrian Attribute Inference in an end-to-end Model

no code implementations 19 Jul 2017 M. Saquib Sarfraz, Arne Schumann, Yan Wang, Rainer Stiefelhagen

The visual cues hinting at attributes can be strongly localized, and inference of person attributes such as hair, backpack, shorts, etc., is highly dependent on the acquired view of the pedestrian.

Attribute Multi-Label Image Classification +2

Automatic Discovery, Association Estimation and Learning of Semantic Attributes for a Thousand Categories

no code implementations CVPR 2017 Ziad Al-Halah, Rainer Stiefelhagen

Furthermore, we demonstrate that our model outperforms the state-of-the-art in zero-shot learning on three data sets: ImageNet, Animals with Attributes and aPascal/aYahoo.

Attribute Zero-Shot Learning

Relaxed Earth Mover's Distances for Chain- and Tree-connected Spaces and their use as a Loss Function in Deep Learning

no code implementations 22 Nov 2016 Manuel Martinez, Monica Haurilet, Ziad Al-Halah, Makarand Tapaswi, Rainer Stiefelhagen

The Earth Mover's Distance (EMD) computes the optimal cost of transforming one distribution into another, given a known transport metric between them.

Small Data Image Classification

Deep Perceptual Mapping for Cross-Modal Face Recognition

no code implementations 20 Jan 2016 M. Saquib Sarfraz, Rainer Stiefelhagen

Our method bridges the drop in performance due to the modality gap by more than 40%.

Face Recognition

How to Transfer? Zero-Shot Object Recognition via Hierarchical Transfer of Semantic Attributes

no code implementations 1 Apr 2016 Ziad Al-Halah, Rainer Stiefelhagen

We propose to capture these variations in a hierarchical model that expands the knowledge source with additional abstraction levels of attributes.

Attribute Object Recognition +1

What's the point? Frame-wise Pointing Gesture Recognition with Latent-Dynamic Conditional Random Fields

no code implementations 20 Oct 2015 Christian Wittner, Boris Schauerte, Rainer Stiefelhagen

We use Latent-Dynamic Conditional Random Fields to perform skeleton-based pointing gesture classification at each time instance of a video sequence, where we achieve a frame-wise pointing accuracy of roughly 83%.

General Classification Gesture Recognition

Deep Perceptual Mapping for Thermal to Visible Face Recognition

no code implementations 10 Jul 2015 M. Saquib Sarfraz, Rainer Stiefelhagen

Cross-modal face matching between the thermal and visible spectrum is a much desired capability for night-time surveillance and security applications.

Face Recognition

On the Distribution of Salient Objects in Web Images and its Influence on Salient Object Detection

no code implementations 10 Jan 2015 Boris Schauerte, Rainer Stiefelhagen

Tseng et al. have shown that the photographer's tendency to place interesting objects in the center is a likely cause for the center bias of eye fixations.

Object object-detection +3

Taming the Cross Entropy Loss

no code implementations 11 Oct 2018 Manuel Martinez, Rainer Stiefelhagen

We present the Tamed Cross Entropy (TCE) loss function, a robust derivative of the standard Cross Entropy (CE) loss used in deep learning for classification tasks.

General Classification

Informed Democracy: Voting-based Novelty Detection for Action Recognition

no code implementations 30 Oct 2018 Alina Roitberg, Ziad Al-Halah, Rainer Stiefelhagen

While it is common in activity recognition to assume a closed-set setting, i.e., test samples are always of training categories, this assumption is impractical in a real-world scenario.

Action Classification Action Recognition +2

Classification-Driven Dynamic Image Enhancement

no code implementations CVPR 2018 Vivek Sharma, Ali Diba, Davy Neven, Michael S. Brown, Luc van Gool, Rainer Stiefelhagen

In this paper, we are interested in learning CNNs that can emulate image enhancement and restoration, but with the overall goal to improve image classification and not necessarily human perception.

Classification General Classification +3

3D Vehicle Trajectory Reconstruction in Monocular Video Data Using Environment Structure Constraints

no code implementations ECCV 2018 Sebastian Bullinger, Christoph Bodensteiner, Michael Arens, Rainer Stiefelhagen

We apply Structure from Motion techniques to vehicle and background images to determine for each frame camera poses relative to vehicle instances and background structures.

Optical Flow Estimation Semantic Segmentation

Can Image Enhancement be Beneficial to Find Smoke Images in Laparoscopic Surgery?

no code implementations 27 Dec 2018 Congcong Wang, Vivek Sharma, Yu Fan, Faouzi Alaya Cheikh, Azeddine Beghdadi, Ole Jacob Elle, Rainer Stiefelhagen

For feature extraction, we use statistical features based on bivariate histogram distribution of gradient magnitude (GM) and Laplacian of Gaussian (LoG).

General Classification Image Enhancement +1

Book2Movie: Aligning Video Scenes With Book Chapters

no code implementations CVPR 2015 Makarand Tapaswi, Martin Bauml, Rainer Stiefelhagen

Such an alignment facilitates finding differences between the adaptation and the original source, and also acts as a basis for deriving rich descriptions from the novel for the video clips.

Video Alignment

DynamoNet: Dynamic Action and Motion Network

no code implementations ICCV 2019 Ali Diba, Vivek Sharma, Luc van Gool, Rainer Stiefelhagen

To this end, we introduce a novel unified spatio-temporal 3D-CNN architecture (DynamoNet) that jointly optimizes video classification and motion representation learning by predicting future frames as a multi-task learning problem.

Action Recognition Classification +5

Detective: An Attentive Recurrent Model for Sparse Object Detection

no code implementations 25 Apr 2020 Amine Kechaou, Manuel Martinez, Monica Haurilet, Rainer Stiefelhagen

At each iteration, our decoder focuses on the relevant parts of the image using an attention mechanism, and then estimates the object's class and the bounding box coordinates.

Object object-detection +1

Can we cover navigational perception needs of the visually impaired by panoptic segmentation?

no code implementations 20 Jul 2020 Wei Mao, Jiaming Zhang, Kailun Yang, Rainer Stiefelhagen

Navigational perception for visually impaired people has been substantially promoted by both classic and deep learning based segmentation methods.

Instance Segmentation Panoptic Segmentation +1

Unsupervised Domain Adaptation by Uncertain Feature Alignment

1 code implementation 14 Sep 2020 Tobias Ringwald, Rainer Stiefelhagen

Unsupervised domain adaptation (UDA) deals with the adaptation of models from a given source domain with labeled data to an unlabeled target domain.

Unsupervised Domain Adaptation

Uncertainty-sensitive Activity Recognition: a Reliability Benchmark and the CARING Models

no code implementations 2 Jan 2021 Alina Roitberg, Monica Haurilet, Manuel Martinez, Rainer Stiefelhagen

While temperature scaling alone drastically improves the reliability of the confidence values, our CARING method consistently leads to the best uncertainty estimates in all benchmark settings.

Action Recognition Image Classification

Perception Framework through Real-Time Semantic Segmentation and Scene Recognition on a Wearable System for the Visually Impaired

no code implementations 6 Mar 2021 Yingzhi Zhang, Haoye Chen, Kailun Yang, Jiaming Zhang, Rainer Stiefelhagen

As scene information, including objectness and scene type, is important for people with visual impairment, in this work we present a multi-task efficient perception system for the scene parsing and recognition tasks.

Real-Time Semantic Segmentation Scene Recognition

Vi2CLR: Video and Image for Visual Contrastive Learning of Representation

no code implementations ICCV 2021 Ali Diba, Vivek Sharma, Reza Safdari, Dariush Lotfi, Saquib Sarfraz, Rainer Stiefelhagen, Luc van Gool

In this paper, we introduce a novel self-supervised visual representation learning method which understands both images and videos in a joint learning fashion.

Action Recognition Clustering +2

UBR$^2$S: Uncertainty-Based Resampling and Reweighting Strategy for Unsupervised Domain Adaptation

1 code implementation 22 Oct 2021 Tobias Ringwald, Rainer Stiefelhagen

Unsupervised domain adaptation (UDA) deals with the adaptation process of a model to an unlabeled target domain while annotated data is only available for a given source domain.

Unsupervised Domain Adaptation

Adaptiope: A Modern Benchmark for Unsupervised Domain Adaptation

1 code implementation WACV 2021 Tobias Ringwald, Rainer Stiefelhagen

Unsupervised domain adaptation (UDA) deals with the adaptation process of a given source domain with labeled training data to a target domain for which only unannotated data is available.

Unsupervised Domain Adaptation

Certainty Volume Prediction for Unsupervised Domain Adaptation

1 code implementation 3 Nov 2021 Tobias Ringwald, Rainer Stiefelhagen

Unsupervised domain adaptation (UDA) deals with the problem of classifying unlabeled target domain data while labeled data is only available for a different source domain.

Unsupervised Domain Adaptation

Reference-guided Pseudo-Label Generation for Medical Semantic Segmentation

no code implementations 1 Dec 2021 Constantin Seibold, Simon Reiß, Jens Kleesiek, Rainer Stiefelhagen

Following this thought, we use a small number of labeled images as reference material and match pixels in an unlabeled image to the semantics of the best fitting pixel in a reference set.

Anatomy Pseudo Label +2

Continuous Self-Localization on Aerial Images Using Visual and Lidar Sensors

no code implementations 7 Mar 2022 Florian Fervers, Sebastian Bullinger, Christoph Bodensteiner, Michael Arens, Rainer Stiefelhagen

Our method is the first to utilize on-board cameras in an end-to-end differentiable model for metric self-localization on unseen orthophotos.

Metric Learning

A Comparative Analysis of Decision-Level Fusion for Multimodal Driver Behaviour Understanding

no code implementations 10 Apr 2022 Alina Roitberg, Kunyu Peng, Zdravko Marinov, Constantin Seibold, David Schneider, Rainer Stiefelhagen

Visual recognition inside the vehicle cabin leads to safer driving and more intuitive human-vehicle interaction but such systems face substantial obstacles as they need to capture different granularities of driver behaviour while dealing with highly limited body visibility and changing illumination.

Is my Driver Observation Model Overconfident? Input-guided Calibration Networks for Reliable and Interpretable Confidence Estimates

no code implementations 10 Apr 2022 Alina Roitberg, Kunyu Peng, David Schneider, Kailun Yang, Marios Koulakis, Manuel Martinez, Rainer Stiefelhagen

In this work, we examine for the first time how well the confidence values of modern driver observation models match the probability of the correct outcome and show that raw neural network-based approaches tend to significantly overestimate their prediction quality.

Action Recognition Image Classification

ModSelect: Automatic Modality Selection for Synthetic-to-Real Domain Generalization

no code implementations 19 Aug 2022 Zdravko Marinov, Alina Roitberg, David Schneider, Rainer Stiefelhagen

Modality selection is an important step when designing multimodal systems, especially in the case of cross-domain activity recognition as certain modalities are more robust to domain shift than others.

Cross-Domain Activity Recognition Domain Generalization

Detailed Annotations of Chest X-Rays via CT Projection for Report Understanding

no code implementations 7 Oct 2022 Constantin Seibold, Simon Reiß, Saquib Sarfraz, Matthias A. Fink, Victoria Mayer, Jan Sellner, Moon Sung Kim, Klaus H. Maier-Hein, Jens Kleesiek, Rainer Stiefelhagen

To exploit anatomical structures in this scenario, we present a sophisticated automatic pipeline to gather and integrate human bodily structures from computed tomography datasets, which we incorporate in our PAXRay: A Projected dataset for the segmentation of Anatomical structures in X-Ray data.

Anatomy Phrase Grounding

Uncertainty-aware Vision-based Metric Cross-view Geolocalization

no code implementations CVPR 2023 Florian Fervers, Sebastian Bullinger, Christoph Bodensteiner, Michael Arens, Rainer Stiefelhagen

This paper proposes a novel method for vision-based metric cross-view geolocalization (CVGL) that matches the camera images captured from a ground-based vehicle with an aerial image to determine the vehicle's geo-pose.

Autonomous Driving Pseudo Label

Guiding the Guidance: A Comparative Analysis of User Guidance Signals for Interactive Segmentation of Volumetric Images

no code implementations 13 Mar 2023 Zdravko Marinov, Rainer Stiefelhagen, Jens Kleesiek

To address this, we conduct a comparative study of existing guidance signals by training interactive models with different signals and parameter settings to identify crucial parameters for the model's design.

Anatomy Interactive Segmentation +1

Tightly-Coupled LiDAR-Visual SLAM Based on Geometric Features for Mobile Agents

no code implementations 15 Jul 2023 Ke Cao, Ruiping Liu, Ze Wang, Kunyu Peng, Jiaming Zhang, Junwei Zheng, Zhifeng Teng, Kailun Yang, Rainer Stiefelhagen

On the other hand, the entire line segment detected by the visual subsystem overcomes the limitation of the LiDAR subsystem, which can only perform the local calculation for geometric features.

Autonomous Navigation Pose Estimation +2

Deep Interactive Segmentation of Medical Images: A Systematic Review and Taxonomy

no code implementations 23 Nov 2023 Zdravko Marinov, Paul F. Jäger, Jan Egger, Jens Kleesiek, Rainer Stiefelhagen

Interactive segmentation is a crucial research area in medical image analysis aiming to boost the efficiency of costly annotations by incorporating human feedback.

Interactive Segmentation

C-BEV: Contrastive Bird's Eye View Training for Cross-View Image Retrieval and 3-DoF Pose Estimation

no code implementations 13 Dec 2023 Florian Fervers, Sebastian Bullinger, Christoph Bodensteiner, Michael Arens, Rainer Stiefelhagen

To find the geolocation of a street-view image, cross-view geolocalization (CVGL) methods typically perform image retrieval on a database of georeferenced aerial images and determine the location from the visually most similar match.

Image Retrieval Pose Estimation +1

RoDLA: Benchmarking the Robustness of Document Layout Analysis Models

no code implementations 21 Mar 2024 Yufan Chen, Jiaming Zhang, Kunyu Peng, Junwei Zheng, Ruiping Liu, Philip Torr, Rainer Stiefelhagen

To address this, we are the first to introduce a robustness benchmark for DLA models, which includes 450K document images of three datasets.

Benchmarking Document Layout Analysis

Rethinking Annotator Simulation: Realistic Evaluation of Whole-Body PET Lesion Interactive Segmentation Methods

no code implementations 2 Apr 2024 Zdravko Marinov, Moon Kim, Jens Kleesiek, Rainer Stiefelhagen

In an initial user study involving four annotators, we assess existing robot users using our proposed metrics and find that robot users significantly deviate in performance and annotation behavior compared to real annotators.

Interactive Segmentation Segmentation
