1 code implementation • 24 Dec 2024 • Kunyu Peng, Di Wen, Sarfraz M. Saquib, Yufan Chen, Junwei Zheng, David Schneider, Kailun Yang, Jiamin Wu, Alina Roitberg, Rainer Stiefelhagen
Open-Set Domain Generalization (OSDG) is a challenging task requiring models to accurately predict familiar categories while minimizing confidence for unknown categories to effectively reject them in unseen domains.
no code implementations • 4 Dec 2024 • Ruiping Liu, Jiaming Zhang, Angela Schön, Karin Müller, Junwei Zheng, Kailun Yang, Kathrin Gerling, Rainer Stiefelhagen
Assistive technology can be leveraged by blind people when searching for objects in their daily lives.
1 code implementation • 25 Nov 2024 • Jie Hu, Junwei Zheng, Jiale Wei, Jiaming Zhang, Rainer Stiefelhagen
Wide-FoV cameras, like fisheye and panoramic setups, are essential for broader perception but introduce significant distortions in 180° and 360° images, complicating dense prediction tasks.
1 code implementation • 21 Nov 2024 • Qihao Yuan, Jiaming Zhang, Kailai Li, Rainer Stiefelhagen
3D visual grounding (3DVG) aims to locate objects in a 3D scene with natural language descriptions.
1 code implementation • 31 Oct 2024 • David Schneider, Simon Reiß, Marco Kugler, Alexander Jaus, Kunyu Peng, Susanne Sutschet, M. Saquib Sarfraz, Sven Matthiesen, Rainer Stiefelhagen
In this work, we address this issue by establishing Muscles in Time (MinT), a large-scale synthetic muscle activation dataset.
no code implementations • 24 Oct 2024 • Alexander Jaus, Constantin Seibold, Simon Reiß, Zdravko Marinov, Keyi Li, Zeling Ye, Stefan Krieg, Jens Kleesiek, Rainer Stiefelhagen
We present Connected-Component (CC)-Metrics, a novel semantic segmentation evaluation protocol, targeted to align existing semantic segmentation metrics to a multi-instance detection scenario in which each connected component matters.
no code implementations • 22 Oct 2024 • Lena Heinemann, Alexander Jaus, Zdravko Marinov, Moon Kim, Maria Francesca Spadea, Jens Kleesiek, Rainer Stiefelhagen
Within this work, we introduce LIMIS: The first purely language-based interactive medical image segmentation model.
no code implementations • 22 Oct 2024 • David Schneider, Sina Sajadmanesh, Vikash Sehwag, Saquib Sarfraz, Rainer Stiefelhagen, Lingjuan Lyu, Vivek Sharma
Prevalent methods tackling this problem use differential privacy (DP) or obfuscation techniques to protect the privacy of individuals.
1 code implementation • 26 Sep 2024 • Kunyu Peng, Di Wen, Kailun Yang, Ao Luo, Yufan Chen, Jia Fu, M. Saquib Sarfraz, Alina Roitberg, Rainer Stiefelhagen
In this paper, we observe that an adaptive domain scheduler benefits more in OSDG compared with prefixed sequential and random domain schedulers.
1 code implementation • 25 Sep 2024 • Florian Fervers, Sebastian Bullinger, Christoph Bodensteiner, Michael Arens, Rainer Stiefelhagen
This work presents a method that is able to predict the geolocation of a street-view photo taken in the wild within a state-sized search region by matching against a database of aerial reference imagery.
no code implementations • 21 Sep 2024 • Xin Jiang, Junwei Zheng, Ruiping Liu, Jiahang Li, Jiaming Zhang, Sven Matthiesen, Rainer Stiefelhagen
As Vision-Language Models (VLMs) advance, human-centered Assistive Technologies (ATs) for helping People with Visual Impairments (PVIs) are evolving into generalists, capable of performing multiple tasks simultaneously.
1 code implementation • 20 Sep 2024 • Jiale Wei, Junwei Zheng, Ruiping Liu, Jie Hu, Jiaming Zhang, Rainer Stiefelhagen
This work advances BEV semantic mapping in autonomous driving, paving the way for more advanced and reliable autonomous systems.
1 code implementation • 20 Sep 2024 • Alexander Jaus, Simon Reiß, Jens Kleesiek, Rainer Stiefelhagen
In this work, we describe our approach to compete in the autoPET3 datacentric track.
1 code implementation • 6 Aug 2024 • Jonas Schmitt, Ruiping Liu, Junwei Zheng, Jiaming Zhang, Rainer Stiefelhagen
Extensive experiments demonstrate the generalizability of our framework, encompassing both convolutional neural network (CNN) and transformer models, as well as image classification and segmentation tasks.
1 code implementation • 31 Jul 2024 • Stéphane Vujasinović, Stefan Becker, Sebastian Bullinger, Norbert Scherer-Negenborn, Michael Arens, Rainer Stiefelhagen
In this paper, we introduce a variant of video object segmentation (VOS) that bridges interactive and semi-automatic approaches, termed Lazy Video Object Segmentation (ziVOS).
1 code implementation • 8 Jul 2024 • Alexander Jaus, Constantin Seibold, Simon Reiß, Lukas Heine, Anton Schily, Moon Kim, Fin Hendrik Bahnsen, Ken Herrmann, Rainer Stiefelhagen, Jens Kleesiek
Pathological structures in medical images are typically deviations from the expected anatomy of a patient.
1 code implementation • 2 Jul 2024 • Junwei Zheng, Ruiping Liu, Yufan Chen, Kunyu Peng, Chengzhi Wu, Kailun Yang, Jiaming Zhang, Rainer Stiefelhagen
To tackle this problem, in this work, we define a new task termed Open Panoramic Segmentation (OPS), where models are trained with FoV-restricted pinhole images in the source domain in an open-vocabulary setting while evaluated with FoV-open panoramic images in the target domain, enabling the zero-shot open panoramic semantic segmentation ability of models.
1 code implementation • 2 Jul 2024 • Yihong Cao, Jiaming Zhang, Hao Shi, Kunyu Peng, Yuhongxuan Zhang, Hui Zhang, Rainer Stiefelhagen, Kailun Yang
Our method achieves state-of-the-art performance on the BlendPASS dataset, reaching a remarkable mAPQ of 26.58% and mIoU of 43.66%.
1 code implementation • 2 Jul 2024 • Kunyu Peng, Jia Fu, Kailun Yang, Di Wen, Yufan Chen, Ruiping Liu, Junwei Zheng, Jiaming Zhang, M. Saquib Sarfraz, Rainer Stiefelhagen, Alina Roitberg
Since these existing methods underperform on RAVAR, we introduce RefAtomNet -- a novel cross-stream attention-driven method specialized for the unique challenges of RAVAR: the need to interpret a textual referring expression for the targeted individual, utilize this reference to guide the spatial localization and harvest the prediction of the atomic actions for the referring person.
1 code implementation • 14 Jun 2024 • Tu Anh Dinh, Carlos Mullov, Leonard Bärmann, Zhaolin Li, Danni Liu, Simon Reiß, Jueun Lee, Nathan Lerzer, Fabian Ternava, Jianfeng Gao, Tobias Röddiger, Alexander Waibel, Tamim Asfour, Michael Beigl, Rainer Stiefelhagen, Carsten Dachsbacher, Klemens Böhm, Jan Niehues
We evaluate the performance of various state-of-the-art LLMs on our new benchmark.
no code implementations • 29 May 2024 • Omar Moured, Shahid Ali Farooqui, Karin Muller, Sharifeh Fadaeijouybari, Thorsten Schwarz, Mohammed Javed, Rainer Stiefelhagen
We address this challenge by retrieving high-quality alt-texts from similar chart images, serving as a reference for the user when creating alt-texts.
1 code implementation • 29 May 2024 • Omar Moured, Sara Alzalabny, Anas Osman, Thorsten Schwarz, Karin Muller, Rainer Stiefelhagen
Visualizations, such as charts, are crucial for interpreting complex data.
no code implementations • 29 May 2024 • David Wilkening, Omar Moured, Thorsten Schwarz, Karin Muller, Rainer Stiefelhagen
Exam documents are essential educational materials for exam preparation.
no code implementations • 22 May 2024 • Omar Moured, Jiaming Zhang, M. Saquib Sarfraz, Rainer Stiefelhagen
Chart summarization is a crucial task for blind and visually impaired individuals as it is their primary means of accessing and interpreting graphical data.
no code implementations • 2 Apr 2024 • Zdravko Marinov, Moon Kim, Jens Kleesiek, Rainer Stiefelhagen
In an initial user study involving four annotators, we assess existing robot users using our proposed metrics and find that robot users significantly deviate in performance and annotation behavior compared to real annotators.
no code implementations • CVPR 2024 • Yufan Chen, Jiaming Zhang, Kunyu Peng, Junwei Zheng, Ruiping Liu, Philip Torr, Rainer Stiefelhagen
To address this, we are the first to introduce a robustness benchmark for DLA models, which includes 450K document images of three datasets.
1 code implementation • 15 Mar 2024 • Yi Xu, Kunyu Peng, Di Wen, Ruiping Liu, Junwei Zheng, Yufan Chen, Jiaming Zhang, Alina Roitberg, Kailun Yang, Rainer Stiefelhagen
In this study, we bridge this gap by implementing a framework that augments well-established skeleton-based human action recognition methods with label-denoising strategies from various research areas to serve as the initial benchmark.
2 code implementations • 28 Feb 2024 • Jiacheng Lin, Jiajun Chen, Kunyu Peng, Xuan He, Zhiyong Li, Rainer Stiefelhagen, Kailun Yang
This paper introduces the task of Auditory Referring Multi-Object Tracking (AR-MOT), which dynamically tracks specific objects in a video sequence based on audio expressions and appears as a challenging problem in autonomous driving.
1 code implementation • 30 Jan 2024 • Ruiping Liu, Jiaming Zhang, Kunyu Peng, Yufan Chen, Ke Cao, Junwei Zheng, M. Saquib Sarfraz, Kailun Yang, Rainer Stiefelhagen
Integrating information from multiple modalities enhances the robustness of scene perception systems in autonomous vehicles, providing a more comprehensive and reliable sensory framework.
no code implementations • 13 Dec 2023 • Florian Fervers, Sebastian Bullinger, Christoph Bodensteiner, Michael Arens, Rainer Stiefelhagen
To find the geolocation of a street-view image, cross-view geolocalization (CVGL) methods typically perform image retrieval on a database of georeferenced aerial images and determine the location from the visually most similar match.
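The retrieval step described above can be sketched in a few lines. This is only an illustrative toy under stated assumptions, not the authors' method: the embeddings are taken as given (a real system would compute them with learned encoders), cosine similarity is assumed as the similarity measure, and the names `cosine` and `geolocate_by_retrieval` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def geolocate_by_retrieval(query_emb, ref_embs, ref_coords):
    """Return the coordinates of the most similar reference image and its score.

    query_emb  : embedding of the street-view image (assumed precomputed)
    ref_embs   : embeddings of the georeferenced aerial images
    ref_coords : (lat, lon) of each aerial image, same order as ref_embs
    """
    sims = [cosine(query_emb, r) for r in ref_embs]
    best = max(range(len(sims)), key=sims.__getitem__)
    return ref_coords[best], sims[best]


# Toy database with three aerial tiles and made-up 2-D embeddings.
ref_embs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
ref_coords = [(49.01, 8.40), (48.78, 9.18), (52.52, 13.40)]

coords, score = geolocate_by_retrieval([0.1, 0.9], ref_embs, ref_coords)
print(coords, round(score, 3))
```

The exhaustive scan is only practical for small databases; large search regions would typically need an approximate nearest-neighbor index instead.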
1 code implementation • 11 Dec 2023 • Kunyu Peng, Cheng Yin, Junwei Zheng, Ruiping Liu, David Schneider, Jiaming Zhang, Kailun Yang, M. Saquib Sarfraz, Rainer Stiefelhagen, Alina Roitberg
In real-world scenarios, human actions often fall outside the distribution of training data, making it crucial for models to recognize known actions and reject unknown ones.
1 code implementation • 24 Nov 2023 • Matthias Hadlich, Zdravko Marinov, Moon Kim, Enrico Nasca, Jens Kleesiek, Rainer Stiefelhagen
Deep learning has revolutionized the accurate segmentation of diseases in medical imaging.
no code implementations • 23 Nov 2023 • Zdravko Marinov, Paul F. Jäger, Jan Egger, Jens Kleesiek, Rainer Stiefelhagen
Interactive segmentation is a crucial research area in medical image analysis aiming to boost the efficiency of costly annotations by incorporating human feedback.
1 code implementation • 10 Nov 2023 • Calvin Tanama, Kunyu Peng, Zdravko Marinov, Rainer Stiefelhagen, Alina Roitberg
The framework enhances 3D MobileNet, a neural architecture optimized for speed in video classification, by incorporating knowledge distillation and model quantization to balance model accuracy and computational efficiency.
1 code implementation • 4 Oct 2023 • Hao Shi, Chengshan Pang, Jiaming Zhang, Kailun Yang, Yuhao Wu, Huajian Ni, Yining Lin, Rainer Stiefelhagen, Kaiwei Wang
Roadside camera-driven 3D object detection is a crucial task in intelligent transportation systems, which extends the perception range beyond the limitations of vision-centric vehicles and enhances road safety.
Ranked #2 on 3D Object Detection on Rope3D
1 code implementation • 21 Sep 2023 • Yiping Wei, Kunyu Peng, Alina Roitberg, Jiaming Zhang, Junwei Zheng, Ruiping Liu, Yufan Chen, Kailun Yang, Rainer Stiefelhagen
These works overlooked the differences in performance among modalities, which led to the propagation of erroneous knowledge between modalities, while only three fundamental modalities, i.e., joints, bones, and motions, are used, hence no additional modalities are explored.
1 code implementation • 21 Sep 2023 • Matthias Hadlich, Zdravko Marinov, Rainer Stiefelhagen
Tumor segmentation in medical imaging is crucial and relies on precise delineation.
1 code implementation • 21 Sep 2023 • Yifei Chen, Kunyu Peng, Alina Roitberg, David Schneider, Jiaming Zhang, Junwei Zheng, Ruiping Liu, Yufan Chen, Kailun Yang, Rainer Stiefelhagen
To empower models with the capacity to address occlusion, we propose a simple and effective method.
1 code implementation • 30 Aug 2023 • Jianning Li, Zongwei Zhou, Jiancheng Yang, Antonio Pepe, Christina Gsaxner, Gijs Luijten, Chongyu Qu, Tiezheng Zhang, Xiaoxi Chen, Wenxuan Li, Marek Wodzinski, Paul Friedrich, Kangxian Xie, Yuan Jin, Narmada Ambigapathy, Enrico Nasca, Naida Solak, Gian Marco Melito, Viet Duc Vu, Afaque R. Memon, Christopher Schlachta, Sandrine de Ribaupierre, Rajnikant Patel, Roy Eagleson, Xiaojun Chen, Heinrich Mächler, Jan Stefan Kirschke, Ezequiel de la Rosa, Patrick Ferdinand Christ, Hongwei Bran Li, David G. Ellis, Michele R. Aizenberg, Sergios Gatidis, Thomas Küstner, Nadya Shusharina, Nicholas Heller, Vincent Andrearczyk, Adrien Depeursinge, Mathieu Hatt, Anjany Sekuboyina, Maximilian Löffler, Hans Liebl, Reuben Dorent, Tom Vercauteren, Jonathan Shapey, Aaron Kujawa, Stefan Cornelissen, Patrick Langenhuizen, Achraf Ben-Hamadou, Ahmed Rekik, Sergi Pujades, Edmond Boyer, Federico Bolelli, Costantino Grana, Luca Lumetti, Hamidreza Salehi, Jun Ma, Yao Zhang, Ramtin Gharleghi, Susann Beier, Arcot Sowmya, Eduardo A. Garza-Villarreal, Thania Balducci, Diego Angeles-Valdez, Roberto Souza, Leticia Rittner, Richard Frayne, Yuanfeng Ji, Vincenzo Ferrari, Soumick Chatterjee, Florian Dubost, Stefanie Schreiber, Hendrik Mattern, Oliver Speck, Daniel Haehn, Christoph John, Andreas Nürnberger, João Pedrosa, Carlos Ferreira, Guilherme Aresta, António Cunha, Aurélio Campilho, Yannick Suter, Jose Garcia, Alain Lalande, Vicky Vandenbossche, Aline Van Oevelen, Kate Duquesne, Hamza Mekhzoum, Jef Vandemeulebroucke, Emmanuel Audenaert, Claudia Krebs, Timo Van Leeuwen, Evie Vereecke, Hauke Heidemeyer, Rainer Röhrig, Frank Hölzle, Vahid Badeli, Kathrin Krieger, Matthias Gunzer, Jianxu Chen, Timo van Meegdenburg, Amin Dada, Miriam Balzer, Jana Fragemann, Frederic Jonske, Moritz Rempe, Stanislav Malorodov, Fin H. Bahnsen, Constantin Seibold, Alexander Jaus, Zdravko Marinov, Paul F. Jaeger, Rainer Stiefelhagen, Ana Sofia Santos, Mariana Lindo, André Ferreira, Victor Alves, Michael Kamp, Amr Abourayya, Felix Nensa, Fabian Hörst, Alexander Brehmer, Lukas Heine, Yannik Hanusrichter, Martin Weßling, Marcel Dudda, Lars E. Podleska, Matthias A. Fink, Julius Keyl, Konstantinos Tserpes, Moon-Sung Kim, Shireen Elhabian, Hans Lamecker, Dženan Zukić, Beatriz Paniagua, Christian Wachinger, Martin Urschler, Luc Duong, Jakob Wasserthal, Peter F. Hoyer, Oliver Basu, Thomas Maal, Max J. H. Witjes, Gregor Schiele, Ti-chiun Chang, Seyed-Ahmad Ahmadi, Ping Luo, Bjoern Menze, Mauricio Reyes, Thomas M. Deserno, Christos Davatzikos, Behrus Puladi, Pascal Fua, Alan L. Yuille, Jens Kleesiek, Jan Egger
For the medical domain, we present a large collection of anatomical shapes (e.g., bones, organs, vessels) and 3D models of surgical instruments, called MedShapeNet, created to facilitate the translation of data-driven vision algorithms to medical applications and to adapt SOTA vision algorithms to medical problems.
1 code implementation • 23 Aug 2023 • Hejun Xiao, Kunyu Peng, Xiangsheng Huang, Alina Roitberg, Hao Li, Zhaohui Wang, Rainer Stiefelhagen
In this paper, we introduce a privacy-supporting solution that makes the RGB-trained model applicable in depth domain and utilizes depth data at test time for fall detection.
no code implementations • 31 Jul 2023 • Walter Morales-Alvarez, Novel Certad, Alina Roitberg, Rainer Stiefelhagen, Cristina Olaverri-Monreal
For driver observation frameworks, clean datasets collected in controlled simulated environments often serve as the initial training ground.
2 code implementations • 28 Jul 2023 • Fei Teng, Jiaming Zhang, Kunyu Peng, Yaonan Wang, Rainer Stiefelhagen, Kailun Yang
To simultaneously streamline the redundant information from the light field cameras and avoid feature loss during network propagation, we present a simple yet very effective Sub-Aperture Fusion Module (SAFM).
1 code implementation • 25 Jul 2023 • Alexander Jaus, Constantin Seibold, Kelsey Hermann, Alexandra Walter, Kristina Giske, Johannes Haubold, Jens Kleesiek, Rainer Stiefelhagen
We examine its plausibility and usefulness using three complementary checks: Human expert evaluation which approved the dataset, a Deep Learning usefulness benchmark on the BTCV dataset in which we achieve 85% dice score without using its training dataset, and medical validity checks.
no code implementations • 15 Jul 2023 • Ke Cao, Ruiping Liu, Ze Wang, Kunyu Peng, Jiaming Zhang, Junwei Zheng, Zhifeng Teng, Kailun Yang, Rainer Stiefelhagen
On the other hand, the entire line segment detected by the visual subsystem overcomes the limitation of the LiDAR subsystem, which can only perform the local calculation for geometric features.
1 code implementation • 15 Jul 2023 • Ruiping Liu, Jiaming Zhang, Kunyu Peng, Junwei Zheng, Ke Cao, Yufan Chen, Kailun Yang, Rainer Stiefelhagen
Grounded Situation Recognition (GSR) is capable of recognizing and interpreting visual scenes in a contextually intuitive way, yielding salient activities (verbs) and the involved entities (roles) depicted in images.
1 code implementation • 5 Jul 2023 • Omar Moured, Jiaming Zhang, Alina Roitberg, Thorsten Schwarz, Rainer Stiefelhagen
The digitization of documents allows for wider accessibility and reproducibility.
1 code implementation • 6 Jun 2023 • Constantin Seibold, Alexander Jaus, Matthias A. Fink, Moon Kim, Simon Reiß, Ken Herrmann, Jens Kleesiek, Rainer Stiefelhagen
Results: Our resulting segmentation models demonstrated remarkable performance on CXR, with a high average model-annotator agreement between two radiologists with mIoU scores of 0.93 and 0.85 for frontal and lateral anatomy, while inter-annotator agreement remained at 0.95 and 0.83 mIoU.
1 code implementation • 22 May 2023 • Stéphane Vujasinović, Sebastian Bullinger, Stefan Becker, Norbert Scherer-Negenborn, Michael Arens, Rainer Stiefelhagen
We present READMem (Robust Embedding Association for a Diverse Memory), a modular framework for semi-automatic video object segmentation (sVOS) methods designed to handle unconstrained videos.
2 code implementations • 15 May 2023 • Kunyu Peng, Di Wen, David Schneider, Jiaming Zhang, Kailun Yang, M. Saquib Sarfraz, Rainer Stiefelhagen, Alina Roitberg
In this work, we focus on Few-Shot Domain Adaptation for Activity Recognition (FSDA-AR), which leverages a very small amount of labeled target videos to achieve effective adaptation.
1 code implementation • 24 Mar 2023 • Hao Shi, Yu Li, Kailun Yang, Jiaming Zhang, Kunyu Peng, Alina Roitberg, Yaozu Ye, Huajian Ni, Kaiwei Wang, Rainer Stiefelhagen
This paper raises the new task of Fisheye Semantic Completion (FSC), where dense texture, structure, and semantics of a fisheye image are inferred even beyond the sensor field-of-view (FoV).
1 code implementation • 21 Mar 2023 • Zhifeng Teng, Jiaming Zhang, Kailun Yang, Kunyu Peng, Hao Shi, Simon Reiß, Ke Cao, Rainer Stiefelhagen
Seeing only a tiny part of the whole is not knowing the full circumstance.
no code implementations • 13 Mar 2023 • Zdravko Marinov, Rainer Stiefelhagen, Jens Kleesiek
To address this, we conduct a comparative study of existing guidance signals by training interactive models with different signals and parameter settings to identify crucial parameters for the model's design.
no code implementations • 13 Mar 2023 • Zdravko Marinov, Simon Reiß, David Kersting, Jens Kleesiek, Rainer Stiefelhagen
Positron Emission Tomography (PET) and Computed Tomography (CT) are routinely used together to detect tumors.
1 code implementation • 2 Mar 2023 • Kunyu Peng, David Schneider, Alina Roitberg, Kailun Yang, Jiaming Zhang, Chen Deng, Kaiyu Zhang, M. Saquib Sarfraz, Rainer Stiefelhagen
To this intent, we provide the MuscleMap dataset featuring >15K video clips with 135 different activities and 20 labeled muscle groups.
1 code implementation • CVPR 2023 • Jiaming Zhang, Ruiping Liu, Hao Shi, Kailun Yang, Simon Reiß, Kunyu Peng, Haodong Fu, Kaiwei Wang, Rainer Stiefelhagen
To make this possible, we present the arbitrary cross-modal segmentation model CMNeXt.
Ranked #1 on Semantic Segmentation on Porto
1 code implementation • 28 Feb 2023 • Junwei Zheng, Jiaming Zhang, Kailun Yang, Kunyu Peng, Rainer Stiefelhagen
People with Visual Impairments (PVI) typically recognize objects through haptic perception.
1 code implementation • 24 Jan 2023 • Verena Jasmin Hallitschke, Tobias Schlumberger, Philipp Kataliakos, Zdravko Marinov, Moon Kim, Lars Heiliger, Constantin Seibold, Jens Kleesiek, Rainer Stiefelhagen
Recently, deep learning enabled the accurate segmentation of various diseases in medical imaging.
1 code implementation • CVPR 2023 • Simon Reiß, Constantin Seibold, Alexander Freytag, Erik Rodner, Rainer Stiefelhagen
A vast amount of images and pixel-wise annotations allowed our community to build scalable segmentation solutions for natural domains.
no code implementations • CVPR 2023 • Florian Fervers, Sebastian Bullinger, Christoph Bodensteiner, Michael Arens, Rainer Stiefelhagen
This paper proposes a novel method for vision-based metric cross-view geolocalization (CVGL) that matches the camera images captured from a ground-based vehicle with an aerial image to determine the vehicle's geo-pose.
1 code implementation • 23 Oct 2022 • Zeyun Zhong, David Schneider, Michael Voit, Rainer Stiefelhagen, Jürgen Beyerer
Although human action anticipation is a task which is inherently multi-modal, state-of-the-art methods on well known action anticipation datasets leverage this data by applying ensemble methods and averaging scores of unimodal anticipation networks.
Ranked #3 on Action Anticipation on EPIC-KITCHENS-100 (test)
no code implementations • 7 Oct 2022 • Constantin Seibold, Simon Reiß, Saquib Sarfraz, Matthias A. Fink, Victoria Mayer, Jan Sellner, Moon Sung Kim, Klaus H. Maier-Hein, Jens Kleesiek, Rainer Stiefelhagen
To exploit anatomical structures in this scenario, we present a sophisticated automatic pipeline to gather and integrate human bodily structures from computed tomography datasets, which we incorporate in our PAXRay: A Projected dataset for the segmentation of Anatomical structures in X-Ray data.
no code implementations • 2 Sep 2022 • Lars Heiliger, Zdravko Marinov, Max Hasin, André Ferreira, Jana Fragemann, Kelsey Pomykala, Jacob Murray, David Kersting, Victor Alves, Rainer Stiefelhagen, Jan Egger, Jens Kleesiek
Tumor volume and changes in tumor characteristics over time are important biomarkers for cancer therapy.
no code implementations • 19 Aug 2022 • Zdravko Marinov, Alina Roitberg, David Schneider, Rainer Stiefelhagen
Modality selection is an important step when designing multimodal systems, especially in the case of cross-domain activity recognition as certain modalities are more robust to domain shift than others.
1 code implementation • 3 Aug 2022 • Zdravko Marinov, David Schneider, Alina Roitberg, Rainer Stiefelhagen
We tackle this challenge and introduce an activity domain generation framework which creates novel ADL appearances (novel domains) from different existing activity modalities (source domains) inferred from video training data.
1 code implementation • 25 Jul 2022 • Jiaming Zhang, Kailun Yang, Hao Shi, Simon Reiß, Kunyu Peng, Chaoxiang Ma, Haodong Fu, Philip H. S. Torr, Kaiwei Wang, Rainer Stiefelhagen
In this paper, we address panoramic semantic segmentation which is under-explored due to two critical challenges: (1) image distortions and object deformations on panoramas; (2) lack of semantic annotations in the 360° imagery.
Ranked #1 on Semantic Segmentation on SynPASS
1 code implementation • 13 Jul 2022 • Chang Chen, Jiaming Zhang, Kailun Yang, Kunyu Peng, Rainer Stiefelhagen
Humans have an innate ability to sense their surroundings, as they can extract the spatial representation from the egocentric perception and form an allocentric semantic map via spatial transformation and memory updating.
1 code implementation • 13 Jul 2022 • Ping-Cheng Wei, Kunyu Peng, Alina Roitberg, Kailun Yang, Jiaming Zhang, Rainer Stiefelhagen
Failure to timely diagnose and effectively treat depression leads to over 280 million people suffering from this psychological disorder worldwide.
1 code implementation • 21 Jun 2022 • Alexander Jaus, Kailun Yang, Rainer Stiefelhagen
In order to overcome the lack of annotated panoramic images, we propose a framework which allows model training on standard pinhole images and transfers the learned features to the panoramic domain in a cost-minimizing way.
no code implementations • 14 May 2022 • Constantin Seibold, Simon Reiß, M. Saquib Sarfraz, Rainer Stiefelhagen, Jens Kleesiek
We show that despite using unstructured medical report supervision, we perform on par with direct label supervision through a sophisticated inference setting.
Ranked #2 on Thoracic Disease Classification on ChestX-ray14
no code implementations • 29 Apr 2022 • Lukas Scholch, Jonas Steinhauser, Maximilian Beichter, Constantin Seibold, Kailun Yang, Merlin Knäble, Thorsten Schwarz, Alexander Mädche, Rainer Stiefelhagen
In this work, we propose a synthetic dataset, containing SVCs in the form of images as well as ground truths.
no code implementations • 10 Apr 2022 • Alina Roitberg, Kunyu Peng, David Schneider, Kailun Yang, Marios Koulakis, Manuel Martinez, Rainer Stiefelhagen
In this work, we examine for the first time how well the confidence values of modern driver observation models match the probability of the correct outcome and show that raw neural network-based approaches tend to significantly overestimate their prediction quality.
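One common way to quantify how well confidence values match the probability of a correct outcome is the expected calibration error (ECE). The sketch below is a generic illustration, not necessarily the metric or protocol used in the paper: the equal-width binning, the default of 10 bins, and the function name `expected_calibration_error` are all assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted mean gap between confidence and accuracy over equal-width bins.

    confidences : predicted confidence per sample, each in [0, 1]
    correct     : True/False per sample, whether the prediction was right
    n_bins      : number of equal-width confidence bins (assumed default)
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece


# An overconfident toy model: always 90% confident, right only half the time.
overconfident = expected_calibration_error(
    [0.9, 0.9, 0.9, 0.9], [True, True, False, False]
)
print(round(overconfident, 3))
```

A value of 0 means that confidence and accuracy agree in every occupied bin; larger values indicate a bigger mismatch, such as the overestimation described above.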
no code implementations • 10 Apr 2022 • Alina Roitberg, Kunyu Peng, Zdravko Marinov, Constantin Seibold, David Schneider, Rainer Stiefelhagen
Visual recognition inside the vehicle cabin leads to safer driving and more intuitive human-vehicle interaction but such systems face substantial obstacles as they need to capture different granularities of driver behaviour while dealing with highly limited body visibility and changing illumination.
no code implementations • 3 Apr 2022 • Wenyan Ou, Jiaming Zhang, Kunyu Peng, Kailun Yang, Gerhard Jaworek, Karin Müller, Rainer Stiefelhagen
Then, poses and speed of tracked dynamic objects can be estimated, which are passed to the users through acoustic feedback.
1 code implementation • CVPR 2022 • M. Saquib Sarfraz, Marios Koulakis, Constantin Seibold, Rainer Stiefelhagen
Dimensionality reduction is crucial both for visualization and preprocessing high dimensional data for machine learning.
Ranked #3 on Data Augmentation on GA1457
1 code implementation • 19 Mar 2022 • Xinyu Luo, Jiaming Zhang, Kailun Yang, Alina Roitberg, Kunyu Peng, Rainer Stiefelhagen
Autonomous vehicles utilize urban scene segmentation to understand the real world like a human and react accordingly.
Ranked #1 on Semantic Segmentation on DADA-seg (using extra training data)
1 code implementation • 17 Mar 2022 • Qing Wang, Jiaming Zhang, Kailun Yang, Kunyu Peng, Rainer Stiefelhagen
While detector-based methods coupled with feature descriptors struggle in low-texture scenes, CNN-based methods with a sequential extract-to-match pipeline fail to make use of the matching capacity of the encoder and tend to overburden the decoder for matching.
1 code implementation • 9 Mar 2022 • Jiaming Zhang, Huayao Liu, Kailun Yang, Xinxin Hu, Ruiping Liu, Rainer Stiefelhagen
Pixel-wise semantic segmentation of RGB images can be advanced by exploiting complementary features from the supplementary modality (X-modality).
Ranked #1 on Image Manipulation Localization on CocoGlide
no code implementations • 7 Mar 2022 • Florian Fervers, Sebastian Bullinger, Christoph Bodensteiner, Michael Arens, Rainer Stiefelhagen
Our method is the first to utilize on-board cameras in an end-to-end differentiable model for metric self-localization on unseen orthophotos.
1 code implementation • 3 Mar 2022 • Stephane Vujasinovic, Sebastian Bullinger, Stefan Becker, Norbert Scherer-Negenborn, Michael Arens, Rainer Stiefelhagen
While current methods for interactive Video Object Segmentation (iVOS) rely on scribble-based interactions to generate precise object masks, we propose a Click-based interactive Video Object Segmentation (CiVOS) framework to simplify the required user workload as much as possible.
1 code implementation • 2 Mar 2022 • Kunyu Peng, Alina Roitberg, Kailun Yang, Jiaming Zhang, Rainer Stiefelhagen
This module operates in the latent feature space, enriching and diversifying the training set at the feature level in order to improve generalization to novel data appearances (e.g., sensor changes) and general feature quality.
1 code implementation • CVPR 2022 • Jiaming Zhang, Kailun Yang, Chaoxiang Ma, Simon Reiß, Kunyu Peng, Rainer Stiefelhagen
To get around this domain difference and bring together semantic annotations from pinhole- and 360-degree surround-visuals, we propose to learn object deformations and panoramic image distortions in the Deformable Patch Embedding (DPE) and Deformable MLP (DMLP) components which blend into our Transformer for PAnoramic Semantic Segmentation (Trans4PASS) model.
Ranked #2 on Semantic Segmentation on SynPASS
2 code implementations • 27 Feb 2022 • Ruiping Liu, Kailun Yang, Alina Roitberg, Jiaming Zhang, Kunyu Peng, Huayao Liu, Yaonan Wang, Rainer Stiefelhagen
Furthermore, we introduce two optimization modules to enhance the patch embedding distillation from different perspectives: (1) Global-Local Context Mixer (GL-Mixer) extracts both global and local information of a representative embedding; (2) Embedding Assistant (EA) acts as an embedding method to seamlessly bridge teacher and student models with the teacher's number of channels.
2 code implementations • 23 Feb 2022 • Kunyu Peng, Alina Roitberg, Kailun Yang, Jiaming Zhang, Rainer Stiefelhagen
Yet, the research of data-scarce recognition from skeleton sequences, such as one-shot action recognition, does not explicitly consider occlusions despite their everyday pervasiveness.
Ranked #1 on Action Classification on Toyota Smarthome dataset (Accuracy metric)
1 code implementation • 1 Feb 2022 • Kunyu Peng, Alina Roitberg, Kailun Yang, Jiaming Zhang, Rainer Stiefelhagen
To study this underresearched task, we introduce Vid2Burn -- an omni-source benchmark for estimating caloric expenditure from video data featuring both high- and low-intensity activities, for which we derive energy expenditure annotations based on models established in medical literature.
1 code implementation • 9 Dec 2021 • Jiaming Zhang, Kailun Yang, Rainer Stiefelhagen
Moreover, in order to evaluate the segmentation performance in traffic accidents, we provide a pixel-wise annotated accident dataset, namely DADA-seg, which contains a variety of critical scenarios from traffic accidents.
Ranked #3 on Semantic Segmentation on DADA-seg (using extra training data)
no code implementations • 1 Dec 2021 • Constantin Seibold, Simon Reiß, Jens Kleesiek, Rainer Stiefelhagen
Following this thought, we use a small number of labeled images as reference material and match pixels in an unlabeled image to the semantics of the best fitting pixel in a reference set.
1 code implementation • 30 Nov 2021 • Kunyu Peng, Alina Roitberg, David Schneider, Marios Koulakis, Kailun Yang, Rainer Stiefelhagen
Human affect recognition is a well-established research area with numerous applications, e.g., in psychological care, but existing methods assume that all emotions-of-interest are given a priori as annotated training examples.
1 code implementation • 3 Nov 2021 • Tobias Ringwald, Rainer Stiefelhagen
Unsupervised domain adaptation (UDA) deals with the problem of classifying unlabeled target domain data while labeled data is only available for a different source domain.
1 code implementation • 22 Oct 2021 • Tobias Ringwald, Rainer Stiefelhagen
Unsupervised domain adaptation (UDA) deals with the adaptation process of a model to an unlabeled target domain while annotated data is only available for a given source domain.
1 code implementation • 21 Oct 2021 • Jiaming Zhang, Chaoxiang Ma, Kailun Yang, Alina Roitberg, Kunyu Peng, Rainer Stiefelhagen
We look at this problem from the perspective of domain adaptation and bring panoramic semantic segmentation to a setting, where labelled training data originates from a different distribution of conventional pinhole camera images.
Ranked #7 on Semantic Segmentation on DensePASS (using extra training data)
1 code implementation • 20 Aug 2021 • Jiaming Zhang, Kailun Yang, Angela Constantinescu, Kunyu Peng, Karin Müller, Rainer Stiefelhagen
In this paper, we build a wearable system with a novel dual-head Transformer for Transparency (Trans4Trans) perception model, which can segment general and transparent objects.
Ranked #2 on Semantic Segmentation on DADA-seg (using extra training data)
1 code implementation • 16 Aug 2021 • Haobin Tan, Chang Chen, Xinyu Luo, Jiaming Zhang, Constantin Seibold, Kailun Yang, Rainer Stiefelhagen
By recognizing the color of pedestrian traffic lights, our prototype can help the user to cross a street safely.
1 code implementation • 13 Aug 2021 • Chaoxiang Ma, Jiaming Zhang, Kailun Yang, Alina Roitberg, Rainer Stiefelhagen
First, we formalize the task of unsupervised domain adaptation for panoramic semantic segmentation, where a network trained on labelled examples from the source domain of pinhole camera data is deployed in a different target domain of panoramic images, for which no labels are available.
1 code implementation • 12 Jul 2021 • Alina Roitberg, David Schneider, Aulia Djamal, Constantin Seibold, Simon Reiß, Rainer Stiefelhagen
Recognizing Activities of Daily Living (ADL) is a vital process for intelligent assistive robots, but collecting large annotated datasets requires time-consuming temporal labeling and raises privacy concerns, e.g., if the data is collected in a real household.
1 code implementation • 7 Jul 2021 • Jiaming Zhang, Kailun Yang, Angela Constantinescu, Kunyu Peng, Karin Müller, Rainer Stiefelhagen
Common fully glazed facades and transparent objects present architectural barriers and impede the mobility of people with low vision or blindness; for instance, a path detected behind a glass door is inaccessible unless it is correctly perceived and reacted to.
Ranked #1 on Semantic Segmentation on Trans10K
no code implementations • 7 Jul 2021 • Huayao Liu, Ruiping Liu, Kailun Yang, Jiaming Zhang, Kunyu Peng, Rainer Stiefelhagen
To tackle these issues, we propose HIDA, a lightweight assistive system based on 3D point cloud instance segmentation with a solid-state LiDAR sensor, for holistic indoor detection and avoidance.
Ranked #19 on 3D Instance Segmentation on ScanNet(v2)
1 code implementation • 1 Jul 2021 • Kunyu Peng, Juncong Fei, Kailun Yang, Alina Roitberg, Jiaming Zhang, Frank Bieder, Philipp Heidenreich, Christoph Stiller, Rainer Stiefelhagen
At the heart of all automated driving systems is the ability to sense the surroundings, e.g., through semantic segmentation of LiDAR sequences, which has experienced remarkable progress due to the release of large datasets such as SemanticKITTI and nuScenes-LidarSeg.
1 code implementation • 27 May 2021 • Zdravko Marinov, Stanka Vasileva, Qing Wang, Constantin Seibold, Jiaming Zhang, Rainer Stiefelhagen
Our framework provides the functionality to control the movement of the drone with simple arm gestures and to follow the user while keeping a safe distance.
no code implementations • CVPR 2021 • Simon Reiß, Constantin Seibold, Alexander Freytag, Erik Rodner, Rainer Stiefelhagen
Pixel-wise segmentation is one of the most data- and annotation-hungry tasks in our field.
1 code implementation • CVPR 2021 • M. Saquib Sarfraz, Naila Murray, Vivek Sharma, Ali Diba, Luc van Gool, Rainer Stiefelhagen
Action segmentation refers to inferring boundaries of semantically consistent visual concepts in videos and is an important requirement for many video understanding tasks.
Ranked #1 on Action Segmentation on MPII Cooking 2 Dataset
1 code implementation • CVPR 2021 • Kailun Yang, Jiaming Zhang, Simon Reiß, Xinxin Hu, Rainer Stiefelhagen
Convolutional Networks (ConvNets) excel at semantic segmentation and have become a vital component for perception in autonomous driving.
Ranked #10 on Semantic Segmentation on DensePASS (using extra training data)
no code implementations • 6 Mar 2021 • Yingzhi Zhang, Haoye Chen, Kailun Yang, Jiaming Zhang, Rainer Stiefelhagen
As scene information, including objectness and scene type, is important for people with visual impairment, in this work we present a multi-task efficient perception system for the scene parsing and recognition tasks.
1 code implementation • 6 Mar 2021 • Wei Mao, Jiaming Zhang, Kailun Yang, Rainer Stiefelhagen
Based on Lintention, we then devise a novel panoptic segmentation model which we term Panoptic Lintention Net.
1 code implementation • 1 Mar 2021 • Alexander Jaus, Kailun Yang, Rainer Stiefelhagen
In order to overcome the lack of annotated panoramic images, we propose a framework which allows model training on standard pinhole images and transfers the learned features to a different domain.
1 code implementation • 1 Mar 2021 • Shuo Chen, Kailun Yang, Rainer Stiefelhagen
Street scene change detection continues to capture researchers' interests in the computer vision community.
no code implementations • 2 Feb 2021 • Constantin Seibold, Matthias A. Fink, Charlotte Goos, Hans-Ulrich Kauczor, Heinz-Peter Schlemmer, Rainer Stiefelhagen, Jens Kleesiek
Detector-based spectral computed tomography is a recent dual-energy CT (DECT) technology that offers the possibility of obtaining spectral information.
1 code implementation • WACV 2021 • Tobias Ringwald, Rainer Stiefelhagen
Unsupervised domain adaptation (UDA) deals with the adaptation process of a given source domain with labeled training data to a target domain for which only unannotated data is available.
no code implementations • 2 Jan 2021 • Alina Roitberg, Monica Haurilet, Manuel Martinez, Rainer Stiefelhagen
While temperature scaling alone drastically improves the reliability of the confidence values, our CARING method consistently leads to the best uncertainty estimates in all benchmark settings.
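For readers unfamiliar with the baseline named above, temperature scaling divides a classifier's logits by a scalar T before the softmax. The following is a minimal sketch of that baseline only; it is not the CARING method, and the fixed value T = 2.0 is a hypothetical choice for illustration (in practice T is usually fitted on held-out validation data).

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature T > 0.

    T = 1 leaves the distribution unchanged; T > 1 flattens it, which
    lowers the confidence of the top prediction without changing which
    class is predicted.
    """
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.1]
    raw = softmax(logits)            # uncalibrated confidences
    calibrated = softmax(logits, temperature=2.0)  # T = 2.0 is an assumed value
    print(max(raw), max(calibrated))
```

Because dividing by a positive scalar preserves the ordering of the logits, the predicted class is unchanged and only the confidence values move.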
no code implementations • ICCV 2021 • Ali Diba, Vivek Sharma, Reza Safdari, Dariush Lotfi, Saquib Sarfraz, Rainer Stiefelhagen, Luc van Gool
In this paper, we introduce a novel self-supervised visual representation learning method which understands both images and videos in a joint learning fashion.
no code implementations • 2 Nov 2020 • Robin Ruede, Verena Heusser, Lukas Frank, Alina Roitberg, Monica Haurilet, Rainer Stiefelhagen
Our experiments demonstrate clear benefits of multi-task learning for calorie estimation, surpassing the single-task calorie regression by 9.9%.
1 code implementation • 30 Sep 2020 • Constantin Seibold, Jens Kleesiek, Heinz-Peter Schlemmer, Rainer Stiefelhagen
In this paper, we address the problem of weakly supervised identification and localization of abnormalities in chest radiographs.
1 code implementation • 14 Sep 2020 • Tobias Ringwald, Rainer Stiefelhagen
Unsupervised domain adaptation (UDA) deals with the adaptation of models from a given source domain with labeled data to an unlabeled target domain.
1 code implementation • 20 Aug 2020 • Jiaming Zhang, Kailun Yang, Rainer Stiefelhagen
Ensuring the safety of all traffic participants is a prerequisite for bringing intelligent vehicles closer to practical applications.
Ranked #6 on Semantic Segmentation on KITTI-360
1 code implementation • 19 Aug 2020 • Alexander Wolpert, Michael Teutsch, M. Saquib Sarfraz, Rainer Stiefelhagen
In this way, we can both simplify the network architecture and achieve higher detection performance, especially for pedestrians under occlusion or at low object resolution.
no code implementations • 20 Jul 2020 • Wei Mao, Jiaming Zhang, Kailun Yang, Rainer Stiefelhagen
Navigational perception for visually impaired people has been substantially promoted by both classic and deep learning based segmentation methods.
no code implementations • 25 Apr 2020 • Amine Kechaou, Manuel Martinez, Monica Haurilet, Rainer Stiefelhagen
At each iteration, our decoder focuses on the relevant parts of the image using an attention mechanism, and then estimates the object's class and the bounding box coordinates.
1 code implementation • 5 Apr 2020 • Vivek Sharma, Makarand Tapaswi, Rainer Stiefelhagen
True understanding of videos comes from a joint analysis of all its modalities: the video frames, the audio track, and any accompanying text such as closed captions.
no code implementations • 5 Apr 2020 • Vivek Sharma, Makarand Tapaswi, M. Saquib Sarfraz, Rainer Stiefelhagen
We demonstrate our method on the challenging task of learning representations for video face clustering.
1 code implementation • 17 Sep 2019 • Kailun Yang, Xinxin Hu, Hao Chen, Kaite Xiang, Kaiwei Wang, Rainer Stiefelhagen
Semantically interpreting the traffic scene is crucial for autonomous transportation and robotics systems.
Ranked #35 on Semantic Segmentation on DensePASS
1 code implementation • 1 Aug 2019 • M. Saquib Sarfraz, Constantin Seibold, Haroon Khalid, Rainer Stiefelhagen
In this paper, we propose a novel method of computing the loss directly between the source and target images that enables proper distillation of shape/content and colour/style.
no code implementations • ICCV 2019 • Ali Diba, Vivek Sharma, Luc van Gool, Rainer Stiefelhagen
With these overall objectives, we introduce a novel unified spatio-temporal 3D-CNN architecture (DynamoNet) that jointly optimizes video classification and motion representation learning by predicting future frames as a multi-task learning problem.
1 code implementation • ECCV 2020 • Ali Diba, Mohsen Fayyaz, Vivek Sharma, Manohar Paluri, Jurgen Gall, Rainer Stiefelhagen, Luc van Gool
HVU is organized hierarchically in a semantic taxonomy that focuses on multi-label and multi-task video understanding as a comprehensive problem that encompasses the recognition of multiple semantic aspects in the dynamic scene.
Ranked #13 on Action Recognition on UCF101
1 code implementation • 3 Mar 2019 • Vivek Sharma, Makarand Tapaswi, M. Saquib Sarfraz, Rainer Stiefelhagen
In this paper, we address video face clustering using unsupervised methods.
1 code implementation • 28 Feb 2019 • M. Saquib Sarfraz, Vivek Sharma, Rainer Stiefelhagen
We present a new clustering method in the form of a single clustering equation that is able to directly discover groupings in the data.
no code implementations • 27 Dec 2018 • Congcong Wang, Vivek Sharma, Yu Fan, Faouzi Alaya Cheikh, Azeddine Beghdadi, Ole Jacob Elle, Rainer Stiefelhagen
For feature extraction, we use statistical features based on bivariate histogram distribution of gradient magnitude (GM) and Laplacian of Gaussian (LoG).
no code implementations • 30 Oct 2018 • Alina Roitberg, Ziad Al-Halah, Rainer Stiefelhagen
While it is common in activity recognition to assume a closed-set setting, i.e., test samples are always of training categories, this assumption is impractical in a real-world scenario.
no code implementations • 11 Oct 2018 • Manuel Martinez, Rainer Stiefelhagen
We present the Tamed Cross Entropy (TCE) loss function, a robust derivative of the standard Cross Entropy (CE) loss used in deep learning for classification tasks.
no code implementations • ECCV 2018 • Sebastian Bullinger, Christoph Bodensteiner, Michael Arens, Rainer Stiefelhagen
We apply Structure from Motion techniques to vehicle and background images to determine for each frame camera poses relative to vehicle instances and background structures.
no code implementations • 27 Aug 2018 • Sebastian Bullinger, Christoph Bodensteiner, Michael Arens, Rainer Stiefelhagen
We compute the object trajectory by combining object and background camera pose information.
no code implementations • CVPR 2018 • Vivek Sharma, Ali Diba, Davy Neven, Michael S. Brown, Luc van Gool, Rainer Stiefelhagen
In this paper, we are interested in learning CNNs that can emulate image enhancement and restoration, but with the overall goal to improve image classification and not necessarily human perception.
2 code implementations • CVPR 2018 • M. Saquib Sarfraz, Arne Schumann, Andreas Eberle, Rainer Stiefelhagen
In contrast to the recent direction of explicitly modeling body parts or correcting for misalignment based on these, we show that a rather straightforward inclusion of acquired camera view and/or the detected joint locations into a convolutional neural network helps to learn a very effective representation.
Ranked #17 on Person Re-Identification on MARS
no code implementations • 22 Nov 2017 • Ali Diba, Vivek Sharma, Rainer Stiefelhagen, Luc van Gool
We approach GANs with a novel training method and learning objective, to discover multiple object instances for three cases: 1) synthesizing a picture of a specific object within a cluttered scene; 2) localizing different categories in images for weakly supervised object detection; and 3) improving object discovery in object detection pipelines.
Ranked #2 on Weakly Supervised Object Detection on COCO test-dev
no code implementations • 16 Nov 2017 • Sebastian Bullinger, Christoph Bodensteiner, Michael Arens, Rainer Stiefelhagen
We apply Structure from Motion techniques to object and background images to determine for each frame camera poses relative to object instances and background structures.