no code implementations • 15 Apr 2024 • Julian Lorenz, Robin Schön, Katja Ludwig, Rainer Lienhart
Scene graph generation has emerged as a prominent research field in computer vision, witnessing significant advancements in recent years.
no code implementations • 12 Apr 2024 • Robin Schön, Julian Lorenz, Katja Ludwig, Rainer Lienhart
The interactive segmentation task consists in the creation of object segmentation masks based on user interactions.
1 code implementation • 26 Oct 2023 • Daniel Kienzle, Julian Lorenz, Katja Ludwig, Rainer Lienhart
We present a novel method for precise 3D object localization in single images from a single calibrated camera using only 2D labels.
no code implementations • 5 Sep 2023 • Julian Lorenz, Florian Barthel, Daniel Kienzle, Rainer Lienhart
We construct a new panoptic scene graph dataset and a set of metrics designed as a benchmark for predictive performance, especially on rare predicate classes.
1 code implementation • 18 Jun 2023 • Luuk H. Boulogne, Julian Lorenz, Daniel Kienzle, Robin Schön, Katja Ludwig, Rainer Lienhart, Simon Jegou, Guang Li, Cong Chen, Qi Wang, Derik Shi, Mayug Maniparambil, Dominik Muller, Silvan Mertes, Niklas Schroter, Fabio Hellmann, Miriam Elia, Ine Dirks, Matias Nicolas Bossa, Abel Diaz Berenguer, Tanmoy Mukherjee, Jef Vandemeulebroucke, Hichem Sahli, Nikos Deligiannis, Panagiotis Gonidakis, Ngoc Dung Huynh, Imran Razzak, Reda Bouadjenek, Mario Verdicchio, Pasquale Borrelli, Marco Aiello, James A. Meakin, Alexander Lemm, Christoph Russ, Razvan Ionasec, Nikos Paragios, Bram van Ginneken, Marie-Pierre Revel Dubois
STOIC2021 consisted of a Qualification phase, where participants developed challenge solutions using 2000 publicly available CT scans, and a Final phase, where participants submitted their training methodologies with which solutions were trained on CT scans of 9724 subjects.
no code implementations • 12 Apr 2023 • Robin Schön, Katja Ludwig, Rainer Lienhart
In order to tell our network which object to segment, we provide the network with a single click on the object's surface on the pseudo depth map of the image as input.
1 code implementation • 6 Apr 2023 • Katja Ludwig, Julian Lorenz, Robin Schön, Rainer Lienhart
Performance analyses based on videos are commonly used by coaches of athletes in various sports disciplines.
1 code implementation • 17 Nov 2022 • Katja Ludwig, Daniel Kienzle, Julian Lorenz, Rainer Lienhart
We analyze different training techniques for freely selected and standard keypoints, including pseudo labels, and show in our experiments that only a few partly correct segmentation masks are sufficient for learning to detect arbitrary keypoints on limbs and skis.
1 code implementation • 19 Oct 2022 • Sebastian Scherer, Robin Schön, Rainer Lienhart
Current SSL approaches use an initially supervised trained model to generate predictions for unlabelled images, called pseudo-labels, which are subsequently used for training a new model from scratch.
1 code implementation • 12 Oct 2022 • Moritz Einfalt, Katja Ludwig, Rainer Lienhart
The state-of-the-art for monocular 3D human pose estimation in videos is dominated by the paradigm of 2D-to-3D pose uplifting.
1 code implementation • 30 Jun 2022 • Daniel Kienzle, Julian Lorenz, Robin Schön, Katja Ludwig, Rainer Lienhart
We introduce a neural network for predicting the severity of lung damage and detecting a COVID infection using three-dimensional CT data.
no code implementations • 13 Apr 2022 • Katja Ludwig, Daniel Kienzle, Rainer Lienhart
Nearly all Human Pose Estimation (HPE) datasets consist of a fixed set of keypoints.
no code implementations • 28 Dec 2021 • Philipp Harzig, Moritz Einfalt, Katja Ludwig, Rainer Lienhart
For both models, we train on the complete VATEX dataset and 90% of the TRECVID-VTT dataset for pretraining while using the remaining 10% for validation.
no code implementations • 28 Dec 2021 • Philipp Harzig, Moritz Einfalt, Rainer Lienhart
Video-to-Text (VTT) is the task of automatically generating descriptions for short audio-visual video clips, which can help visually impaired people understand the scenes of, for instance, a YouTube video.
no code implementations • 5 Nov 2021 • Stephan Brehm, Sebastian Scherer, Rainer Lienhart
Unsupervised Domain Adaptation (UDA) aims to adapt models trained on a source domain to a new target domain where no labelled data is available.
no code implementations • 23 Oct 2020 • Nikolas Klug, Moritz Einfalt, Stephan Brehm, Rainer Lienhart
Our paper thus establishes a theoretical baseline that shows the importance of suitable projection models in weakly supervised 3D human pose estimation.
no code implementations • 21 Apr 2020 • Moritz Einfalt, Rainer Lienhart
In this paper we address the problem of motion event detection in athlete recordings from individual sports.
no code implementations • 6 Aug 2019 • Philipp Harzig, Yan-Ying Chen, Francine Chen, Rainer Lienhart
Automatic medical report generation from chest X-ray images is one possibility for assisting doctors to reduce their workload.
1 code implementation • 6 May 2019 • Philipp Harzig, Dan Zecha, Rainer Lienhart, Carolin Kaiser, René Schallner
Furthermore, we introduce a novel metric that allows us to assess whether the generated captions meet our requirements (i.e., subject, predicate, object, and product name) and describe a series of experiments on caption quality and how to address annotator disagreements for the image ratings with an approach called soft targets.
no code implementations • 24 Apr 2018 • Rainer Lienhart, Moritz Einfalt, Dan Zecha
Human pose detection systems based on state-of-the-art DNNs are about to be extended, adapted and re-trained to fit the application domain of specific sports.
no code implementations • 6 Feb 2018 • Philipp Harzig, Stephan Brehm, Rainer Lienhart, Carolin Kaiser, René Schallner
Thanks to adding the third output modality, it also considerably improves the quality of generated captions for images depicting branded products.
no code implementations • 2 Feb 2018 • Moritz Einfalt, Dan Zecha, Rainer Lienhart
Our main contributions are threefold: (a) we apply and evaluate a fine-tuned Convolutional Pose Machine architecture as a baseline in our very challenging aquatic environment and discuss its error modes, (b) we propose an extension that feeds swimming style information into the fully convolutional architecture, and (c) we modify the architecture for continuous pose estimation in videos.
no code implementations • 28 Apr 2017 • Christian Eggert, Dan Zecha, Stephan Brehm, Rainer Lienhart
Many modern approaches for object detection are two-staged pipelines.
no code implementations • 14 Mar 2016 • Anton Winschel, Rainer Lienhart, Christian Eggert
Current top performing object recognition systems build on object proposals as a preprocessing step.
no code implementations • 21 Apr 2015 • Dan Zecha, Rainer Lienhart
In this paper we study the problem of estimating innercyclic time intervals within repetitive motion sequences of top-class swimmers in a swimming channel.
no code implementations • ACM International Conference on Multimedia Retrieval 2011 • Stefan Romberg, Lluis Garcia Pueyo, Rainer Lienhart, Roelof van Zwol
In this paper we propose a highly effective and scalable framework for recognizing logos in images.