1 code implementation • 12 Oct 2022 • Moritz Einfalt, Katja Ludwig, Rainer Lienhart
The state of the art in monocular 3D human pose estimation in videos is dominated by the paradigm of 2D-to-3D pose uplifting.
no code implementations • 28 Dec 2021 • Philipp Harzig, Moritz Einfalt, Rainer Lienhart
Video-to-Text (VTT) is the task of automatically generating descriptions for short audio-visual video clips, which can, for instance, help visually impaired people understand the scenes of a YouTube video.
no code implementations • 28 Dec 2021 • Philipp Harzig, Moritz Einfalt, Katja Ludwig, Rainer Lienhart
We pretrain both models on the complete VATEX dataset and 90% of the TRECVID-VTT dataset, using the remaining 10% of TRECVID-VTT for validation.
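A minimal sketch of such a 90/10 split, for illustration only. The abstract does not say how the validation portion was selected, so the seeded random shuffle, the function name `split_train_val`, and the placeholder clip identifiers are all assumptions rather than the authors' procedure.

```python
import random


def split_train_val(items, val_fraction=0.1, seed=0):
    """Shuffle a copy of `items` and hold out `val_fraction` for validation.

    Assumption: a seeded random split. The source does not state how the
    10% validation portion was chosen.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical clip identifiers standing in for the two datasets.
vatex_clips = [f"vatex_{i}" for i in range(50)]
trecvid_clips = [f"trecvid_{i}" for i in range(20)]

# All of VATEX plus 90% of TRECVID-VTT go into pretraining;
# the remaining 10% of TRECVID-VTT is held out for validation.
trecvid_train, trecvid_val = split_train_val(trecvid_clips)
pretrain_set = vatex_clips + trecvid_train
```

Fixing the seed keeps the held-out clips identical across runs, so validation scores of the two models stay comparable.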
no code implementations • 23 Oct 2020 • Nikolas Klug, Moritz Einfalt, Stephan Brehm, Rainer Lienhart
Our paper thus establishes a theoretical baseline that shows the importance of suitable projection models in weakly supervised 3D human pose estimation.
no code implementations • 21 Apr 2020 • Moritz Einfalt, Rainer Lienhart
In this paper, we address the problem of motion event detection in athlete recordings from individual sports.
no code implementations • 24 Apr 2018 • Rainer Lienhart, Moritz Einfalt, Dan Zecha
Human pose detection systems based on state-of-the-art DNNs are in the process of being extended, adapted, and re-trained to fit the application domain of specific sports.
no code implementations • 2 Feb 2018 • Moritz Einfalt, Dan Zecha, Rainer Lienhart
Our main contributions are threefold: (a) we apply and evaluate a fine-tuned Convolutional Pose Machine architecture as a baseline in our very challenging aquatic environment and discuss its error modes, (b) we propose an extension that feeds swimming style information into the fully convolutional architecture, and (c) we modify the architecture for continuous pose estimation in videos.