no code implementations • 23 Jan 2025 • Huilin Yin, Yangwenhui Xu, Jiaxiang Li, Hao Zhang, Gerhard Rigoll
To the best of our knowledge, I2XTraj represents the first multi-agent trajectory prediction framework explicitly designed for infrastructure deployment, supplying subscribable prediction services to all vehicles at intersections.
no code implementations • 29 Nov 2024 • Philipp Wolters, Johannes Gilg, Torben Teepe, Fabian Herzog, Felix Fent, Gerhard Rigoll
In this work, we present SpaRC, a novel Sparse fusion transformer for 3D perception that integrates multi-view image semantics with Radar and Camera point features.
Ranked #1 on 3D Object Detection on TruckScenes
1 code implementation • 3 Oct 2024 • Fabian Herzog, Johannes Gilg, Philipp Wolters, Torben Teepe, Gerhard Rigoll
By keeping sparse appearance and positional cues of all detections in a cluster, our method can compare clusters based on the strongest available evidence.
1 code implementation • 19 Mar 2024 • Torben Teepe, Philipp Wolters, Johannes Gilg, Fabian Herzog, Gerhard Rigoll
Taking advantage of multi-view aggregation presents a promising solution to tackle challenges such as occlusion and missed detection in multi-object tracking and detection.
Ranked #1 on Multi-Object Tracking on MultiviewX
1 code implementation • 12 Mar 2024 • Philipp Wolters, Johannes Gilg, Torben Teepe, Fabian Herzog, Anouar Laouichi, Martin Hofmann, Gerhard Rigoll
HyDRa achieves a new state-of-the-art for camera-radar fusion of 64.2 NDS (+1.8) and 58.4 AMOTA (+1.5) on the public nuScenes dataset.
Ranked #1 on 3D Object Detection on View-of-Delft (val)
1 code implementation • 20 Oct 2023 • Torben Teepe, Philipp Wolters, Johannes Gilg, Fabian Herzog, Gerhard Rigoll
Most current approaches in multi-view tracking perform the detection and tracking task in each view and use graph-based approaches to perform the association of the pedestrian across each view.
Ranked #2 on Multi-Object Tracking on MultiviewX
1 code implementation • 6 Sep 2023 • Johannes Gilg, Torben Teepe, Fabian Herzog, Philipp Wolters, Gerhard Rigoll
Object detectors are at the heart of many semi- and fully autonomous decision systems and are poised to become even more indispensable.
1 code implementation • 17 Apr 2023 • Martin Knoche, Gerhard Rigoll
Finally, we demonstrate that combining machine and human decisions can further improve the performance of state-of-the-art face verification systems on various benchmark datasets.
1 code implementation • 24 Nov 2022 • Martin Knoche, Torben Teepe, Stefan Hörmann, Gerhard Rigoll
This work focuses on explanations for face recognition systems, vital for developers and operators.
1 code implementation • 30 Aug 2022 • Fabian Herzog, Junpeng Chen, Torben Teepe, Johannes Gilg, Stefan Hörmann, Gerhard Rigoll
Smart City applications such as intelligent traffic routing or accident prevention rely on computer vision methods for exact vehicle localization and tracking.
Ranked #1 on Multi-Object Tracking on Synthehicle
2 code implementations • 14 Jul 2022 • Martin Knoche, Mohamed Elkadeem, Stefan Hörmann, Gerhard Rigoll
To address this problem, we propose a novel combination of the popular triplet loss to improve robustness against image resolution via fine-tuning of existing face recognition models.
Ranked #1 on Face Recognition on XQLFW
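For readers unfamiliar with the triplet loss mentioned in the entry above, the following is a minimal sketch of the standard formulation only, not the combination proposed in the paper. The function names, the use of Euclidean distance, and the margin value of 0.2 are illustrative assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss (hypothetical helper, not the paper's variant).

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is; otherwise it grows linearly.
    """
    d_pos = euclidean(anchor, positive)  # same identity as the anchor
    d_neg = euclidean(anchor, negative)  # different identity
    return max(0.0, d_pos - d_neg + margin)


if __name__ == "__main__":
    # Toy 2-D "embeddings"; real face features have hundreds of dimensions.
    print(triplet_loss([0.0, 0.0], [0.1, 0.0], [3.0, 4.0]))
```

In a resolution-robustness setting, the anchor and positive could for example be embeddings of the same face at different resolutions, but that pairing is an assumption here and is not taken from the entry.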
1 code implementation • 8 Jun 2022 • Jun Yan, Huilin Yin, Xiaoyang Deng, Ziming Zhao, Wancheng Ge, Hao Zhang, Gerhard Rigoll
Since adversarial vulnerability can be regarded as a high-frequency phenomenon, it is essential to regulate the adversarially-trained neural network models in the frequency domain.
no code implementations • 27 May 2022 • Stefan Hörmann, Tianlin Kong, Torben Teepe, Fabian Herzog, Martin Knoche, Gerhard Rigoll
State-of-the-art face recognition (FR) approaches have shown remarkable results in predicting whether two faces belong to the same identity, yielding accuracies between 92% and 100% depending on the difficulty of the protocol.
2 code implementations • 16 Apr 2022 • Torben Teepe, Johannes Gilg, Fabian Herzog, Stefan Hörmann, Gerhard Rigoll
Gait recognition is a promising biometric with unique properties for identifying individuals from a long distance by their walking patterns.
1 code implementation • 3 Dec 2021 • Johannes Gilg, Torben Teepe, Fabian Herzog, Gerhard Rigoll
Recent work even suggests that detectors' confidence predictions are biased with respect to object size and position, but it is still unclear how this bias relates to the performance of the affected object detectors.
1 code implementation • 23 Aug 2021 • Martin Knoche, Stefan Hörmann, Gerhard Rigoll
Real-world face recognition applications often deal with suboptimal image quality or resolution due to different capturing conditions such as various subject-to-camera distances, poor camera settings, or motion blur.
1 code implementation • 8 Jul 2021 • Martin Knoche, Stefan Hörmann, Gerhard Rigoll
In this work, we first analyze the impact of image resolutions on face verification performance with a state-of-the-art face recognition model.
1 code implementation • 11 Jun 2021 • Stefan Hörmann, Zeyuan Zhang, Martin Knoche, Torben Teepe, Gerhard Rigoll
In this paper, we propose a novel approach to partial face recognition capable of recognizing faces with different occluded areas.
1 code implementation • ICCV 2021 • Okan Köpüklü, Maja Taseska, Gerhard Rigoll
Successful active speaker detection requires a three-stage pipeline: (i) audio-visual encoding for all speakers in the clip, (ii) inter-speaker relation modeling between a reference speaker and the background speakers within each frame, and (iii) temporal modeling for the reference speaker.
Active Speaker Detection
Audio-Visual Active Speaker Detection
no code implementations • 3 Apr 2021 • Lujun Li, Yikai Kang, Yuchen Shi, Ludwig Kürzinger, Tobias Watzel, Gerhard Rigoll
Inspired by the extensive applications of the generative adversarial networks (GANs) in speech enhancement and ASR tasks, we propose an adversarial joint training framework with the self-attention mechanism to boost the noise robustness of the ASR system.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
2 code implementations • 27 Jan 2021 • Torben Teepe, Ali Khan, Johannes Gilg, Fabian Herzog, Stefan Hörmann, Gerhard Rigoll
However, silhouette images can lose fine-grained spatial information, and most papers do not regard how to obtain these silhouettes in complex scenes.
Ranked #12 on Multiview Gait Recognition on CASIA-B
2 code implementations • 26 Jan 2021 • Fabian Herzog, Xunbo Ji, Torben Teepe, Stefan Hörmann, Johannes Gilg, Gerhard Rigoll
Person Re-Identification aims to retrieve person identities from images captured by multiple cameras or the same cameras in different time instances and locations.
Ranked #1 on Person Re-Identification on Market-1501 (mINP metric)
no code implementations • 15 Oct 2020 • Ludwig Kürzinger, Nicolas Lindae, Palle Klewitz, Gerhard Rigoll
For this, we propose Lightweight Sinc-Convolutions (LSC) that integrate Sinc-convolutions with depthwise convolutions as a low-parameter machine-learnable feature extraction for end-to-end ASR systems.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
1 code implementation • 30 Sep 2020 • Okan Köpüklü, Stefan Hörmann, Fabian Herzog, Hakan Cevikalp, Gerhard Rigoll
Convolutional Neural Networks with 3D kernels (3D-CNNs) currently achieve state-of-the-art results in video recognition tasks due to their supremacy in extracting spatiotemporal features within video frames.
1 code implementation • 30 Sep 2020 • Okan Köpüklü, Jiapeng Zheng, Hang Xu, Gerhard Rigoll
For this task, we introduce a new video-based benchmark, the Driver Anomaly Detection (DAD) dataset, which contains normal driving videos together with a set of anomalous actions in its training set.
1 code implementation • 25 Jul 2020 • Iustina Andronic, Ludwig Kürzinger, Edgar Ricardo Chavez Rosas, Gerhard Rigoll, Bernhard U. Seeber
The present work proposes MP3 compression as a means to decrease the impact of Adversarial Noise (AN) in audio samples transcribed by ASR systems.
Audio and Speech Processing Cryptography and Security Sound
12 code implementations • 17 Jul 2020 • Ludwig Kürzinger, Dominik Winkelbauer, Lujun Li, Tobias Watzel, Gerhard Rigoll
In this work, we combine freely available corpora for German speech recognition, including yet unlabeled speech data, to a big dataset of over 1700 h of speech data.
Ranked #5 on Speech Recognition on TUDA (using extra training data)
Speech Recognition
Audio and Speech Processing
no code implementations • 15 Jun 2020 • Tobias Watzel, Ludwig Kürzinger, Lujun Li, Gerhard Rigoll
Nowadays, attention models are one of the popular candidates for speech recognition.
no code implementations • 2 Jun 2020 • Stefan Hörmann, Martin Knoche, Gerhard Rigoll
Approaches for kinship verification often rely on cosine distances between face identification features.
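The cosine-distance baseline that the entry above refers to can be sketched as follows. This is only an illustration of the general idea: the function names, the toy feature vectors, and the decision threshold of 0.5 are assumptions, and the feature extractor itself is out of scope.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


def is_kin(features_a, features_b, threshold=0.5):
    """Hypothetical decision rule: declare kinship if the distance is small."""
    return cosine_distance(features_a, features_b) < threshold


if __name__ == "__main__":
    # Toy vectors standing in for face identification features.
    print(is_kin([1.0, 0.0], [1.0, 0.1]))
    print(is_kin([1.0, 0.0], [0.0, 1.0]))
```

In practice the threshold would be tuned on a validation set rather than fixed in advance.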
no code implementations • 2 Mar 2020 • Okan Köpüklü, Thomas Ledwon, Yao Rong, Neslihan Kose, Gerhard Rigoll
In this work, we propose an HCI system for dynamic recognition of driver micro hand gestures, which can have a crucial impact in the automotive sector, especially for safety-related issues.
1 code implementation • 10 Dec 2019 • Mert Kayhan, Okan Köpüklü, Mhd Hasan Sarhan, Mehmet Yigitsoy, Abouzar Eslami, Gerhard Rigoll
To this end, a lightweight network architecture is introduced and mean teacher, virtual adversarial training and pseudo-labeling algorithms are evaluated on 2D-pose estimation for surgical instruments.
5 code implementations • 15 Nov 2019 • Okan Köpüklü, Xiangyu Wei, Gerhard Rigoll
YOWO is a single-stage architecture with two branches to extract temporal and spatial information concurrently and predict bounding boxes and action probabilities directly from video clips in one evaluation.
Ranked #1 on Action Recognition In Videos on AVA v2.2
1 code implementation • 5 Nov 2019 • Simon Mittermaier, Ludwig Kürzinger, Bernd Waschneck, Gerhard Rigoll
Keyword Spotting (KWS) enables speech-based user interaction on smart devices.
1 code implementation • arXiv preprint 2019 • Okan Köpüklü, Fabian Herzog, Gerhard Rigoll
Understanding actions and gestures in video streams requires temporal reasoning of the spatial content from different time instants, i.e., spatiotemporal (ST) modeling.
Ranked #117 on Action Recognition on Something-Something V2
no code implementations • 18 Jul 2019 • Neslihan Kose, Okan Kopuklu, Alexander Unnervik, Gerhard Rigoll
Experiments show that our approach outperforms the state-of-the-art results on the Distracted Driver Dataset (96.31%), with an accuracy of 99.10% for 10-class classification while providing real-time performance.
no code implementations • 12 May 2019 • Mohammadreza Babaee, David Full, Gerhard Rigoll
Video representation is a key challenge in many computer vision applications such as video classification, video captioning, and video surveillance.
no code implementations • 10 May 2019 • Okan Köpüklü, Yao Rong, Gerhard Rigoll
The use of hand gestures provides a natural alternative to cumbersome interface devices for Human-Computer Interaction (HCI) systems.
2 code implementations • 4 Apr 2019 • Okan Köpüklü, Neslihan Kose, Ahmet Gunduz, Gerhard Rigoll
Recently, convolutional neural networks with 3D kernels (3D CNNs) have been very popular in the computer vision community as a result of their superior ability to extract spatio-temporal features within video frames compared to 2D CNNs.
Ranked #2 on Action Recognition In Videos on UCF101
5 code implementations • 29 Jan 2019 • Okan Köpüklü, Ahmet Gunduz, Neslihan Kose, Gerhard Rigoll
We evaluate our architecture on two publicly available datasets - EgoGesture and NVIDIA Dynamic Hand Gesture Datasets - which require temporal detection and classification of the performed hand gestures.
Ranked #1 on Hand Gesture Recognition on EgoGesture
1 code implementation • 28 Jan 2019 • Okan Köpüklü, Maryam Babaee, Stefan Hörmann, Gerhard Rigoll
In this paper, we propose a CNN architecture, Layer Reuse Network (LruNet), where the convolutional layers are used repeatedly without the need of introducing new layers to get a better performance.
no code implementations • 9 Nov 2018 • Maryam Babaee, Ali Athar, Gerhard Rigoll
To this end, tracklet re-identification is performed by utilizing a novel multi-stage deep network that can jointly reason about the visual appearance and spatio-temporal properties of a pair of tracklets, thereby providing a robust measure of affinity.
no code implementations • CVPR 2018 • Daniel Merget, Matthias Rock, Gerhard Rigoll
While fully-convolutional neural networks are very strong at modeling local features, they fail to aggregate global context due to their constrained receptive field.
no code implementations • 23 Apr 2018 • Maryam Babaee, Linwei Li, Gerhard Rigoll
In gait recognition, a gait feature such as the Gait Energy Image (GEI) is normally extracted from one full gait cycle.
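As background for the entry above, a Gait Energy Image is commonly computed as the pixel-wise average of the aligned binary silhouettes in one gait cycle. The sketch below assumes the silhouettes are already segmented, size-normalized, and aligned; the function name and the toy 2x2 frames are illustrative, and this is not the method proposed in the paper.

```python
def gait_energy_image(silhouettes):
    """Average a sequence of aligned binary silhouettes into a GEI.

    `silhouettes` is a list of frames; each frame is a list of rows with
    pixel values 0 (background) or 1 (person). All frames must share the
    same height and width.
    """
    if not silhouettes:
        raise ValueError("at least one silhouette frame is required")
    n_frames = len(silhouettes)
    height = len(silhouettes[0])
    width = len(silhouettes[0][0])
    return [
        [sum(frame[r][c] for frame in silhouettes) / n_frames for c in range(width)]
        for r in range(height)
    ]


if __name__ == "__main__":
    # Two toy 2x2 frames standing in for one (very short) gait cycle.
    cycle = [
        [[1, 0], [0, 1]],
        [[1, 1], [0, 0]],
    ]
    print(gait_energy_image(cycle))
```

Pixels that are foreground in every frame end up at 1.0, while pixels that move in and out of the silhouette take intermediate values, which is how the GEI encodes motion in a single image.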
1 code implementation • 19 Apr 2018 • Okan Köpüklü, Neslihan Köse, Gerhard Rigoll
Acquiring spatio-temporal states of an action is the most crucial step for action classification.
Ranked #1 on Hand Gesture Recognition on ChaLearn test
no code implementations • 6 Feb 2017 • Mohammadreza Babaee, Duc Tung Dinh, Gerhard Rigoll
In this work, we present a novel background subtraction system that uses a deep Convolutional Neural Network (CNN) to perform the segmentation.
no code implementations • 15 Dec 2014 • Felix Weninger, Björn Schuller, Florian Eyben, Martin Wöllmer, Gerhard Rigoll
Transcription of broadcast news is an interesting and challenging application for large-vocabulary continuous speech recognition (LVCSR).
no code implementations • 11 Jun 2014 • Jürgen T. Geiger, Maximilian Kneißl, Björn Schuller, Gerhard Rigoll
The goal of the system is to analyse sounds emitted by walking persons (mostly the step sounds) and identify those persons.
no code implementations • CVPR 2013 • Martin Hofmann, Daniel Wolf, Gerhard Rigoll
We generalize the network flow formulation for multiobject tracking to multi-camera setups.