no code implementations • 4 Jan 2025 • Yahya Sowti Khiabani, Farris Atif, Chieh Hsu, Sven Stahlmann, Tobias Michels, Sebastian Kramer, Benedikt Heidrich, M. Saquib Sarfraz, Julian Merten, Faezeh Tafazzoli
Our results demonstrate the potential of SLMs to transform vehicle control systems, enabling more intuitive interactions between users and their vehicles for an enhanced driving experience.
1 code implementation • 31 Oct 2024 • David Schneider, Simon Reiß, Marco Kugler, Alexander Jaus, Kunyu Peng, Susanne Sutschet, M. Saquib Sarfraz, Sven Matthiesen, Rainer Stiefelhagen
In this work, we address this issue by establishing Muscles in Time (MinT), a large-scale synthetic muscle activation dataset.
1 code implementation • 26 Sep 2024 • Kunyu Peng, Di Wen, Kailun Yang, Ao Luo, Yufan Chen, Jia Fu, M. Saquib Sarfraz, Alina Roitberg, Rainer Stiefelhagen
In this paper, we observe that an adaptive domain scheduler benefits more in OSDG compared with prefixed sequential and random domain schedulers.
1 code implementation • 25 Sep 2024 • Lukas Heine, Fabian Hörst, Jana Fragemann, Gijs Luijten, Jan Egger, Fin Bahnsen, M. Saquib Sarfraz, Jens Kleesiek, Constantin Seibold
In industries such as healthcare, finance, and manufacturing, the analysis of unstructured textual data presents significant challenges for decision making.
1 code implementation • 2 Jul 2024 • Kunyu Peng, Jia Fu, Kailun Yang, Di Wen, Yufan Chen, Ruiping Liu, Junwei Zheng, Jiaming Zhang, M. Saquib Sarfraz, Rainer Stiefelhagen, Alina Roitberg
Since these existing methods underperform on RAVAR, we introduce RefAtomNet -- a novel cross-stream attention-driven method specialized for the unique challenges of RAVAR: the need to interpret a textual referring expression for the targeted individual, utilize this reference to guide the spatial localization and harvest the prediction of the atomic actions for the referring person.
1 code implementation • CVPR 2024 • Muhammad Sohail Danish, Muhammad Haris Khan, Muhammad Akhtar Munir, M. Saquib Sarfraz, Mohsen Ali
The results consistently demonstrate the superiority of our approach compared to existing methods.
no code implementations • 22 May 2024 • Omar Moured, Jiaming Zhang, M. Saquib Sarfraz, Rainer Stiefelhagen
Chart summarization is a crucial task for blind and visually impaired individuals as it is their primary means of accessing and interpreting graphical data.
1 code implementation • 4 May 2024 • M. Saquib Sarfraz, Mei-Yen Chen, Lukas Layer, Kunyu Peng, Marios Koulakis
The current state of machine learning scholarship in Timeseries Anomaly Detection (TAD) is plagued by the persistent use of flawed evaluation metrics, inconsistent benchmarking practices, and a lack of proper justification for the choices made in novel deep learning-based model designs.
1 code implementation • 30 Jan 2024 • Ruiping Liu, Jiaming Zhang, Kunyu Peng, Yufan Chen, Ke Cao, Junwei Zheng, M. Saquib Sarfraz, Kailun Yang, Rainer Stiefelhagen
Integrating information from multiple modalities enhances the robustness of scene perception systems in autonomous vehicles, providing a more comprehensive and reliable sensory framework.
1 code implementation • 11 Dec 2023 • Kunyu Peng, Cheng Yin, Junwei Zheng, Ruiping Liu, David Schneider, Jiaming Zhang, Kailun Yang, M. Saquib Sarfraz, Rainer Stiefelhagen, Alina Roitberg
In real-world scenarios, human actions often fall outside the distribution of training data, making it crucial for models to recognize known actions and reject unknown ones.
no code implementations • 8 Nov 2023 • Muhammad Akhtar Munir, Muhammad Haris Khan, M. Saquib Sarfraz, Mohsen Ali
Deep learning based object detectors struggle to generalize to a new target domain bearing significant variations in object and background.
2 code implementations • 15 May 2023 • Kunyu Peng, Di Wen, David Schneider, Jiaming Zhang, Kailun Yang, M. Saquib Sarfraz, Rainer Stiefelhagen, Alina Roitberg
In this work, we focus on Few-Shot Domain Adaptation for Activity Recognition (FSDA-AR), which leverages a very small amount of labeled target videos to achieve effective adaptation.
1 code implementation • 2 Mar 2023 • Kunyu Peng, David Schneider, Alina Roitberg, Kailun Yang, Jiaming Zhang, Chen Deng, Kaiyu Zhang, M. Saquib Sarfraz, Rainer Stiefelhagen
To this intent, we provide the MuscleMap dataset featuring >15K video clips with 135 different activities and 20 labeled muscle groups.
no code implementations • 15 Sep 2022 • Muhammad Akhtar Munir, Muhammad Haris Khan, M. Saquib Sarfraz, Mohsen Ali
To this end, we first propose a new, plug-and-play, train-time calibration loss for object detection (coined as TCD).
no code implementations • 14 May 2022 • Constantin Seibold, Simon Reiß, M. Saquib Sarfraz, Rainer Stiefelhagen, Jens Kleesiek
We show that despite using unstructured medical report supervision, we perform on par with direct label supervision through a sophisticated inference setting.
Ranked #2 on Thoracic Disease Classification on ChestX-ray14
1 code implementation • CVPR 2022 • M. Saquib Sarfraz, Marios Koulakis, Constantin Seibold, Rainer Stiefelhagen
Dimensionality reduction is crucial both for visualization and preprocessing high dimensional data for machine learning.
Ranked #3 on Data Augmentation on GA1457
no code implementations • 1 Oct 2021 • Muhammad Akhtar Munir, Muhammad Haris Khan, M. Saquib Sarfraz, Mohsen Ali
In this paper, we propose to leverage model predictive uncertainty to strike the right balance between adversarial feature alignment and class-level alignment.
1 code implementation • CVPR 2021 • M. Saquib Sarfraz, Naila Murray, Vivek Sharma, Ali Diba, Luc van Gool, Rainer Stiefelhagen
Action segmentation refers to inferring boundaries of semantically consistent visual concepts in videos and is an important requirement for many video understanding tasks.
Ranked #1 on Action Segmentation on MPII Cooking 2 Dataset
1 code implementation • 19 Aug 2020 • Alexander Wolpert, Michael Teutsch, M. Saquib Sarfraz, Rainer Stiefelhagen
In this way, we can both simplify the network architecture and achieve higher detection performance, especially for pedestrians under occlusion or at low object resolution.
no code implementations • 5 Apr 2020 • Vivek Sharma, Makarand Tapaswi, M. Saquib Sarfraz, Rainer Stiefelhagen
We demonstrate our method on the challenging task of learning representations for video face clustering.
1 code implementation • 1 Aug 2019 • M. Saquib Sarfraz, Constantin Seibold, Haroon Khalid, Rainer Stiefelhagen
In this paper, we propose a novel method of computing the loss directly between the source and target images that enables proper distillation of shape/content and colour/style.
1 code implementation • 3 Mar 2019 • Vivek Sharma, Makarand Tapaswi, M. Saquib Sarfraz, Rainer Stiefelhagen
In this paper, we address video face clustering using unsupervised methods.
1 code implementation • 28 Feb 2019 • M. Saquib Sarfraz, Vivek Sharma, Rainer Stiefelhagen
We present a new clustering method in the form of a single clustering equation that is able to directly discover groupings in the data.
2 code implementations • CVPR 2018 • M. Saquib Sarfraz, Arne Schumann, Andreas Eberle, Rainer Stiefelhagen
In contrast to the recent direction of explicitly modeling body parts or correcting for misalignment based on these, we show that a rather straightforward inclusion of acquired camera view and/or the detected joint locations into a convolutional neural network helps to learn a very effective representation.
Ranked #17 on Person Re-Identification on MARS
no code implementations • 19 Jul 2017 • M. Saquib Sarfraz, Arne Schumann, Yan Wang, Rainer Stiefelhagen
The visual cues hinting at attributes can be strongly localized, and inference of person attributes such as hair, backpack, shorts, etc., is highly dependent on the acquired view of the pedestrian.
no code implementations • 20 Jan 2016 • M. Saquib Sarfraz, Rainer Stiefelhagen
Our method bridges the drop in performance due to the modality gap by more than 40%.
Ranked #2 on Face Recognition on UND-X1
no code implementations • 10 Jul 2015 • M. Saquib Sarfraz, Rainer Stiefelhagen
Cross modal face matching between the thermal and visible spectrum is a much desired capability for night-time surveillance and security applications.